Hybrid with a requirement to be in the office on average once a week
Harwell Campus, Near Didcot, Oxfordshire, UK
Job Brief
This is a critical role to support the design, implementation, and maintenance of our AWS-based infrastructure and internal IT systems. This role blends technical execution, stakeholder engagement, and operational leadership.
You'll work closely with the Head of Infrastructure and Security and will be empowered to lead initiatives, cover leadership responsibilities in their absence, and act as a trusted technical liaison with stakeholders across the business.
The role is a technical liaison between development teams and the Infrastructure and Security team, focusing on understanding technical requirements and implementing the key systems and processes that underpin the delivery of our software services.
Key Responsibilities
Operational Execution
Design, build, and maintain secure, scalable AWS infrastructure using best practices
Implement and manage Infrastructure as Code using Terraform
Manage CI/CD pipelines, deployment automation, and observability tooling
Participate in on-call rotation and incident response, leading investigations and resolutions as needed
Assist with the deployment, scaling, and management of containerised applications in Kubernetes clusters
Troubleshoot issues within Kubernetes environments, including pod failures, networking, and storage problems
Patch and maintain Kubernetes clusters
Set up monitoring and alerting using tools like Prometheus, OpenTelemetry and Grafana
Support internal IT functions including endpoint provisioning, device management (MDM), and access controls
Incident Response & Troubleshooting
Actively participate in incident response for cloud infrastructure and mobile device-related issues, ensuring timely resolution and minimising system downtime
Collaborate with senior engineers to investigate root causes of incidents and implement preventive measures
Implement and support a centralised logging solution for troubleshooting and incident resolution
Security Operations
Maintain and improve security posture across cloud and internal systems
Implement IAM policies, encryption standards, vulnerability management, and monitoring
Meet compliance requirements through documentation and operational controls
Support DevSecOps initiatives and integrate security controls into CI/CD pipelines
IT Operations & Endpoint Management
Oversee the provisioning, configuration, and lifecycle management of employee devices (laptops, mobile devices, etc.)
Implement and manage endpoint protection, MDM (Mobile Device Management), and patching strategies
Ensure secure access to corporate tools and systems, with appropriate controls in place
Manage user identity and access across systems (e.g., SSO, MFA, directory services)
Stakeholder Engagement & Technical Discovery
Act as a key point of contact for cross-functional stakeholders to gather, clarify, and translate technical infrastructure and security requirements
Plan and scope infrastructure improvements or migrations based on stakeholder feedback
Leadership
Mentor junior team members and contribute to knowledge-sharing and process documentation
Provide leadership cover for the Head of Infrastructure and Security as required, including participation in planning meetings, decision-making, and communication with leadership
Provide operational support for the Head of Infrastructure and Security when they are unavailable, ensuring continuity of engineering operations and technical support
Working (& Desirable) Technology Stack
Experience in cloud infrastructure, DevOps, or platform engineering with security responsibilities
Experience using the AWS Well-Architected Framework to implement solutions
Understanding of Prometheus, OpenSearch and Grafana for logging and alerting
Containerisation and orchestration with Docker and Kubernetes
Hands-on experience with Infrastructure as Code (Terraform)
Understanding of architectural principles of building with Cloud Native Technologies
Knowledge of security frameworks
Understanding of ITIL frameworks and incident management
The following technology experience is desired but not required
PostgreSQL
Python (particularly boto3) and Bash for automation
Personal Skills and Experience
Relevant professional-level qualification or experience
Experience as a technically involved DevOps Engineer
Analytical, organised, and effective approach to setting priorities
Positive and approachable demeanour
Proactive approach to problem solving
Experience building highly automated infrastructures
Awareness of DevOps and Agile principles
Proven ability to cascade information and coach users in Cloud Architecture
Candidates will be expected to participate in the Architecture Guild
Candidates should be able to demonstrate good levels of
Problem-solving
Understanding of how logging solutions can be used for alerting and incident resolution
Teamwork
Composure under pressure
Written communication
Verbal communication
Mentoring more junior members of staff