The Test Environment Manager (TEM) is a senior, engineering-focused role responsible for transforming and managing the entire non-production environment landscape. This role ensures that test environments are reliable, scalable, automated, observable, and aligned with modern SDLC and DevOps practices. The TEM drives technical excellence, an SRE-inspired culture, and continuous improvement across development, QA, and operations teams.
Design and implement Infrastructure as Code (IaC) to fully automate provisioning, configuration, and teardown of test environments.
Integrate environment automation seamlessly into CI/CD pipelines to enable on-demand, self-service environment delivery.
Reliability & Observability
Define and maintain Service Level Objectives (SLOs) and key Service Level Indicators (SLIs), such as environment availability, provisioning time, and stability metrics.
Monitor environment health using observability tools (Prometheus, Grafana, Splunk, etc.) and proactively identify and resolve performance issues or bottlenecks.
Incident & Problem Management
Lead incident response for environment-related issues, driving quick resolution and facilitating blameless post-mortems.
Implement permanent fixes based on root cause analysis and reduce repeat incidents.
Automation & Toil Reduction
Identify repetitive, manual environment tasks and eliminate them through automation, improving engineering efficiency and reducing operational burden.
Strategic & Cultural Responsibilities
Continuous Improvement
Analyze environment performance data, incident trends, and post-mortem outcomes to drive ongoing enhancements and innovation.
Reliability Management
Apply an "error budget" framework to balance velocity and reliability across teams.
Shift priorities between stability improvements and feature delivery based on reliability KPIs.
Culture & Collaboration
Promote a culture of shared ownership, blameless problem-solving, and strong cross-team collaboration among development, QA, DevOps, and SRE teams.
Capacity Planning & Scalability
Forecast environment capacity requirements based on usage trends, test cycles, and upcoming projects.
Ensure infrastructure elasticity and scalability to meet future demands.
Test Data Management Integration
Partner with Test Data Management teams to ensure test data is consistent, compliant, refreshed automatically, and aligned with environment provisioning needs.
Technical Skills & Experience
Monitoring & Observability:
Expertise with Prometheus, Grafana, Splunk, ELK/EFK, or similar platforms.
CI/CD & Automation Tools:
Strong experience with Jenkins, GitLab CI, GitHub Actions, and infrastructure-as-code and configuration management tools (Terraform, Ansible, etc.).
Cloud & Container Platforms:
Deep understanding of cloud infrastructure (AWS preferred), Kubernetes, Docker, and serverless technologies.
Scripting & Programming:
Proficiency in Python, Bash, or similar scripting languages for automation and environment tooling.
Systems & Networking:
Strong knowledge of Linux systems, networking concepts, DNS, load balancing, and database operations.
Soft Skills & Leadership Qualities
Leadership & Influence:
Ability to drive SRE and environment best practices across multiple teams and technical domains.
Analytical Problem-Solving:
Strong debugging, troubleshooting, and decision-making skills under time-sensitive conditions.
Communication Excellence:
Clear and effective communication with technical and non-technical stakeholders.
Adaptability & Proactiveness:
Ability to stay ahead of evolving technologies, tools, and environment architectures.
Summary
This role is ideal for a seasoned engineering leader who brings strong technical depth, an SRE mindset, automation-first thinking, and the ability to shape and modernize complex test environment landscapes.
Job Type: Fixed term contract
Contract length: 12 months
Pay: £85,000.00-£90,000.00 per year
Benefits:
Sabbatical
Sick pay
Work Location: Hybrid remote in London EC1A