The Test Environment Manager (TEM) plays a pivotal role in transforming the Software Development Lifecycle (SDLC) environment. This role requires a strong engineering mindset with a focus on system reliability, automation, and performance in non-production environments. The TEM will lead the design, automation, monitoring, and continuous improvement of test environments to support development, testing, and delivery teams.
Key Responsibilities
Operational Responsibilities
Automate Environment Lifecycle:
Develop Infrastructure as Code (IaC) to provision, configure, and decommission test environments, integrating with CI/CD pipelines.
Define Service Objectives:
Define Service Level Indicators (SLIs), such as availability and provisioning time, and set and track Service Level Objectives (SLOs) against them to ensure environments meet team needs.
Monitor Health & Performance:
Implement observability practices using tools like Prometheus and Grafana to proactively identify and resolve performance bottlenecks.
Incident Management:
Lead incident response for environment-related issues, conducting blameless post-mortems and implementing sustainable solutions.
Reduce Toil:
Identify and automate repetitive manual tasks to improve efficiency and free up engineering capacity.
Strategic & Cultural Responsibilities
Drive Continuous Improvement:
Analyze metrics, incidents, and reports to identify opportunities for improvement and innovation.
Balance Reliability & Speed:
Apply error budget principles to manage trade-offs between reliability and delivery speed.
Promote Reliability Culture:
Foster a culture of shared ownership and blameless incident response across development, QA, and SRE teams.
Capacity Planning:
Forecast and plan for future infrastructure needs based on usage patterns and demand.
Advance Test Data Management:
Collaborate with Test Data Managers to ensure test data availability, compliance, consistency, and automated provisioning.
Technical Skills
Strong proficiency in observability, monitoring, and logging tools (e.g., Prometheus, Splunk, Grafana).
Expertise in CI/CD platforms (e.g., Jenkins, GitLab CI) and in configuration management and Infrastructure as Code tools (e.g., Ansible, Terraform).
In-depth knowledge of cloud platforms (e.g., AWS), containerization and orchestration technologies (e.g., Docker, Kubernetes), and serverless architectures.
Advanced scripting skills in Python, Bash, or similar languages to automate environment management tasks.
Solid foundation in Linux systems, networking concepts, and database management.
Soft Skills
Leadership & Influence:
Ability to drive adoption of SRE practices and influence stakeholders across technical and business functions.
Problem-Solving:
Strong analytical and debugging skills to resolve complex environment issues under pressure.
Communication:
Excellent verbal and written communication skills to collaborate effectively across teams.
Adaptability:
Proactive, flexible mindset to adapt to evolving technologies and development practices.
Job Type: Full-time
Pay: £70,000.00-£80,000.00 per year
Work Location: Hybrid remote in London EC1A 1AA