We are seeking a Senior DevOps Engineer with a deep understanding of AWS cloud infrastructure, Infrastructure-as-Code (IaC) tooling such as Terraform, and configuration management using Ansible. The ideal candidate is a self-starter who is passionate about Site Reliability Engineering (SRE) principles and thrives in collaborative environments.
You will play a pivotal role in automating infrastructure, improving reliability and scalability, and keeping CI/CD pipelines running smoothly across multiple environments. You'll work closely with software engineering and security teams to drive platform excellence.
Key Responsibilities
Infrastructure & Automation
Design, build, and manage scalable, secure, and resilient infrastructure on AWS using Terraform (modularized, reusable components).
Implement configuration management solutions using Ansible, including playbook development, inventory structuring, and role-based automation.
Manage secrets securely using services such as AWS Secrets Manager or HashiCorp Vault.
SRE & Reliability
Implement robust monitoring, alerting, and observability tooling (e.g., CloudWatch, Prometheus, Grafana, Datadog).
Participate in incident response, root cause analysis, and resilience improvements.
CI/CD & Platform Engineering
Maintain and evolve CI/CD pipelines using tools such as GitHub Actions, Bitbucket Pipelines, or Jenkins.
Automate deployments for container-based workloads on ECS (Fargate) or Lambda, and manage supporting infrastructure.
Collaborate with development teams to optimize build/deploy cycles and reduce lead time for changes.
Security & Compliance
Ensure security best practices are embedded into infrastructure provisioning and pipeline execution.
Support compliance and auditing by implementing guardrails and controls as code (e.g., AWS Config, SCPs, IAM policy management).
Required Skills and Experience
Technical Expertise
5+ years in DevOps, SRE, or Cloud Engineering roles.
Expertise in AWS core services: EC2, IAM, VPC, ECS/Fargate, CloudFormation, CloudWatch, RDS, DynamoDB, S3, Lambda.
Strong proficiency in Terraform (HCL) - including workspaces, modules, and Terraform Cloud or similar.
Ansible experience - developing roles, building dynamic inventories, and managing remote configurations.
Strong scripting knowledge (Bash, Python, or Go).
Experience with container orchestration and deployment (Docker, ECS, or Kubernetes).
Proficient with GitOps or IaC-based workflows.
SRE Mindset
Familiarity with Google SRE practices, particularly around reliability, observability, and operational excellence.
Understanding of systems reliability metrics and associated tooling.
Soft Skills & Behaviours
Self-driven with a bias toward action and ownership.
Excellent communicator, able to collaborate across disciplines and levels of technical understanding.
Experience working as part of a cross-functional team.
Comfortable working in agile environments (Scrum/Kanban).
Preferred Qualifications
AWS certification (e.g., Solutions Architect - Professional, DevOps Engineer - Professional).
Exposure to IaC tools (Terraform, Pulumi) or configuration management (Ansible, Puppet).
Experience in multi-cloud or hybrid-cloud environments (e.g., Azure, on-prem).
Background in high-availability environments.