Senior Site Reliability Engineer

Leicester, ENG, GB, United Kingdom

Job Description

Hybrid requirements: This role has flexible working patterns.
Home. There's no place like it.
And there's no feeling like helping people create the joy of feeling truly at home. At Dunelm, that's what we do. We're the UK's number one choice for homewares because we make home life lovelier for our customers. And we've crafted a workplace that feels just as welcoming - where you can bring your ideas, be yourself, and feel right at home.
Remaining first-choice for savvy homeware shoppers also involves making use of advanced technology. We have embraced serverless, event-driven architecture and orchestrated containerised applications, with our monolithic front-end currently being replaced by micro front ends. You'll be working with a talented and collaborative group of engineers and architects who care about quality and reliability.
Site Reliability Engineering
Our SRE team is a high-trust, high-impact group of engineers who bring software engineering principles to operational reliability. We are hands-on developers and systems thinkers who build scalable, observable, and resilient platforms.
We work closely with other Engineering, Data, Platform and Operations teams to help them build reliable, observable, and cost-effective systems. We lead incident response, improve deployment safety, and guide teams toward sustainable service ownership.
We process large volumes of telemetry data every day and are constantly evolving our approach to cost-efficient observability, adaptive sampling, and meaningful tracing. Observability is not a bolt-on - it is a first-class concern that shapes how we build and support systems across the business.
This is a hybrid role, with time split between working from home and our London or Leicester offices. We get together as a team one day a month, but there may be an expectation of other ad-hoc office days where necessary.

What You Will Be Doing


Observability and OpenTelemetry: Own and evolve our observability strategy across services. Lead how we collect, process, sample, and surface trace and metrics data using OpenTelemetry. Focus on high-signal telemetry that enables fast diagnosis, cost efficiency, and meaningful visibility across the stack.
SLOs, SLIs, and Service Ownership: Help teams define and adopt meaningful SLIs and SLOs. Guide product teams in using observability data to make reliability measurable.
Incident Response and Reliability Engineering: Lead on-call investigations when issues arise. Drive blameless post-incident reviews, recommending both mitigating actions that stem any losses and permanent technical fixes that prevent recurrence.
Infrastructure and Automation: Use infrastructure-as-code tools such as Pulumi, Terraform and CDK to model effective infrastructure in AWS and other PaaS and SaaS providers. Improve CI/CD pipelines and support safe deployment patterns, such as canary and blue-green releases.
Engineering and Development: Build automation and reliability tooling using well-structured, testable code. Contribute to shared libraries, observability components and internal platforms.
Mentoring and Team Growth: Support and coach other engineers. Lead technical discussions and share knowledge through pairing, planning, and documentation.
Continuous Learning and Innovation: Stay ahead of emerging practices in observability, resilience, and platform engineering. Lead team proof-of-concepts and introduce new patterns or tools that improve our platform.
Strategic Development: Contribute to prioritisation of the SRE roadmap. Help shape observability tooling, telemetry patterns, and platform-wide approaches to service ownership and reliability.
Aligning to Business Goals: Use observability insights to support product and platform goals. Ensure SRE priorities align with Dunelm's wider objectives for quality, performance, and customer experience.

What We'll Look For In You

Essential Skills


Solid experience with TypeScript or similar strongly typed programming language(s).
Proven ability to write idiomatic, pragmatic, and testable code, with strong, appropriate, automated testing.
Knowledge and understanding of OpenTelemetry tools, specification, APIs etc.
Excellent understanding of SRE principles, including embracing risk, service level objectives, eliminating toil, monitoring distributed systems, automation and release engineering
AWS expertise, including Lambda, ECS/Fargate, EC2, EventBridge, SQS, S3, DynamoDB and general networking principles
System administration knowledge - able to comfortably use a command line to navigate and troubleshoot a server or container running a Linux OS
Knowledge and experience configuring and using telemetry back-ends, such as Datadog and the Grafana stack.
Experience with infrastructure-as-code tools, such as Pulumi and Terraform
Familiarity with Kubernetes and how to deploy and monitor workloads running in k8s
Skilled in CI/CD pipelines (GitLab or similar) and build/test/deploy automation
Proven ability to lead incident response and post-incident review processes
Strong problem-solving mindset and attention to detail

Desirable Skills


Some experience in Rust or a similar compiled language, e.g. Go
Experience instrumenting and running OpenTelemetry in production at scale. Knowledge of distributed tracing and trace sampling
Experience reducing observability or cloud costs through architectural changes
Exposure to Google Cloud Platform (GCP)
Experience with Kubernetes observability, metrics exporters, or service mesh
Familiarity with challenges in the retail sector is a bonus but not expected

Behaviours and Values


Support and build trust with teammates, always assuming positive intent
Communicate clearly and share knowledge to build shared understanding
Stay curious, ask why, and always look to improve how things work
Embrace change, adapt quickly, and take on a variety of challenges
Drive innovation by looking for better ways forward and pushing for progress
Role tech stack
TypeScript
Rust
Golang

Life at Dunelm
Culture overview
We're here to help our customers create the joy of feeling truly at home. Join us and you'll find our caring and inclusive culture makes this a place you'll feel right at home too.
Learn
Wherever you work with us and in whatever role, you'll have every opportunity to keep on learning and keep on growing.
Thrive
We'll take care of you, and make sure your everyday needs are met, so you can focus on doing a great job and being the best version of you.
Belong
We embrace diversity in all its forms. We'll celebrate the individual you are and value the unique contribution you bring.
Colleague Networks
All of our colleagues have the opportunity to be part of our four colleague networks. These are Disability & Neurodiversity, LGBTQ+, Gender Equality and Ethnicity & Race. Each network has co-chairs and an exec sponsor who work closely with us to ensure that we are a workplace where everyone feels supported, celebrated, valued and heard.
A chance to give something back
We're serious about our role in society. Each of our stores is partnered with a local charity and has its own community Facebook page. And we offer our Pausa Cafes for free to local community groups. We're also proud partners of the mental health charities Mind (England and Wales), SAMH (Scotland) and Inspire (Northern Ireland). And each year, we'll give you a day's paid leave to support a charity that matters to you.
Work your way
We have adapted our ways of working to make sure everyone can feel at home wherever they work. For many colleagues at our Head Office in Leicester and our Central London hub, that now includes working on a hybrid basis, combining days in the office with time spent working at home or elsewhere across the business.
Employee benefits
Bonus Scheme
Childcare Vouchers
Flexible Working
Free Parking
Laptop
Learning Allowance
Life Insurance
Pension
Private Healthcare
Share Options
Wellbeing Programme
Office vibe
BIRTHDAY OFF
CITY CENTRE
HACKATHONS
OFFICE DOG
OPEN PLAN
SOCIAL EVENTS
Tech at Dunelm
Leadership
John Gahagan
Chief Technology and Information Officer
Tech overview
Our Tech, Digital and Data teams are transforming literally every aspect of our business - from the way we manage and make use of our data, to the relationships we share with our customers. Already, their impact has been felt across the business, and indeed by our customers. But this is just the start and we know there are bigger opportunities ahead.
Check out our tech blog for tales behind our talented teams: https://engineering.dunelm.com/

Keep on growing
Join us on the tech side and you'll have access to a huge array of learning and development opportunities, including a variety of internally created workshops and externally accredited courses. We also have a substantial tech-specific budget to fund e-Learning licenses, conference visits, resources, and qualifications, plus dedicated mentors, well-being buddies and a wide range of network groups to support you as you progress.

Engineering principles
AGILE PROCESS
CODE REVIEWS
COMMUNICATION AND COLLABORATION
CONTINUOUS DELIVERY
CONTINUOUS DEVELOPMENT
CONTINUOUS INTEGRATION
INFRASTRUCTURE AS CODE
MENTORING
MICRO SERVICES
PAIR PROGRAMMING
SCRUM
TEST DRIVEN DEVELOPMENT
UNIT TESTING
Company tech stack
JavaScript
AWS Lambda
GraphQL
React
TypeScript
Jest
Node.js
SQL
Python
Java


Job Detail

  • Job Id: JD3455094
  • Industry: Not mentioned
  • Total Positions: 1
  • Job Type: Full Time
  • Salary: Not mentioned
  • Employment Status: Permanent
  • Job Location: Leicester, ENG, GB, United Kingdom
  • Education: Not mentioned