Senior Site Reliability Engineer Job in finova

Senior Site Reliability Engineer

London, ENG, GB, United Kingdom

finova

Job Description

Senior Site Reliability Engineer - London

About Finova

Finova is the UK's largest mortgage and savings technology provider, powering one in every five mortgages across the country. Our agile, cloud-native solutions help over 60 banks, building societies, specialist lenders and equity release providers -- plus a network of 2,400+ brokers -- stay ahead of the market.

We offer a flexible, proven suite of software that covers the full customer journey -- from mortgage and savings origination to servicing and CRM. Backed by an open architecture and a team with deep industry expertise, our platform is built to scale. Today, we process over £50 billion in loans each year, manage nearly £50 billion in savings, and support the digital servicing of more than 650,000 UK borrower accounts.

For Lenders

Finova offers a flexible, modular technology suite designed to help lenders move faster, scale efficiently, and deliver standout digital experiences. You can run your entire mortgage and savings business on Finova -- or just use the modules you need, tailored by our team or configured in-house through our low-code platform.

Our solutions include:

Lending - end-to-end mortgage and specialist lending software built for speed, flexibility, and scale.
Decisioning - powerful tools for smarter, more personalised lending decisions.
Servicing - intuitive workflows and automation to simplify day-to-day account management and customer servicing.
Savings - configurable onboarding and customer engagement tools across all savings products.
Intermediary Manager - broker relationship and compliance management, powered by real-time external data.
Broker ID - fast, automated broker verification and compliance tracking using KYB, KYC, and live monitoring from public data sources.

Lenders use Finova to launch products faster, process applications up to 50% more efficiently, and reduce operational costs -- all while staying fully compliant in a fast-moving market.

About the Role:

We are seeking an experienced SRE to spearhead the Site Reliability Engineering function. As an SRE, you will be responsible for the availability, scalability, and performance of our core systems, with a particular focus on monitoring our .NET applications deployed in cloud environments such as AKS, EKS, App Services, and VMs. You will work independently and collaboratively with various engineering teams to ensure our systems meet the highest standards of reliability and operational excellence.

What will you be doing?

Monitoring and Performance Optimization:

Design, implement, and maintain robust monitoring and alerting systems for .NET applications running in AKS, EKS, App Services, and VMs.
Analyse system performance metrics, establish baselines, identify bottlenecks, and implement improvements for scalability and efficiency.
Set up, configure, and optimize observability tools (e.g., Prometheus, Grafana, Datadog) to monitor key system metrics, logs, and traces.

Reliability and Incident Management:

Ensure high availability and disaster recovery for all critical systems.
Lead incident response efforts and post-incident analysis to mitigate recurrence and improve system resilience.
Develop and maintain SLOs, SLIs, and error budgets, ensuring services meet agreed-upon reliability targets.

Automation and Infrastructure Management:

Automate routine tasks and processes to improve efficiency and reduce manual errors.
Work with infrastructure-as-code tools (e.g., Terraform, Ansible, Bicep) to manage cloud resources effectively.
Collaborate with DevOps and CloudOps teams to build and deploy infrastructure using CI/CD pipelines (e.g., Azure DevOps, GitLab CI).

Collaboration and Mentorship:

Work closely with product development teams to ensure smooth application releases and system performance.
Provide mentorship and guidance to junior SREs and engineers.
Drive best practices in reliability, monitoring, and incident management across the engineering organization.

Continuous Improvement:

Identify areas for improvement in our infrastructure, monitoring, and reliability practices.
Stay up-to-date with industry trends, tools, and technologies to continuously improve our operational processes.

About You:

In terms of your experience, your attitude is everything, but we'd particularly love to see:

5+ years of experience in Site Reliability Engineering, DevOps, or Systems Engineering, with a strong focus on monitoring, alerting and incident management.
Hands-on experience monitoring .NET applications in production environments, preferably using tools like Grafana, Datadog and Azure Monitor.
Extensive experience with AKS, EKS, App Services, and VMs in cloud environments (AWS, Azure).
Proven ability to work independently and manage multiple projects in a fast-paced environment.

Technical Skills:

Strong proficiency in cloud platforms (AWS, Azure), container orchestration (Kubernetes, AKS, EKS), and microservices architecture.
Proficiency in infrastructure-as-code tools like Terraform, Azure Resource Manager, or similar.
Experience with monitoring and observability tools such as Prometheus, Grafana, and Datadog.
Strong scripting skills (e.g., PowerShell, Bash, Python).

Soft Skills:

Excellent communication skills, both verbal and written, with the ability to convey complex technical concepts to non-technical stakeholders.
Strong problem-solving abilities and the ability to troubleshoot complex systems under pressure.
A proactive and collaborative approach to work, with the ability to lead by example.

Preferred Qualifications:

Experience with monitoring and maintaining financial services or FinOps platforms.
Certifications in cloud platforms (AWS Certified Solutions Architect, Azure DevOps, Certified Kubernetes Administrator).
Experience with scaling and maintaining high-performance systems with large data throughput.

What We Offer:

Hybrid working:

At Finova, we believe the best outcomes come from working together - and having the flexibility to work in a way that suits both our people and our business. We operate a hybrid working model, with most teams spending around three days a week in the office and with our customers. This time together helps us stay connected, collaborate more effectively, and solve complex challenges as a team. We also know that flexibility matters. Our approach is designed to support a healthy balance, combining in-person collaboration with the freedom to work remotely where it makes sense.

Holiday: 25 days holiday plus bank holidays, bank holiday trading and holiday purchase options, and the opportunity to work from anywhere in the world for up to 4 weeks per year.

Looking After You: Life Assurance, Group Income Protection, Private Medical Insurance, a pension scheme via Salary Exchange, an Employee Assistance Programme, and access to a Virtual GP.

Family-Friendly Policies: Enhanced maternity and paternity pay, as well as paid time off for fertility treatments and pregnancy loss.

Extra Perks: Cycle to Work Scheme, discounts on shops, restaurants, and gym memberships, free fresh fruit daily, and opportunities to join colleague networks and social groups.

Giving Back: One paid volunteering day annually and the Give-As-You-Earn scheme to support your favourite charities.

Equal Opportunity Statement

We value diversity and are committed to creating an inclusive environment for all employees. If you're passionate about this role but don't meet all the criteria, please reach out--we'd love to discuss how your skills and experiences align with our needs.

Job Detail

Job Id: JD3839007
Industry: Not mentioned
Total Positions: 1
Job Type: Full Time
Salary: Not mentioned
Employment Status: Full Time
Job Location: London, ENG, GB, United Kingdom
Education: Not mentioned
