Vacancy Name
 Production Engineering Manager  
Vacancy No
 VN1363  
Employment Type
 Regular Full-Time  
Location
 London  
Job Details
We're seeking a Production Engineering Manager to join our team. This is a hands-on leadership role where you'll be responsible for the stability and performance of complex, high-availability trading and financial systems. You'll lead by example, working alongside your team to resolve critical issues and drive continuous improvement.  
Responsibilities:
Lead and mentor a team of production support engineers, fostering a culture of excellence and ownership.
Act as a hands-on leader, actively participating in the troubleshooting and resolution of complex incidents, and performing root cause analysis.
Manage the entire incident lifecycle, from initial detection and triage to resolution and post-mortem analysis, ensuring minimal impact on our trading and financial operations.
Drive problem management efforts to identify and resolve the root cause of recurring incidents, preventing future outages and improving system reliability.
Develop and maintain comprehensive support procedures, documentation, and runbooks to streamline operations and ensure a consistent response to issues.
Drive service improvement initiatives, leveraging ITIL principles to enhance our incident, problem, change, and release management processes.
Collaborate closely with development, quality assurance, and infrastructure teams to identify and address systemic issues, improve system resilience, and ensure smooth deployments.
Monitor system performance and health, proactively identifying potential issues and implementing solutions to prevent future incidents.
Manage scheduling and on-call rotations to ensure 24/7 support coverage for critical systems.
Qualifications:
10+ years of experience in production support, with a strong background in supporting complex, high-availability trading and financial systems.
Proven hands-on leadership experience, demonstrating the ability to lead a team while actively contributing to technical tasks.
Extensive experience with ITIL Service Management principles and practices is a must. Certification is a significant plus.
Technical proficiency in one or more programming languages (e.g., Python, Java, C#) and experience with SQL and/or NoSQL databases.
Solid understanding of modern software architecture, microservices, and cloud technologies (e.g., AWS, Azure, GCP).
Exceptional problem-solving and critical-thinking skills, with the ability to quickly diagnose and resolve complex technical issues under pressure.
Excellent communication and interpersonal skills, with the ability to effectively collaborate with a wide range of stakeholders, from technical teams to business users.
Experience with monitoring and logging tools such as Splunk, ELK Stack, Prometheus, or Grafana.
Working Hours: 40/week, Monday-Friday. Hybrid: 3 days in-office.  
Please submit your CV in English. Only shortlisted candidates will be contacted for an interview.  
All Stratos Market Limited employees must be eligible to work in the United Kingdom.
Prior to submitting your resume, the firm requests that you do the following:
Review the firm's website thoroughly at https://www.tradu.com/uk/
Company Description  
Tradu is a new multi-asset global trading platform and is part of the Stratos group of companies. Tradu, built by traders for traders, provides the most sophisticated traders with a serious platform that allows them to move easily between asset classes such as stocks, CFDs and crypto, depending on the regulations that govern the trader's market.
Equal Opportunity Employer               