Description and requirements
As a Network Production Engineer, you will be a critical member of the team responsible for the full lifecycle of the global network infrastructure that supports Bloomberg's core products and services. This includes building and maintaining a network that is scalable, reliable and robust. Our network is vast, connecting several large-scale Data Centers and over a hundred edge sites. It connects Bloomberg to hundreds of thousands of our clients and to more than 1,500 global exchanges and trading venues via private connectivity, the Internet and Public Cloud.
This is a unique opportunity to help build robust, highly scalable solutions that will power the future of how Bloomberg automates network infrastructure. You'll be trusted to design and work on tooling that builds on automation best practices and principles.
We'll trust you to
Develop and maintain software tools to manage a large-scale, multi-vendor network with an emphasis on automation, telemetry, and model-driven infrastructure as code.
Automate the full network lifecycle, including provisioning, configuration, observability, testing, troubleshooting, and capacity planning.
Collaborate with architecture and design teams and the CTO office to implement new technologies that ensure scalability, efficiency, and operational resilience.
Develop tools and platforms that enhance the observability, reliability, and performance of the production network.
Enhance existing monitoring and observability frameworks, integrating intelligent alerting and self-remediation capabilities to reduce manual intervention and improve incident response.
Define and measure service-level objectives (SLOs) to track infrastructure performance and reliability.
Write software utilizing orchestration systems to automate tasks and interact with other systems.
Provide mentorship to junior engineers and promote software engineering best practices throughout the team.
Practice and promote the use of a modern software development lifecycle.
You need to have
Extensive experience as a Software, Network Production, or System Reliability Engineer.
Experience building, maintaining and continuously enhancing the automation needed to run the network infrastructure scalably and efficiently.
Experience with infrastructure automation and orchestration frameworks, e.g. Ansible, Airflow, Terraform, Chef, Salt.
Proven experience with object-oriented programming languages, preferably Python.
A bachelor's or master's degree in Computer Science, Engineering, Mathematics or a similar field of study, or equivalent work experience.
We'd love to see
Experience managing and automating network devices at scale, such as Juniper, Nokia, Arista, Cisco, Whitebox, etc.
An understanding of various network architectures across Internet, Public Cloud, Private Networks, DWDM and Optical Networking, and Data Centre build and design fundamentals.
Experience with network modelling
Eagerness to learn new technologies and mentor others
Experience with Telemetry: Splunk, Grafana, Humio
Experience with continuous integration and deployment tools
Experience implementing, maintaining and troubleshooting MPLS, BGP, OSPF, IGMP and PIM, including related internal and external network routing issues, in a production environment
Knowledge of message queues such as Kafka, RabbitMQ, etc.