Enterprise Tooling and Observability Lead - AWS

London, ENG, GB, United Kingdom

Job Description

Tasks






Location:

London



Duration: 2+ months, with possible extension



Role Overview



We are looking for an experienced Enterprise Tooling & Observability Lead to drive the strategy, design, implementation, and modernization of enterprise monitoring, logging, APM, and operational tooling during and after large-scale on-prem to AWS cloud migrations.



The ideal candidate brings deep expertise across observability platforms, infrastructure/application monitoring, cloud-native operations, and integration of enterprise tools into cloud architectures. This role ensures a seamless migration of tooling capabilities, enhanced visibility, and improved reliability in the AWS operating model.



Key Responsibilities



1. Tooling & Observability Strategy Across Migration Lifecycle



Define the end-to-end tooling & observability architecture that supports pre-migration, migration, and post-migration operations. Assess current on-prem tooling (monitoring, logging, APM, ITSM, event management) and define cloud-aligned target tooling architecture for AWS. Build a unified observability roadmap covering metrics, logs, traces, dashboards, SLO/SLA monitoring, and event correlation.

2. Migration-Aware Observability Design



Identify tooling gaps that may arise during the migration of applications, networks, storage, and infrastructure. Ensure instrumentation readiness for applications moving via lift & shift, replatforming, containerization, and modernization. Define observability patterns for hybrid connectivity, multi-account AWS environments, and multi-region workloads.

3. AWS Cloud-Native Observability Integration



Design and implement observability using AWS-native capabilities such as:
  • CloudWatch, CloudTrail, X-Ray, VPC Flow Logs, GuardDuty, Security Hub
  • Integration with AWS Control Tower/Organizations for enterprise-wide visibility

Ensure seamless integration with third-party enterprise tools such as:
  • Datadog, Dynatrace, AppDynamics
  • Splunk/ELK
  • Prometheus/Grafana
  • ServiceNow, Jira, PagerDuty

Drive modernization of legacy monitoring solutions to cloud-native ecosystems.

4. Tooling Consolidation & Optimization



Evaluate existing tooling footprint and identify opportunities for consolidation, cost reduction, and simplification. Standardize tooling patterns and create reusable templates/playbooks for AWS workloads. Drive automation for alerting, dashboards, health checks, and operational insights.

5. Reliability, Performance & SRE Alignment



Collaborate with platform and SRE teams to enhance observability maturity (SLIs, SLOs, error budgets). Build proactive monitoring capabilities to reduce incidents, improve MTTR, and support predictive operations. Ensure the observability platform aligns with enterprise DR, HA, and performance engineering strategies.

6. Governance, Security & Compliance



Ensure observability tooling adheres to enterprise security, compliance, data governance, and access control policies. Define audit-ready logging strategies and ensure end-to-end traceability across hybrid and cloud environments. Build governance models for event noise reduction, alert hygiene, and service mapping accuracy.

7. Leadership & Stakeholder Management



Lead cross-functional teams across application, infrastructure, cloud, DevOps, and security functions. Serve as the primary SME for observability decisions, guiding teams through architectural design and implementation. Present observability strategy, migration readiness, platform health, and maturity improvements to senior leadership. Mentor engineers and drive capability uplift across the organization.

Requirements



Required Skills & Experience



Technical Expertise



14+ years of experience in enterprise monitoring, logging, APM, and observability tooling. Strong understanding of AWS architecture, cloud-native monitoring tools, and hybrid observability.

Experience with:
  • APM platforms: Dynatrace, AppDynamics, Datadog
  • Logging platforms: Splunk, ELK/OpenSearch, CloudWatch Logs
  • Metrics & telemetry: Prometheus, Grafana, OpenTelemetry
  • Event management: ServiceNow, PagerDuty, Moogsoft, BigPanda

Strong knowledge of instrumentation for distributed systems, microservices, containers (EKS, ECS), serverless workloads, and legacy systems.

Migration & Architecture Skills



Proven experience supporting large-scale on-prem to AWS migrations. Deep understanding of migration patterns and observability dependencies. Hands-on experience designing observability for multi-account AWS landing zones and multi-region architectures.

Preferred Qualifications



  • AWS Certified Solutions Architect / Cloud Practitioner / DevOps Engineer
  • Certifications in observability platforms (Datadog, Dynatrace, Splunk, etc.)
  • Knowledge of ITIL, SRE principles, and enterprise operational frameworks
  • Experience with automation using Python, Terraform, CloudFormation (nice-to-have)



Job Detail

  • Job Id
    JD4580962
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type:
    Full Time
  • Salary:
    Not mentioned
  • Employment Status
    Full Time
  • Job Location
    London, ENG, GB, United Kingdom
  • Education
    Not mentioned