QA Engineer, Load Testing Specialist (2-Month Contract)

London, ENG, GB, United Kingdom

Job Description

Position Overview


---------------------


Monolith AI is seeking an experienced QA Engineer to lead load testing efforts for a critical system release focused on improving concurrency and high request load handling. This fast-paced, short-term engagement requires someone who can quickly understand complex distributed systems, design comprehensive load tests, and work collaboratively with a rapidly growing engineering team to ensure our new environment meets performance requirements.

Primary Responsibilities


----------------------------

Design and Implement Automated Load Testing Framework

  • Develop comprehensive load tests for FastAPI endpoints, Temporal workflows/activities, and AWS service interactions
  • Create realistic test scenarios simulating concurrent workflow execution patterns, including graph-based workflow orchestration
  • Build automated test suites that measure system behavior under varying concurrency levels and request loads

Performance Analysis and Bottleneck Identification

  • Monitor and analyze system performance across the entire stack (API layer, Temporal workers, AWS services)
  • Identify concurrency limitations in Temporal workflow execution, AWS service limits (Athena, ECS), and inter-component communication
  • Document performance characteristics including response times, throughput limits, and failure modes under load

Collaborate on Non-Functional Requirements (NFR) Definition

  • Work with Customer Success and Product teams to understand business requirements and translate them into measurable performance criteria
  • Iterate on acceptable concurrency thresholds, latency targets, and throughput requirements
  • Validate that proposed NFRs are realistic and achievable given architectural constraints

System Documentation and Knowledge Extraction

  • Build an understanding of the existing system through code review, discussions with the development team, and exploratory testing
  • Create clear documentation of test methodologies, results, and recommendations for future testing

Recommendation and Optimization Guidance

  • Provide actionable recommendations for removing identified bottlenecks
  • Suggest configuration optimizations for Temporal (worker pools, task queues) and AWS services (Athena concurrency, ECS capacity)

Rapid Communication and Status Reporting

  • Maintain daily/frequent communication with the Tech Lead regarding project progress, blockers, and findings
  • Quickly escalate issues that could impact the aggressive timeline
  • Present findings and recommendations to technical and non-technical stakeholders

Cross-Component Integration Testing

  • Test complex scenarios involving graph execution triggering node workflows across multiple system boundaries
  • Validate S3 read/write operations under concurrent load
  • Ensure inter-component communication (API → Temporal, Temporal Activity → API triggers) performs reliably at scale

Key Performance Indicators


------------------------------

Test Coverage and Execution

  • Complete an automated load test suite covering all critical components within the first 3 weeks
  • Execute baseline and progressive load tests to identify maximum sustainable concurrency levels

Bottleneck Identification and Impact

  • Identify and document the top 5-7 performance bottlenecks with clear impact analysis
  • Provide actionable remediation recommendations with estimated effort and impact for each bottleneck


NFR Definition and Validation

  • Collaborate with stakeholders to define measurable NFRs within the first 2 weeks
  • Validate that the system meets the agreed NFR criteria, or document the gaps against them, by project end

Documentation and Knowledge Transfer

  • Deliver comprehensive test documentation, results analysis, and system performance characteristics
  • Conduct knowledge transfer sessions so the team can maintain and extend the testing framework

Project Velocity and Communication

  • Meet weekly milestone targets in this fast-paced 2-month engagement
  • Maintain a proactive communication rhythm (daily standups, weekly detailed reports to Tech Lead)

Required Qualifications


---------------------------

Experience:



  • 4+ years of experience in QA/performance testing roles
  • 2+ years of hands-on experience with load testing distributed systems and microservices architectures
  • Proven experience with load testing tools (e.g., k6, JMeter, Locust, Gatling, Artillery)
  • Experience testing workflow orchestration systems (Temporal, Airflow, Prefect, or similar)
  • Demonstrated ability to test systems integrating with AWS services (particularly Athena, ECS, S3)

Technical Skills:



  • Strong proficiency in Python (required for test automation and working with FastAPI/Temporal)
  • Experience with REST API testing and performance validation
  • Understanding of distributed systems concepts: concurrency, queueing, backpressure, rate limiting
  • Familiarity with AWS infrastructure and service limits
  • Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, or similar)
  • Proficiency with Git and CI/CD pipelines
  • Ability to read and understand code in order to design effective tests

Immediate Availability:



  • Ability to start in early January 2025 and commit to a focused 2-month engagement
  • Availability for full-time contract work during the project duration

Preferred Qualifications


----------------------------

  • Direct experience with Temporal.io (workflows, activities, workers)
  • Experience with containerized workloads and Docker/ECS
  • Prior work in fast-paced startup or scale-up environments
  • Experience with infrastructure-as-code (Terraform, CloudFormation)
  • Background in Site Reliability Engineering (SRE) or DevOps practices
  • Familiarity with data processing pipelines and analytics systems
  • Previous contract/consulting experience with rapid knowledge acquisition
  • Experience with graph-based workflow systems or DAG execution engines
  • Knowledge of AWS service limits and optimization strategies

Essential Soft Skills


-------------------------

Self-Direction and Initiative:



  • Ability to operate independently in an ambiguous, fast-moving environment with minimal documentation
  • Proactive problem-solving mindset; doesn't wait for perfect information before taking action
  • Comfortable making pragmatic decisions quickly in a time-constrained project

Communication and Collaboration:



  • Exceptional communication skills for extracting knowledge through conversations with existing team members
  • Ability to translate technical findings into clear, actionable recommendations for diverse audiences
  • Comfortable asking clarifying questions and challenging assumptions respectfully
  • Strong written communication for documentation and status updates

Adaptability and Learning Agility:



  • Quick learner who can rapidly understand complex, poorly documented systems
  • Flexible and comfortable with changing priorities in a 15-person team that's doubling in size
  • Thrives in fast-paced environments with aggressive timelines

Pragmatism and Results Orientation:



  • Focused on delivering practical, actionable outcomes within tight timeframes
  • Understands the balance between thoroughness and speed in a 2-month engagement
  • Comfortable with "good enough" when perfect isn't achievable within constraints

Stakeholder Management:



  • Skilled at managing expectations with technical leadership about realistic timelines and trade-offs
  • Diplomatic when delivering difficult news about performance limitations or bottlenecks
  • Collaborative approach when working with CS and Product on NFR definition

Key Challenges in This Role


-------------------------------

Rapid Knowledge Acquisition with Limited Documentation

  • The existing system lacks comprehensive documentation, requiring you to quickly build understanding through code review, system exploration, and frequent discussions with the development team
  • Success requires comfort with ambiguity and strong investigative skills

Aggressive Timeline with High Impact

  • A 2-month timeline to design tests, execute comprehensive load testing, identify bottlenecks, and deliver actionable recommendations is extremely tight
  • Must balance thoroughness with pragmatism; prioritize ruthlessly to ensure critical areas are covered

Complex Distributed System with Multiple Integration Points

  • The system involves multiple layers (FastAPI, Temporal, AWS services) with complex inter-component communication patterns (graph → node workflows)
  • Must understand the entire stack sufficiently to design realistic, comprehensive load tests that expose real-world bottlenecks


Job Detail

  • Job Id
    JD4540899
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type:
    Full Time
  • Salary:
    Not mentioned
  • Employment Status
    Full Time
  • Job Location
    London, ENG, GB, United Kingdom
  • Education
    Not mentioned