Helical is building the in-silico labs for biology
Drug discovery still relies on wet labs: slow, expensive, and constrained by physical trial-and-error. Helical is changing that.
We build the application layer that makes Bio Foundation Models usable in real-world drug discovery, enabling pharma and biotech teams to run millions of virtual experiments in days, not years. Today, leading global pharma companies already use Helical, and we're at the start of a highly ambitious growth journey.
We're a founder-led, talent-dense team building a category-defining company from Europe. We care deeply about the quality of our work, move fast, and expect ownership. If you're excited by complexity, real responsibility, and shaping how a company actually operates as it scales, you'll feel at home here.
At Helical, we're focused on leveraging research to transform the future of drug discovery. We are seeking an Applied Research Engineer - Post-Training to join our team, focusing on maximizing the performance of cutting-edge foundation models in real-world applications.
Your Role
You will own the full post-training lifecycle for biological foundation models--from alignment strategy to production deployment. This means designing and running pipelines that transform general-purpose models into therapeutic-specific tools for our pharma clients. You'll work directly with real drug discovery problems: adapting models to disease areas, cell types, and perturbation contexts that matter for target identification, hit discovery, and beyond.
This isn't a support role. You'll make core technical decisions about how we extract value from foundation models--what to fine-tune, how to validate it biologically, and how to ship it to customers who are running experiments that inform real clinical programs. You'll collaborate closely with our ML infrastructure and biology teams, but you'll be the person responsible for whether our post-training actually works.
What You'll Do
Design and implement post-training pipelines that align biological foundation models to specific therapeutic contexts and client use cases.
Build validation frameworks that connect model improvements to biological ground truth--working with embeddings, perturbation data, and external resources like OpenTargets.
Own experiments end-to-end: from hypothesis through training runs on distributed GPU infrastructure to analysis and client delivery.
Collaborate with ML engineers on training infrastructure and with biologists on ensuring outputs are scientifically meaningful.
Contribute to our open-source tooling (helical-package) and help shape the technical direction of our post-training capabilities as we scale.
Stay at the frontier of post-training research and bring relevant advances into production.
Requirements
Essentials
MSc or PhD in Machine Learning, Computational Biology, or a related field--or equivalent depth gained through industry experience.
Hands-on experience with post-training techniques: fine-tuning, LoRA, DPO, RLHF, or similar alignment methods.
Strong proficiency in Python and PyTorch. You should be comfortable writing training loops, debugging distributed runs, and working directly with model internals.
Familiarity with transformer architectures and how they behave in practice--not just theory.
Experience designing and running experiments rigorously: tracking metrics, iterating systematically, and drawing valid conclusions from results.
Ability to work autonomously and make decisions with incomplete information. We're a small team; you'll own problems end-to-end.
Clear communication skills--you'll need to explain technical trade-offs to colleagues across ML, biology, and product.
Bonus Points
Experience with biological foundation models (Geneformer, scGPT, ESM, or similar) or computational biology more broadly.
Familiarity with drug discovery workflows, target identification, or perturbation biology.
Track record of shipping post-training improvements into production systems.
Experience with distributed training infrastructure (multi-GPU, multi-node, NCCL, DeepSpeed, FSDP).
Publications at ML or computational biology venues (NeurIPS, ICML, ICLR, Nature Methods, etc.).
Contributions to open-source ML tooling.