Data Engineer Forward Deployed Europe

London, ENG, GB, United Kingdom

Job Description


We're building the data backbone for Orbital, an industrial AI system that ingests and learns from complex refinery and process data in real time. As our Data Engineer, you'll architect and maintain pipelines that bring high-frequency time-series, lab, and historian data into a scalable Lakehouse architecture, usable for both deep learning models and real-time LLMs.




You'll be working across AWS (EKS, S3, EBS, KMS, CloudWatch) and Databricks/PySpark, ensuring data is contextualised, synchronised, and optimised for both deep learning models and real-time LLM workloads.




This isn't a traditional ETL role: you'll be solving problems at the intersection of control systems, industrial data engineering, and AI enablement.



Location:



Whilst you will be based in Europe and/or eligible to work here, this role will involve travel to other locations in India and the USA.



Core Responsibilities





Ingest & Contextualise Data




Ingest from OPC UA servers, process historians, IoT sensors, LIMS systems, alarms/events, and P&IDs. Map signals to their physical processes (tags, units, hierarchies) for interpretability in AI pipelines.




Data Movement & Accessibility




  • Build pipelines that handle real-time streaming and batch ingestion into the Lakehouse.
  • Manage synchronisation between historian archives, unstructured files, and AWS storage (S3/EBS).
  • Orchestrate Databricks Lakeflow/Connectors for integrating data into Lakebase/Lakehouse.
  • Handle secure, high-throughput transfers between historian archives and sandbox/live environments.




Change Tracking & Integrity




Detect and manage schema changes, signal drift, and inconsistencies across time. Implement lineage and audit trails across Spark/Databricks and AWS pipelines.



Data Preparation for AI




Build and maintain dual pipelines:
  • Training: large-scale historical data prep for time-series and LLM training.
  • Inference: low-latency, real-time pipelines for anomaly detection, optimisation, and LLM search.
Support heterogeneous AI workloads (time-series forecasting and retrieval-augmented LLMs).




Database Performance & Optimisation




  • Tune PostgreSQL and Spark for high-throughput time-series workloads (partitioning, indexing, query optimisation).
  • Optimise pipelines for both fast analytical queries and high-efficiency model training.
  • Deploy and manage data pipelines in AWS EKS (Kubernetes) with persistent EBS-backed storage.


Technical Requirements




  • Deep expertise in PostgreSQL (partitioning, indexing, query optimisation, storage design).
  • Strong proficiency in Python for data processing, scripting, and pipeline orchestration.
  • Hands-on experience with AWS (EKS, S3, EBS, IAM, KMS, CloudWatch, etc.) for secure and scalable data pipelines.
  • Proven ability to work with Databricks and PySpark for large-scale distributed data processing.
  • Familiarity with time-series industrial data (control systems, DCS/SCADA logs, process historians).
  • Experience in unstructured data sync and management within hybrid cloud/on-prem environments.
  • Bonus: Knowledge of streaming frameworks (Kafka, Flink, Spark Streaming) or MLOps stacks for data versioning and lineage.


What Success Looks Like




  • Live data streams are contextualised, queryable, and AI-ready.
  • Schema changes and signal drift are detected and handled without breaking downstream workflows.
  • Training and inference pipelines run smoothly in parallel, optimised for scale and latency.
  • AI teams can focus on modelling because the data backbone is robust, fast, and reliable.


Job Detail

  • Job Id
    JD3699418
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type:
    Full Time
  • Salary:
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    London, ENG, GB, United Kingdom
  • Education
    Not mentioned