You will join the team behind an internal AI platform for processing and interacting with unstructured data. The team is currently over 30 people strong and is organized into agile teams, each of which is self-sufficient and takes features from the idea stage through analysis, implementation, testing, production deployment, and maintenance. The team is international, with members located in Krakow, Wroclaw, London, and New York.
Responsibilities
Design, build, and maintain scalable data pipelines using Python and Azure Data Factory
Work with Azure SQL and PostgreSQL to ingest, transform, and store structured and unstructured data
Develop and optimize ETL/ELT processes for high-volume data workflows
Use Databricks to process large datasets and build data models for downstream AI/ML components
Collaborate with data scientists, backend engineers, and product teams to understand data requirements
Ensure data quality, integrity, and security across all stages of the data lifecycle
Manage infrastructure as code using Terraform for provisioning and maintaining cloud resources
Contribute to CI/CD practices using Azure DevOps for data pipeline deployments and versioning
Support analytics and reporting teams by enabling data access via Power BI or similar tools
Location
NY
Salary
100,000 - 120,000 USD Gross Per Year
Skills
Must have
7+ years of experience in a similar position
Strong programming skills in Python for data processing and scripting
Experience with Azure Data Factory (ADF) for building and orchestrating data pipelines
Proficiency in working with Azure SQL and PostgreSQL databases
Hands-on experience with Databricks for big data processing and transformation
Solid understanding of data engineering concepts: ETL/ELT, data modeling, data quality
Familiarity with infrastructure as code using Terraform
Experience with Azure DevOps for CI/CD pipelines and version control
Ability to work with unstructured data and integrate it into structured models
Experience in agile development environments and cross-functional teams
Good communication skills and ability to work in an international, distributed team
Nice to have
Experience with Power BI or other BI tools for data visualization and reporting
Knowledge of Spark and distributed data processing concepts
Familiarity with Delta Lake or similar data lakehouse architectures
Understanding of data governance, lineage, and cataloging tools (e.g. Azure Purview)
Basic knowledge of machine learning workflows or support for data science teams
Experience working with APIs for data ingestion or integration
Familiarity with containerization tools like Docker or Kubernetes
Exposure to monitoring and alerting tools for data pipeline health (e.g. Azure Monitor, Grafana)
Knowledge of data security best practices and compliance (e.g. GDPR, data encryption)
Prior experience working on AI-related or unstructured data projects
Other
Languages
English: C1 Advanced
Seniority
Senior
London, United Kingdom of Great Britain and Northern Ireland
Req. VR-116378
BI Engineering
BCM Industry
01/08/2025