Data Scientist Engineer

Ellesmere Port, ENG, GB, United Kingdom

Job Description




Hybrid, split between home and the Ellesmere Port office



Where you'll be working



The employer is the UK's pre-eminent provider of conveyancing panel management services, with 30 years of unrivalled experience delivering the highest-quality bespoke services to meet the unique needs of each of our clients.

We are revolutionising the way we do software; we are a few years into our exciting journey of transformation to become the best tech company in the North West and beyond. To do this, we want to discover better ways of developing software, and we are looking for people who want to help us get there.

We love Lean, Agile and XP but are always looking to explore new practices and improve the ways we work.

We like our teams to be autonomous, self-organising, value-driven and customer-centric.

The Job



We are seeking an experienced Mid-Level Data Science Engineer.

You will work closely with our Head of Data Science and our team of data scientists, data architects and data analysts to design and implement solutions that extract insights from complex unstructured text and image data.

This is a new initiative, meaning the role will involve researching and developing new AI-powered tools and products using cutting-edge developments in AI, machine learning, Natural Language Processing and Large Language Models (LLMs). You will work with senior colleagues to ensure that all work meets the legal and regulatory requirements applicable to the conveyancing market.

You will have the opportunity to work on diverse tasks including:

Advanced natural language processing, including topic modelling, bespoke entity and relationship recognition, text summarisation and ontology creation.
Large Language Model integration, from building conversational AI agents that respond to general queries to fine-tuning LLMs and using advanced RAG techniques to understand legal contracts and other documents and auto-generate technical legal responses.
Developing graph networks to provide 360-degree views of conveyancing and other data, client case management, and graphs to understand the property and home ownership network.

As the team grows there will be opportunities to develop systems that can process and understand complex planning images.

What You'll Do:



The core responsibilities for this role include (but are not limited to):

Prepare new data sources and conduct exploratory data analysis to identify patterns, trends, and methods to enhance the quality of our solutions and develop effective processing strategies and pipelines.
Design, develop and implement machine learning models to solve complex business problems, e.g. classification, clustering and decision trees.
Apply Natural Language Processing techniques and models to tasks such as string pattern matching (e.g. RegEx and fuzzy matching), text classification, named entity recognition, knowledge search, topic modelling and text summarisation.
Develop training and test datasets by using annotation tools to label text for complex ML/AI applications such as bespoke Named Entity Recognition.
Review annotated data for accuracy and completeness, making corrections as needed.
Develop methods to integrate and utilise transformer models, LLMs and advanced RAG techniques for natural language understanding and legal/technical text generation tasks.
Stay updated on emerging AI/ML technologies and evaluate their potential for enhancing our solutions.
Contribute to upskilling colleagues and clients by writing blogs, articles and presentations.

Qualifications



At least a 2:1 degree (preferably an MSc or PhD) in a quantitative subject, e.g. Data Science, Statistics, Mathematics or Computer Science. A degree in another relevant subject will be considered.
Evidence of at least two years' formal study in machine learning.

Required Skills:



At least 5 years' experience working as a data scientist, with proven experience in the following:

Solid understanding of statistics and machine learning theory, e.g. regression, clustering, decision trees and dimensionality reduction.
Experience with Natural Language Processing techniques and tasks, including fuzzy string searching (e.g. Levenshtein), entity and relationship extraction, POS tagging and lemmatization.
Experience with extracting text from highly unstructured documents (letters, legal contracts, forms, reports, legal quotes, articles etc.) in various file formats (PDFs, Word, Excel, images) leveraging Python and OCR techniques.
Highly proficient and experienced in Python and with NLP and deep learning libraries including TensorFlow, Keras, PyTorch, NLTK, spaCy, Hugging Face, Pandas, NumPy, Scikit-learn, OpenCV, Seaborn and Streamlit, and Gen AI frameworks including LangChain and LlamaIndex.
Experience working with relational databases and SQL-based query languages, and familiarity with Python toolkits for databases, e.g. SQLAlchemy.
Strong problem-solving skills and ability to handle complex challenges.
Software engineering and development experience, including unit testing, version control, code reviews, and containerisation technologies (Docker).
Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams.

Preferred Skills:



The following additional skills would be highly advantageous:

Experience with Large Language Models such as GPT, BERT or open-source models (e.g. Llama 2, Mistral) for natural language understanding and text generation tasks.
Experience with fine-tuning LLMs using PEFT/LoRA, context embeddings, vector database technologies etc.
Experience with knowledge graphs and ontologies.
Experience with other languages, e.g. C++, Java, JavaScript, R or Julia.
Experience with NoSQL databases, e.g. MongoDB, Neo4j and TigerGraph.

Benefits

23 days' annual leave
Casual dress
Additional holiday purchase scheme
Hybrid working
Medicash
Workplace pension
Duvet day
Childcare vouchers
Corporate rate gym membership
Bike to work scheme
Flexible working (hours and home)
Free on-site car parking
Life assurance
Perkbox
Volunteering with a charity day
Employee Assistance Programme

Your normal working hours will be 37.5 per week, from 9.00am to 5.30pm Monday to Friday, with a one-hour break for lunch to be taken flexibly. LMS operate a hybrid working model between head office and home working.

Job Types: Full-time, Permanent

Pay: 40,000.00-65,000.00 per year

Schedule:

Monday to Friday

Experience:

Machine Learning: 2 years (required)

Work authorisation:

United Kingdom (required)

Work Location: In person

Benefits:

Casual dress
On-site parking
Sick pay
Work from home

Schedule:

Monday to Friday
No weekends

Reference ID: Mid level Data Scientist - Hybrid
Expected start date: 02/06/2025


Job Detail

  • Job Id
    JD3161881
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type:
    Contract
  • Salary:
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    Ellesmere Port, ENG, GB, United Kingdom
  • Education
    Not mentioned