Hybrid, split between home and our Ellesmere Port office
Where you'll be working
The employer is the UK's pre-eminent provider of conveyancing panel management services with 30 years of unrivalled experience, delivering the highest quality bespoke services to meet the unique needs of each of our clients.
We are revolutionising the way we do software; we are a few years into our exciting journey of transformation to be the best tech company in the North West and beyond. To do this we want to discover better ways of developing software, and we are looking for people who want to help us get there.
We love Lean, Agile and XP but are always looking to explore new practices and improve the ways we work.
We like our teams to be autonomous, self-organising, value driven and customer centric.
The Job
We are seeking an experienced Mid-Level Data Science Engineer.
You will work closely with our Head of Data Science and our team of data scientists, data architects and data analysts to design and implement solutions that extract insights from complex unstructured text and image data.
This is a new initiative, so the role will involve researching and developing new AI-powered tools and products using cutting-edge advances in AI, machine learning, Natural Language Processing and Large Language Models (LLMs). You will work with senior colleagues to ensure that all work meets the legal and regulatory requirements applicable to the conveyancing market.
You will have the opportunity to work on diverse tasks including:
Advanced natural language processing, including topic modelling, bespoke entity and relationship recognition, text summarisation and ontology creation.
Large Language Model integration, from building conversational AI agents that respond to general queries to fine-tuning LLMs and using advanced RAG techniques to understand legal contracts and other documents and auto-generate technical legal responses.
Developing graph networks to provide 360-degree views of conveyancing and other data, client case management, and graphs to understand the property and home ownership network.
As the team grows there will be opportunities to develop systems that can process and understand complex planning images.
What You'll Do:
The core responsibilities for this role include (but are not limited to):
Prepare new data sources and conduct exploratory data analysis to identify patterns, trends, and methods to enhance the quality of our solutions and develop effective processing strategies and pipelines.
Design, develop and implement machine learning models to solve complex business problems, using techniques such as classification, clustering and decision trees.
Apply Natural Language Processing techniques and models to tasks such as string pattern matching (e.g. RegEx and fuzzy matching), text classification, named entity recognition, knowledge search, topic modelling and text summarisation.
Develop training and test datasets by using annotation tools to label text for complex ML/AI applications such as bespoke Named Entity Recognition. Review annotated data for accuracy and completeness, making corrections as needed.
Develop methods to integrate and utilise transformer models, LLMs and advanced RAG techniques for natural language understanding and legal/technical text generation tasks.
Stay updated on emerging AI/ML technologies and evaluate their potential for enhancing our solutions.
Contribute to upskilling colleagues and clients by writing blogs, articles and presentations.
Qualifications
At least a 2:1 degree (preferably an MSc or PhD) in a quantitative subject, e.g. Data Science, Statistics, Mathematics or Computer Science. A degree in another relevant subject will be considered.
Evidence of at least two years' formal study in machine learning.
Required Skills:
At least 5 years' experience working as a data scientist, with the following proven skills and experience:
Solid understanding of statistics and machine learning theory, e.g. regression, clustering, decision trees and dimensionality reduction.
Experience with Natural Language Processing techniques and tasks, including fuzzy string searching (e.g. Levenshtein distance), entity and relationship extraction, POS tagging and lemmatization.
Experience extracting text from highly unstructured documents (letters, legal contracts, forms, reports, legal quotes, articles etc.) in various file formats (PDF, Word, Excel, images) using Python and OCR techniques.
Highly proficient and experienced in Python and with NLP and deep learning libraries, including TensorFlow, Keras, PyTorch, NLTK, spaCy, Hugging Face, Pandas, NumPy, Scikit-learn, OpenCV, Seaborn and Streamlit, and with Gen AI frameworks such as LangChain and LlamaIndex.
Experience working with relational databases and SQL-based query languages, and familiarity with Python toolkits for databases e.g. SQLAlchemy.
Strong problem-solving skills and ability to handle complex challenges.
Software engineering and development experience, including unit testing, version control, code reviews, and containerisation technologies (Docker).
Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams.
Preferred Skills:
The following additional skills would be highly advantageous:
Experience with Large Language Models such as GPT, BERT or open-source models (e.g. Llama 2, Mistral) for natural language understanding and text generation tasks.
Experience with fine-tuning LLMs using PEFT/LoRA, context embeddings, vector database technologies etc.
Experience with knowledge graphs and ontologies.
Experience with other languages, e.g. C++, Java, JavaScript, R or Julia.
Experience with NoSQL databases e.g. MongoDB, Neo4j and TigerGraph.
Benefits
23 days annual leave
Casual Dress
Additional holiday purchase scheme
Hybrid Working
Medicash
Workplace Pension
Duvet Day
Childcare vouchers
Corporate rate gym membership
Bike to work scheme
Flexible working (Hours & Home)
Free on-site car parking
Life assurance
Perkbox
Charity volunteering day
Employee Assistance Programme
Your normal working hours will be 37.5 per week, from 9.00am to 5.30pm Monday to Friday, with a one-hour break for lunch to be taken flexibly. LMS operate a hybrid working model between head office and home working.
Job Types: Full-time, Permanent
Pay: £40,000.00-£65,000.00 per year
Schedule:
Monday to Friday
Experience:
Machine Learning: 2 years (required)
Work authorisation:
United Kingdom (required)
Work Location: In person
Benefits:
Casual dress
On-site parking
Sick pay
Work from home