This is a foundational, high-impact role at the core of Convergent's AI platform. As a Data Scientist & Data Engineer, you'll own the end-to-end data and experimentation backbone that powers our adaptive simulations and human-AI learning experiences. You'll build reliable pipelines, define data products, and run rigorous analyses that translate real-world interactions into measurable improvements in model performance, user outcomes, and product decisions.
You will
Partner with product, AI/ML, cognitive science, and frontend teams to turn raw telemetry and user interactions into decision-ready datasets, metrics, and insights.
Design and build production-grade data pipelines (batch + streaming) to ingest, transform, validate, and serve data from product events, simulations, and model outputs (see the pipeline sketch after this list).
Own the analytics layer: event schemas, data models, semantic metrics, dashboards, and self-serve data tooling for the team (see the event-schema sketch after this list).
Develop and maintain offline/online evaluation datasets for LLM-based experiences, e.g. quality, safety, latency, and user-outcome metrics (see the evaluation sketch after this list).
Build experiment measurement frameworks: A/B testing design, guardrails, causal inference where applicable, and clear readouts for stakeholders (see the A/B test sketch after this list).
Create feature stores / feature pipelines and collaborate with ML engineers to productionize features for personalization, ranking, and adaptive learning (see the feature sketch after this list).
Implement data quality and observability: anomaly detection, lineage, SLAs, automated checks, and incident response playbooks (see the data-quality sketch after this list).
Support privacy-by-design and compliance: PII handling, retention policies, and secure access controls across the data stack.
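To give you a concrete flavor of this work, a few illustrative sketches follow. All schemas, names, and thresholds in them are hypothetical stand-ins, not our production code. First, a minimal batch version of the ingest, transform, validate, serve flow from the pipelines bullet:

```python
# Minimal batch sketch of an ingest -> transform -> validate -> serve flow.
# Event fields, thresholds, and names are illustrative, not a real schema.
from datetime import datetime, timezone

RAW_EVENTS = [  # stand-in for events pulled from the product event stream
    {"user_id": "u1", "event": "sim_step", "latency_ms": 120, "ts": "2024-05-01T12:00:00+00:00"},
    {"user_id": "u2", "event": "sim_step", "latency_ms": 95, "ts": "2024-05-01T12:00:05+00:00"},
    {"user_id": "u3", "event": "sim_step", "latency_ms": None, "ts": "2024-05-01T12:00:09+00:00"},
]

def transform(events):
    """Parse timestamps into aware UTC datetimes; keep only needed fields."""
    return [
        {
            "user_id": e["user_id"],
            "event": e["event"],
            "latency_ms": e["latency_ms"],
            "ts": datetime.fromisoformat(e["ts"]).astimezone(timezone.utc),
        }
        for e in events
    ]

def validate(rows):
    """Data-quality gate: drop incomplete rows; fail the run if too many."""
    good = [r for r in rows if r["latency_ms"] is not None]
    if len(good) < 0.5 * len(rows):  # arbitrary completeness threshold for the sketch
        raise ValueError("completeness below threshold; failing the run")
    return good

def serve(rows):
    """Stand-in for a warehouse load (e.g., a BigQuery or Snowflake insert)."""
    print(f"loaded {len(rows)} rows")

serve(validate(transform(RAW_EVENTS)))  # prints "loaded 2 rows"
```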
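Owning event schemas can mean maintaining one explicit contract per event so downstream models and metrics stay stable. A sketch using a typed contract; the event and its fields are assumptions for illustration:

```python
# Sketch of a typed event schema for the analytics layer.
# The event name and fields are hypothetical examples.
from typing import Literal, TypedDict

class SimStepEvent(TypedDict):
    """One step of an adaptive simulation, as emitted by the product."""
    event_name: Literal["sim_step"]
    user_id: str
    session_id: str
    step_index: int
    outcome: Literal["correct", "incorrect", "skipped"]
    latency_ms: int
    ts: str  # ISO-8601, UTC

def is_valid(e: SimStepEvent) -> bool:
    """Cheap runtime check to enforce the contract at ingestion time."""
    return e["event_name"] == "sim_step" and e["latency_ms"] >= 0 and e["step_index"] >= 0

evt: SimStepEvent = {
    "event_name": "sim_step", "user_id": "u1", "session_id": "s1",
    "step_index": 3, "outcome": "correct", "latency_ms": 140,
    "ts": "2024-05-01T12:00:00+00:00",
}
assert is_valid(evt)
```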
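One plausible shape for an offline evaluation set and its metric roll-up for an LLM-based experience; the 1-5 quality rubric, fields, and pass criteria are illustrative assumptions:

```python
# Sketch of an offline eval set for an LLM-based experience plus a roll-up.
# Record fields, the 1-5 quality rubric, and pass criteria are assumptions.
from statistics import mean

EVAL_SET = [
    {"prompt": "explain the simulation step", "quality": 4, "unsafe": False, "latency_ms": 820},
    {"prompt": "summarize the session", "quality": 5, "unsafe": False, "latency_ms": 640},
    {"prompt": "off-policy request", "quality": 2, "unsafe": True, "latency_ms": 910},
]

def evaluate(records, quality_floor=3.5, latency_slo_ms=1000):
    """Aggregate per-example labels into release-gating metrics."""
    avg_quality = mean(r["quality"] for r in records)
    return {
        "avg_quality": avg_quality,
        "unsafe_rate": sum(r["unsafe"] for r in records) / len(records),
        "latency_slo_hit_rate": sum(r["latency_ms"] <= latency_slo_ms for r in records) / len(records),
        "passes": avg_quality >= quality_floor and not any(r["unsafe"] for r in records),
    }

print(evaluate(EVAL_SET))  # avg_quality ~3.67, unsafe_rate ~0.33, passes=False
```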
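For experiment measurement, here is a self-contained two-sample proportion z-test of the kind an A/B readout might rely on; the conversion counts are made up:

```python
# Two-sample proportion z-test: the kind of readout an experiment framework
# might produce. Sample counts below are made up for illustration.
from math import erf, sqrt

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Return the z statistic and two-sided p-value for two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided, normal approximation
    return z, p_value

z, p = two_proportion_ztest(conv_a=480, n_a=5000, conv_b=545, n_b=5000)
print(f"z={z:.2f}, p={p:.4f}")  # compare against a pre-registered alpha
```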
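A feature-pipeline sketch: aggregating raw interaction events into per-user features that could then be productionized with ML engineers. Feature names and semantics are assumptions:

```python
# Sketch of a feature pipeline step: turn raw interaction events into a
# per-user feature row. Feature names and event semantics are hypothetical.
from collections import defaultdict

def build_user_features(events):
    """Aggregate session events into per-user features for personalization/ranking."""
    per_user = defaultdict(lambda: {"sessions": 0, "total_steps": 0, "correct": 0})
    for e in events:
        f = per_user[e["user_id"]]
        f["sessions"] += e["event"] == "session_start"
        f["total_steps"] += e["event"] == "sim_step"
        f["correct"] += e.get("outcome") == "correct"
    return {
        uid: {**f, "accuracy": f["correct"] / f["total_steps"] if f["total_steps"] else 0.0}
        for uid, f in per_user.items()
    }

events = [
    {"user_id": "u1", "event": "session_start"},
    {"user_id": "u1", "event": "sim_step", "outcome": "correct"},
    {"user_id": "u1", "event": "sim_step", "outcome": "incorrect"},
]
print(build_user_features(events))  # {'u1': {..., 'accuracy': 0.5}}
```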
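Finally, an automated data-quality check in the spirit of the observability bullet: flagging a daily row count that deviates sharply from its trailing history. The z-score threshold is arbitrary:

```python
# Sketch of an automated data-quality check: flag a daily row count that
# deviates sharply from its trailing history. The threshold is illustrative.
from statistics import mean, stdev

def row_count_anomaly(history, today, z_threshold=3.0):
    """True if today's count is > z_threshold stdevs from the trailing mean."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > z_threshold

history = [10_120, 10_340, 9_980, 10_210, 10_050, 10_400, 10_180]  # last 7 days
assert not row_count_anomaly(history, today=10_260)
assert row_count_anomaly(history, today=4_300)  # e.g., an upstream ingest failure
print("data-quality checks passed")
```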
Requirements
2+ years of experience in data engineering, data science, analytics engineering, or a similar role in a fast-paced environment.
Strong proficiency in Python and SQL; comfortable with data modeling and complex analytical queries.
Hands-on experience building ETL/ELT pipelines and data systems (e.g., Airflow/Dagster/Prefect; dbt; Spark; Kafka/PubSub optional).
Experience with modern data warehouses/lakes (e.g., BigQuery, Snowflake, Redshift, Databricks) and cloud infrastructure.
Strong understanding of experimentation and measurement: A/B tests, metrics design, and statistical rigor.
Familiarity with LLM-adjacent data workflows (RAG telemetry, embeddings, evaluation sets, labeling/synthetic data) is a plus.
Comfortable operating end-to-end: from ambiguous problem definition to implementation, monitoring, and iteration.
Clear communicator with a collaborative mindset across product, design, and engineering.
Nice to have
Experience with real-time analytics and event-driven architectures.
Knowledge of recommendation/personalization systems and feature engineering at scale.
Experience with