Lead Data Scientist - Causal Inference

Location

Remote restrictions apply

Seniority

Lead

Visa

U.S. visa required

Permanent role

5 days ago

$180,000-$210,000

New York Metro Area - Hybrid

Our client is on a mission to revolutionize healthcare quality by leveraging advanced data science and analytics. They partner with healthcare providers, payers, and employers to improve patient outcomes through cutting-edge technology and evidence-based insights. With strategic partnerships across major organizations and significant recent investment backing, they are poised to make a transformative impact on healthcare delivery nationwide. Their work focuses on ensuring patients receive the highest standard of care, starting with diagnostic accuracy and expanding into broader healthcare analytics.

THE ROLE

We are seeking a highly skilled Lead Data Scientist with extensive hands-on experience in causal inference, specializing in evaluating the impact of healthcare programs through rigorous claims data analysis. The ideal candidate will have strong programming expertise (R/Spark/SQL/Python) and a deep understanding of medical claims data. In this role, you will develop scalable data pipelines, optimize codebases, and conduct advanced statistical modeling to support data-driven decision-making.

RESPONSIBILITIES

Work with large-scale healthcare datasets, including longitudinal claims data, to assess the relationships between healthcare quality and patient outcomes
Design, maintain, and improve data science pipelines that generate modeling datasets, run statistical models, and produce business reports
Enhance the efficiency, reproducibility, and scalability of data ETL and statistical code while addressing technical challenges as they arise
Conduct targeted analyses to uncover business and clinical insights that inform strategy and decision-making
Apply statistical methodologies, including Generalized Linear Models (GLMs), causal inference techniques, and difference-in-differences analysis, to evaluate program effectiveness and ROI
Prepare documentation, presentations, and insights for both internal stakeholders and external clients

YOUR BACKGROUND

PhD in Computer Science, Statistics, Biostatistics, Economics, Data Science, Applied Mathematics, or a related field
Proficiency in R, Spark, SQL, and Python for data science applications, with a strong emphasis on R
Experience with scalable code development and collaborative coding environments
Ability to troubleshoot complex data pipelines and statistical code
Experience working with medical and claims data, including familiarity with ICD codes and EHR/EMR data
Knowledge of Generalized Linear Models (GLMs), mixed models, and longitudinal data analysis
Strong collaborative mindset and ability to work in fast-paced, team-oriented environments
Exposure to causal inference techniques (e.g., propensity score matching, difference-in-differences) is a plus
Experience in payer organizations, healthcare consulting, or client-facing analytics roles is a plus
Experience applying machine learning models, including classification, regression, clustering, and anomaly detection, to healthcare datasets

HOW TO APPLY

If you believe you are a good fit given the above qualifications, send your resume to Grace via the link below.

KEYWORDS

Data Science, Healthcare Analytics, Medical Claims Data, Statistical Modeling, Causal Inference, R, Spark, SQL, Python, Generalized Linear Models, Machine Learning, Healthcare Quality Improvement, Data Pipelines, ETL, Big Data, Data Engineering, AI in Healthcare