We are looking for a Senior Data Scientist with deep expertise in predictive modelling, stochastic simulation, and advanced analytics to join our Healthcare & Pharma data science team. In this role you will design and deploy Monte Carlo simulations, Bayesian models, and machine-learning pipelines that directly influence clinical trial strategy, patient-outcome forecasting, and portfolio-level decision-making. You will collaborate closely with biostatisticians, clinical operations, and medical affairs to translate complex data into actionable insights.
Key Responsibilities
• Design, implement, and validate Monte Carlo simulation models for clinical trial outcome prediction, enrolment forecasting, and risk quantification.
• Develop predictive and prescriptive analytics solutions (survival analysis, time-series forecasting, causal inference) using real-world data (RWD) such as electronic health records and claims data.
• Build Bayesian hierarchical models for adaptive trial design, interim analysis support, and probability-of-success estimation.
• Create reproducible ML pipelines (feature engineering, model training, hyperparameter tuning, deployment) on cloud platforms (AWS, GCP, or Azure).
• Partner with biostatistics and clinical teams to translate statistical findings into protocol amendments, site-selection strategies, and regulatory submissions.
• Develop interactive dashboards and data products (Streamlit, Shiny, or Tableau) that communicate model outputs to non-technical stakeholders.
• Conduct sensitivity analyses and scenario planning to quantify uncertainty in drug-development timelines and portfolio investment decisions.
• Mentor junior data scientists and contribute to internal best practices, code reviews, and knowledge-sharing sessions.
Required Qualifications
• Master’s or Ph.D. in Statistics, Biostatistics, Data Science, Applied Mathematics, Computational Biology, or a related quantitative discipline.
• 5–8 years of hands-on experience building predictive models in a healthcare, pharma, or biotech setting.
• Strong proficiency in Python (NumPy, SciPy, pandas, scikit-learn, PyMC / Stan) and/or R.
• Demonstrated expertise in Monte Carlo methods (MCMC, importance sampling, bootstrapping) and stochastic simulation.
• Experience with survival analysis (Cox PH, Kaplan-Meier, competing risks) and longitudinal/mixed-effects models.
• Working knowledge of clinical trial design (Phase I–IV), ICH-GCP guidelines, and CDISC regulatory data standards (SDTM, ADaM).
• Proficiency with SQL and cloud-based data infrastructure (Snowflake, Redshift, BigQuery, Databricks).
• Excellent communication skills with the ability to present complex quantitative results to clinical, regulatory, and executive audiences.