Lead Data Scientist - Healthcare

Location

Remote restrictions apply

Salary Estimate

N/A

Seniority

Lead

Tech stacks

Project management

Data

Visa

U.S. visa required

Permanent role

24 days ago

About The Role

HiCounselor is assisting one of our clients in hiring a Lead Data Scientist – Healthcare to spearhead data science initiatives in the healthcare domain. The ideal candidate will be responsible for leading end-to-end projects, applying advanced analytics and AI to healthcare data, and delivering insights that drive strategic decision-making. This role also involves mentoring team members, collaborating with stakeholders, and ensuring data-driven solutions are effectively integrated to support business objectives and improve healthcare outcomes.

Visa Sponsorship: Not available

Key Responsibilities

Lead end-to-end training and fine-tuning of Large Language Models (LLMs) across both open-source (e.g., Qwen, LLaMA, Mistral) and closed-source (e.g., OpenAI, Gemini, Anthropic) ecosystems.
Architect and implement GraphRAG pipelines, including knowledge graph representation and retrieval for enhanced contextual grounding.
Design, train, and optimize semantic and dense vector embeddings for document understanding, search, and retrieval.
Develop semantic retrieval systems with advanced document segmentation and indexing strategies.
Build and scale distributed training environments using NCCL and InfiniBand for multi-GPU and multi-node training.
Apply reinforcement learning techniques (e.g., RLHF, RLAIF) to align model behavior with human preferences and domain-specific goals.
Collaborate with cross-functional teams to translate business needs into AI-driven solutions and deploy them in production environments.

Preferred Qualifications

PhD or Master’s degree in Computer Science, Machine Learning, or related field.
8+ years of experience in applied AI/ML, with a strong track record of delivering production-grade models.
Deep expertise in:
· LLM training and fine-tuning (e.g., GPT, LLaMA, Mistral, Qwen)
· Graph-based retrieval systems (GraphRAG, knowledge graphs)
· Embedding models (e.g., BGE, E5, SimCSE)
· Semantic search and vector databases (e.g., FAISS, Weaviate, Milvus)
· Document segmentation and preprocessing (OCR, layout parsing)
· Distributed training frameworks (NCCL, Horovod, DeepSpeed)
· High-performance networking (InfiniBand, RDMA)
· Model fusion and ensemble techniques (stacking, boosting, gating)
· Optimization algorithms (Bayesian, Particle Swarm, Genetic Algorithms)
· Symbolic AI and rule-based systems
· Meta-learning and Mixture of Experts architectures
· Reinforcement learning (e.g., RLHF, PPO, DPO)

Bonus Skills

· Experience with healthcare data and medical coding systems (e.g., CPT, ICD-10-CM, ICD-10-PCS).

· Familiarity with regulatory and compliance frameworks in AI deployment.

· Contributions to open-source AI projects or published research, and/or the ability to take research papers from proof of concept (PoC) to production.

Additional Preferred Qualifications

· M.S. or PhD in a computational domain

· Publication history in deep learning or a statistical domain

· Experience with SQL databases and query languages

· Experience in AzureML, AWS, or cluster computing architectures

· Experience with hybrid NLP solutions that combine symbolic and machine learning approaches

· Experience with XML and XSLT

· Healthcare domain background

· IT full-stack engineering experience

Pay: A reasonable estimate of the current range is $150,000 - $165,000 per year. In addition, you may be eligible for a discretionary bonus for the current performance period.

Benefits: