We’re looking for a data scientist who is passionate about Natural Language Processing (NLP), Generative AI, and traditional machine learning—and who knows how to ship high-impact, production-grade models.
This is a hands-on role where you’ll work across the full ML lifecycle: from prototyping to deployment, with a strong emphasis on production-readiness, APIs, and scalable architecture.
You’ll collaborate with AI engineers, product managers, and domain experts to develop intelligent systems that power next-generation insights for the pharma industry.
Responsibilities:
Design and develop NLP and generative AI solutions using LLM frameworks like LangChain, LlamaIndex, CrewAI, or direct model provider SDKs/APIs (e.g., OpenAI, Anthropic, Hugging Face).
Build and fine-tune traditional ML models (e.g., classification, regression, clustering) to support data-driven applications.
Create robust and scalable AI pipelines and APIs using Python and FastAPI.
Deploy models to production using AWS services such as ECS, Lambda, and S3, with attention to CI/CD, observability, and cost-effectiveness.
Apply strong system design principles to architect scalable, maintainable, and secure ML systems.
Use critical thinking to analyze complex problems, identify edge cases, and propose pragmatic, data-driven solutions.
Think creatively to explore new ML techniques, tools, and approaches that extend what the team can build.
Work closely with cross-functional teams to turn ambiguous business problems into well-scoped, technically sound AI solutions.
Contribute to a culture of technical excellence and innovation in a fast-moving AI/ML team.
Required Experience:
Experience in data science or machine learning.
Experience in NLP, LLMs, and generative AI—comfortable with both the theory and tooling.
Experience with modern LLM stacks such as LangChain, LlamaIndex, CrewAI, or similar.
Skilled in traditional ML methods using libraries such as scikit-learn and XGBoost.