Senior Data Scientist

Location

Remote restrictions apply

Seniority

Senior

Tech stacks

Data

Azure

Permanent role

22 days ago

JOB DESCRIPTION

Senior Data Scientist

Speech, Voice & Conversational AI

Department:

Data Science & AI

Experience:

12–15 Years

Role Overview

We are seeking a highly experienced Senior Data Scientist – Speech, Voice & Conversational AI to lead the architecture, design, and delivery of next-generation voice and speech AI solutions. This role sits at the intersection of deep machine learning expertise and practical product engineering, driving end-to-end voice AI capabilities across Firstsource’s service lines.

The ideal candidate brings 12–15 years of progressive experience in data science with a strong specialization in speech and voice technologies, along with hands-on expertise in Generative AI, Agentic AI frameworks, and modern voice pipeline tooling. You will act as a technical thought leader, shaping our voice AI strategy, mentoring teams, and collaborating with cross-functional stakeholders to deliver production-grade solutions at scale.

Key Responsibilities

Voice & Speech AI Architecture

Design and own the end-to-end architecture for voice AI solutions, including real-time speech-to-text (STT), text-to-speech (TTS), voice-to-voice, speaker diarization, emotion detection, and voice biometrics.
Evaluate, benchmark, and integrate leading speech platforms and APIs such as Google Cloud Speech, Amazon Transcribe, Azure Speech Services, Whisper (OpenAI), Deepgram, AssemblyAI, ElevenLabs, and PlayHT.
Build robust voice pipelines that handle noise cancellation, language identification, accent adaptation, and real-time streaming at production scale.

Generative AI & Agentic AI

Architect and deploy GenAI-powered conversational agents leveraging Large Language Models (LLMs) such as GPT-4, Claude, Gemini, and open-source alternatives (LLaMA, Mistral).
Design Agentic AI workflows using frameworks such as LangChain, LangGraph, CrewAI, AutoGen, and Semantic Kernel to build multi-step, tool-using voice agents.
Implement Retrieval-Augmented Generation (RAG) pipelines with vector databases (Pinecone, Weaviate, Qdrant, Chroma) for context-aware voice assistants.
Drive prompt engineering strategies and fine-tuning approaches (LoRA, QLoRA, RLHF) to optimize LLM performance for speech-centric use cases.

Solution Design & Delivery

Lead solution design workshops with clients and internal stakeholders to translate business requirements into scalable voice AI architectures.
Define technical roadmaps, establish best practices, and create reusable solution accelerators for voice and conversational AI.
Own proof-of-concept (POC) development through to production deployment, working closely with MLOps and engineering teams.

Leadership & Mentoring

Mentor and upskill a team of data scientists and ML engineers on speech AI and GenAI best practices.
Represent Firstsource as a subject-matter expert in voice AI at internal reviews, client presentations, and industry forums.
Stay current on rapidly evolving GenAI, speech, and agentic AI research and translate insights into actionable opportunities.

Technical Skills & Tooling

Domain: Required Proficiency

Speech-to-Text (STT): Whisper, Google Cloud Speech, Azure Speech, Amazon Transcribe, Deepgram, AssemblyAI, Kaldi
Text-to-Speech (TTS): ElevenLabs, PlayHT, Azure Neural TTS, Amazon Polly, Google WaveNet, Tortoise TTS, Bark
Voice-to-Voice: Real-time duplex pipelines, WebRTC integration, voice cloning, prosody transfer, streaming architectures
LLM & GenAI: GPT-4/4o, Claude, Gemini, LLaMA, Mistral, fine-tuning (LoRA/QLoRA), RLHF, prompt engineering
Agentic AI Frameworks: LangChain, LangGraph, CrewAI, AutoGen, Semantic Kernel, function calling, tool-use patterns
RAG & Vector DBs: Pinecone, Weaviate, Qdrant, Chroma, FAISS, embedding models, hybrid search
ML / Deep Learning: PyTorch, TensorFlow, Transformers (HuggingFace), audio feature engineering (MFCCs, spectrograms)
Cloud & MLOps: AWS / Azure / GCP, Docker, Kubernetes, MLflow, model serving (Triton, TorchServe, vLLM)
Programming: Python (advanced), SQL, familiarity with Rust/C++ for performance-critical audio processing
Telephony & Contact Center: Twilio, Genesys, Amazon Connect, SIP/VoIP protocols, CCAI (Google Contact Center AI)

Qualifications & Experience

12–15 years of progressive experience in Data Science, Machine Learning, or AI Engineering, with at least 5 years focused on speech, voice, or audio ML.
Master’s or Ph.D. in Computer Science, Electrical Engineering, Computational Linguistics, or a related quantitative discipline.
Demonstrated track record of architecting and deploying production-grade speech/voice AI systems at scale.
Deep hands-on expertise with at least two major cloud speech platforms (Google, Azure, AWS) and open-source speech models.
Strong understanding of Generative AI fundamentals, including transformer architectures, attention mechanisms, tokenization, and inference optimization.
Proven experience building Agentic AI solutions with multi-step reasoning, tool use, and autonomous decision-making.
Published research, patents, or conference presentations in speech/NLP are a strong plus.

Preferred Qualifications

Experience in BPO, contact center, or customer experience transformation using voice AI.
Familiarity with speech analytics, call quality monitoring, and agent assist technologies.
Hands-on experience with voice cloning, voice conversion, and neural codec models (e.g., SoundStream, EnCodec).
Contributions to open-source speech or GenAI projects.
Experience with real-time, low-latency voice-to-voice systems for production telephony.

Why Firstsource?

At Firstsource, we make it happen. You will join a team that is deeply committed to using AI and intelligent automation to transform customer experience at global scale. This role offers the opportunity to shape the future of voice AI in one of the world’s leading business process companies, with access to real-world data, enterprise clients, and a culture that rewards bold thinking and rapid experimentation.

Firstsource is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.