Personal details

Saurabh A. - Remote data scientist

Based in: United States
Timezone: Eastern Time (US & Canada) (UTC-5)

About

A humble AI/ML engineer and scientist with 6+ years of experience designing, developing, and launching client-focused solutions on cloud platforms, including work at Oracle, a Fortune 500 company. Proven track record of high performance on Objectives and Key Results (OKR) metrics and of Artificial Intelligence patents. PhD researcher experienced in executing development roadmaps and securing cross-functional stakeholder consensus while consistently driving revenue growth and time and cost savings.

CAREER HIGHLIGHTS

  • Awarded for high-speed delivery of a scalable AI application on cloud (delivered in 2 years with >75% performance on OKR metrics)
  • Enhanced online deep learning under imperfect time series data for simulated & real-life applications
  • Created intent-focused follow-up questions for the RAG pipeline of an LLM-based conversational product
  • Conducted prompt engineering and fine-tuning to improve the output of a PyTorch-based large language model

Work Experience

Lead Machine Learning Engineer (& Technical Product Manager)
Oracle | Mar 2021 - Mar 2024
Python
SQL
Leadership
Management
Coaching
  • Deployed end-to-end ML inference and data science workflows on serverless cloud-based services with streamlined interaction between database, frontend, and model hosting components.
  • Managed multiple projects, achieving 75% performance on project delivery metrics by streamlining end-to-end development (requirement collection, design, sprints, UAT, code reviews, version control, containerization, CI/CD) & communicating clearly with cross-functional stakeholders (including the User Experience and Database management teams)
  • Led pipeline deployment, output analysis, and creation of follow-up questions for a RAG (retrieval-augmented generation) conversational product, using AI services and a vector DB on cloud
  • Achieved a 30% gain in positive connotation of generated text by fine-tuning a PyTorch-based GPT-2 large language model
  • Engineered prompts that reduced hallucinations in a large language model by 20%+, generating data with desired patterns
  • Deployed a deep reinforcement learning product on SaaS that helped a banking client cut business process costs by 30% through expedited statistical analysis of their system
  • Applied transfer learning to increase the robustness of neural network models to new task data & cut their training time by 10%+.
  • Boosted teams' delivery efficiency by training junior engineers well, setting realistic goals for them, leading hands-on contribution by example, and conducting efficient performance reviews.
Research Assistant
University of Georgia | Aug 2016 - Feb 2021
Python
Leadership
  • Used numerical approximation & convex optimization to achieve a 30% improvement in a deep learning project for a robotics task despite noisy training data, using a real-time recurrent convolutional neural network for perception
  • Delivered >90% accuracy in a web application for disaster response text analysis using data engineering, model training, & model prediction (ETL-ML) with an XGBoost ensemble learning model
  • Deployed a sentiment analysis (intent classification) model (recurrent neural network and LSTM) as an endpoint in Amazon SageMaker, saving artifacts in S3 storage buckets. Used Amazon API Gateway and a Lambda function to let a frontend web application access the SageMaker endpoint.
  • Reduced ML training time for a recommendation system by 20%+ with distributed processing of text data via Spark cluster data frames on Google Cloud

Education

University of Georgia
Doctor's degree · Computer Science
Aug 2016 - Aug 2023