Personal details

Kuldeep S. - Remote data scientist

Based in: 🇮🇳 India
Timezone: New Delhi (UTC+5:30)

Summary

Research and Development Engineer with demonstrated experience in building scalable machine learning systems, deep learning, natural language processing, and full-stack development.

I breathe, eat and live software development. I've built software solutions ranging from a Tic-Tac-Toe AI using reinforcement learning to web applications serving the top tiers of web traffic. Let me find the solution to your problems too.

I am also a mentor. I love teaching programming, especially programming with Python. I am ranked in the top 5% on Stack Overflow for answering Python questions and in the top 2% overall. I love figuring out how someone sees their world and their problems, and then helping fill in the gaps. To contribute back to the community, I have published open-source packages with over 10k downloads. Mentoring also accelerates my own learning, because there is always something you want to do better when teaching it to others.

I see programming as an art. Code is expression. It needs to have clarity, purpose, elegance and efficiency to communicate well, to execute well. As a result, I produce software of the highest quality, not only functional and tested, but highly readable for future maintainers.

Work Experience

Senior Data Scientist
Lowe's | Oct 2020 - Present
Python

Working on understanding the 2 billion+ search queries made by Lowe's customers on our digital platforms in order to personalize and recommend 5 million+ products, with impact on 100 billion USD in annual revenue.

• Created a multi-class query classification model using TensorFlow, PySpark, TensorFlow Serving (TFX) & FastAPI, served with low latency for use in search ranking.
• Trained a domain-specific BERT to produce accurate & efficient representations of queries and documents in the home-improvement domain, using over 20 million open-source (Semantic Web 2020, Amazon 2018, etc.) & in-house data points. The language model outperforms distilled BERT with 20% fewer parameters, which improves latency.
• Conceptualised and built a behavioural tagging model that maps attributes between products and user queries for the top business-driving queries by mining relationships over millions of data points using PySpark.
• Improved Elasticsearch queries for better information retrieval & ranking using Bayesian optimization.
• Enriched our Elasticsearch index with user behaviour signals, resulting in 16% better recall.
• Created a 1000x faster lexical matcher based on SymSpell to compute the importance of each token in a user query across 3 million+ unigrams and bigrams.

Skills: Recommendation Systems, Natural Language Processing, Information Retrieval, Deep Learning, System Design
Technologies: TensorFlow, PyTorch, Transformers, BERT, Pandas, Elasticsearch, PySpark, ONNX, TensorFlow Serving, FastAPI