As a seasoned Data Scientist and Machine Learning Engineer with extensive experience architecting and leading the development of end-to-end AI and analytics solutions, I specialize in translating complex business problems into practical, scalable products. Throughout my career, I have successfully led diverse teams, driving innovation and fostering collaboration to deliver impactful results across various industries, including automotive, retail, finance, and media.
My expertise spans predictive modeling, advanced analytics, and real-time data solutions on platforms such as AWS, Azure, Databricks, and IBM watsonx. I deploy machine learning models built with libraries and frameworks like TensorFlow, PyTorch, Prophet, XGBoost, and PySpark to deliver accurate forecasts, effective anomaly detection systems, and actionable insights.
I have a strong track record in designing comprehensive data architectures and managing data engineering workflows, from initial data ingestion through ETL processes, to visualization and deployment in production environments. My proficiency in cloud infrastructure, big data technologies, and agile methodologies ensures streamlined project execution and efficient operational management.
Passionate about innovation and committed to ethical AI, I continuously strive to build data-driven solutions that not only meet business objectives but also align with best practices and responsible use of technology.
➔ Developed object detection models using Faster R-CNN and YOLOv3 algorithms for clients including Toyota, Yaskawa, and Guelph to identify issues in robotic welding processes.
➔ Developed and deployed predictive models, including regression models, for accurate ETA estimation using PySpark, Pandas, and PyArrow on Databricks and AWS.
➔ Managed version control via GitHub and employed Amazon SageMaker for model deployment and execution, optimizing performance for drivers and stakeholders.
➔ Utilized Apache Airflow for job automation, ensuring process efficiency and seamless model updates, while continually refining models, including random forest and decision tree algorithms, for reliable and accurate ETA predictions.
➔ Built sales forecasts for specific events at Sam's Club locations in Python and PySpark, using Facebook Prophet, N-BEATS, TensorFlow, XGBoost, and random forest models. The models considered exogenous variables such as vendor capacity and the distance between CEDIS (distribution centers) and clubs, using data on sales, inventory, events, promotions, vendors, wholesale sales, e-commerce, and club geolocation stored in Azure Data Lake Storage. Developed the code in Azure Databricks, with forecast results displayed on a Power BI dashboard.