Personal details

Nestor S. - Remote data scientist

PhD student
Based in: 🇬🇧 United Kingdom
Timezone: Edinburgh (UTC+1)

Summary

Generalist with experience in all aspects of data science, from formal academic research to product deployment. I have a strong background in mathematical statistics and the fundamentals of ML, as well as 3+ years of software engineering experience in industry, helping companies create new products and insights from their data. I am also a self-learner who keeps up to date with innovations in the data science and MLOps spaces.

Work Experience

Statistical consultant
National Grid | Mar 2021 - May 2021
R
Statistics
Data Science
Data Visualization
Working with National Grid and other academic consultants, delivered a study on projected mid-term security-of-supply indices for the UK power system under a range of decarbonisation scenarios.
ML consultant
Wella School Systems | Sep 2018 - Nov 2018
Python
MongoDB
Machine Learning
Statistics
Developed an end-to-end machine learning pipeline for student failure prediction, intended as a new data service.

Education

University of Edinburgh
Doctor's degree・Statistics
Dec 2018 - May 2022
University of Edinburgh
Master's degree・High Performance Computing with Data Science
Sep 2017 - Sep 2018

Personal Projects

Fitting Large-Scale Gaussian Mixtures With Accelerated Gradient Descent
2018
Scala
Machine Learning
Mathematics
Apache Spark
A Gaussian Mixture Model (GMM) is a popular clustering model that is usually fitted with the Expectation-Maximization (EM) algorithm. Because EM is a batch algorithm, the model is difficult to scale to very large datasets or data streams. Taking this paper as a starting point, I developed a Scala library that fits GMMs with accelerated stochastic gradient descent, which addresses these problems and achieves fast convergence; it can run sequentially or in parallel using Spark.
Generative deep learning models for text font generation
2021
Python
Docker
Google Cloud Platform
TensorFlow
Apache Beam
MLflow
Python package implementing a GCP-based end-to-end machine learning pipeline for generative deep learning models trained on typeface data. It uses Beam for data preprocessing, TensorFlow for model training, and MLflow for experiment tracking.

Certifications & Awards

Professional Data Engineer
Google Cloud Platform | Oct 2020
Structuring Machine Learning Projects
Coursera | Jun 2017