Personal details

Marcus R. - Remote data scientist

Marcus R.

Senior Data Scientist
Based in: 🇨🇦 Canada
Timezone: Eastern Time (US & Canada) (UTC-4)

Summary

Hi! I am an experienced Data Scientist and Python programmer who works daily with pandas, NumPy, SciPy, Matplotlib, PyDeck, SQLAlchemy, Snowflake, and Selenium, among other libraries. I also have experience with AWS, both through the console and programmatically with boto3 and awswrangler. I have a Ph.D. in Mathematical Optimization and love solving challenging problems. Let me know how I can help you!

Work Experience

Data Scientist
FreightFlows | Oct 2021 - Present
Python
SQL
PostgreSQL
Selenium
NumPy
Pandas
Machine Learning
Statistics
SciPy
Docker
Data Science
Startups
Apache Spark
Snowflake
AWS (Amazon Web Services)

Port Polygons Project: Built port polygons by applying clustering algorithms to binned AIS data. Created port anchorage and berth polygons, allowing a complete understanding of a port call. Main tools: Python, pandas, SciPy, NumPy, scikit-learn, DBSCAN, GeoPandas, h3, NetworkX, JupyterLab, awswrangler, Snowflake, PostgreSQL.

Visualizations: Created interactive visualizations of port polygons, port activity, and vessel activity. Visualizations were often used to support business decisions and sales requests. Main tools: pandas, PyDeck, Plotly, Matplotlib, JupyterLab.

Job Speed-Ups and AWS ECS Cost Reduction: Sped up analytics jobs by identifying their bottlenecks. Used multiprocessing to parallelize the parts of the code that could not be vectorized. Significantly reduced AWS ECS costs by cutting running time.
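
A minimal sketch of the multiprocessing pattern mentioned above, not the production code. The function slow_step is a hypothetical stand-in for any per-item computation with Python-level branching that resists vectorization; the pool size and chunk size are illustrative assumptions.

```python
from multiprocessing import Pool


def slow_step(n):
    """Hypothetical per-item work: count Collatz steps from n down to 1.

    The data-dependent loop is the kind of logic that is hard to express
    as a vectorized array operation.
    """
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def run_parallel(items, processes=4):
    """Apply slow_step to every item using a pool of worker processes."""
    with Pool(processes=processes) as pool:
        # chunksize batches items to reduce inter-process overhead.
        return pool.map(slow_step, items, chunksize=64)


if __name__ == "__main__":
    # The guard is required so worker processes can import this module safely.
    print(run_parallel(range(1, 11)))
```

Parallelism of this kind only pays off when each item is expensive enough to outweigh the cost of sending data between processes.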

Scraper Project: Developed a web scraper to fetch metadata for vessels, ports, berths, anchorages, and berth calls. Main tools: Selenium, BeautifulSoup4, lxml, asyncio, aiohttp, awswrangler, Snowflake, Airflow, Docker.

Ballast and Laden Cutoffs Project: Applied unsupervised machine learning algorithms to calculate each vessel’s ballast and laden cutoffs. Applied KDE and Gaussian Mixture models to reported draft measurements to estimate their probability distributions when the vessel is in ballast vs. holding cargo. Applied the Kolmogorov-Smirnov test to select the “best” pairs of distributions. Main tools: pandas, scikit-learn, GaussianMixture, KernelDensity, SciPy.
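
A simplified sketch of the Gaussian Mixture part of this idea, not the production code. It assumes a two-component mixture and defines the cutoff as the draft at which the laden component becomes the more likely one; the function name ballast_laden_cutoff and the synthetic draft values are illustrative assumptions, and the KDE and Kolmogorov-Smirnov steps are left out.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def ballast_laden_cutoff(drafts, random_state=0):
    """Fit a 2-component Gaussian mixture to draft readings (meters) and
    return the draft where the laden component becomes more likely."""
    x = np.asarray(drafts, dtype=float).reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=random_state).fit(x)

    means = gmm.means_.ravel()
    laden = int(np.argmax(means))  # component with the deeper mean draft
    lo, hi = np.sort(means)

    # Scan between the two component means for the first point where the
    # posterior probability of the laden component reaches 0.5.
    grid = np.linspace(lo, hi, 1001).reshape(-1, 1)
    p_laden = gmm.predict_proba(grid)[:, laden]
    return float(grid[np.argmax(p_laden >= 0.5), 0])


if __name__ == "__main__":
    # Synthetic drafts for illustration only: shallow (ballast) and deep (laden).
    rng = np.random.default_rng(0)
    drafts = np.concatenate([rng.normal(6.0, 0.4, 300),
                             rng.normal(11.0, 0.5, 300)])
    print(ballast_laden_cutoff(drafts))
```

With well-separated ballast and laden drafts the cutoff falls between the two component means; overlapping distributions would need the additional checks described above.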

Lead Teacher
Le Wagon | Oct 2020 - Present
SQL
NumPy
Matplotlib
Pandas
SciPy
Web Scraping
Python 3
TensorFlow
I am one of the lead teachers for Le Wagon’s Data Science Bootcamp in Rio de Janeiro.

Education

Instituto Nacional de Matemática Pura e Aplicada
Doctor's degree・Operations Research
Mar 2013 - Aug 2017

Certifications & Awards

Big Data with PySpark
DataCamp | Aug 2023
Deep Learning Specialization
Coursera | Jan 2021