Personal details

Mateus P. - Remote

Timezone: Atlantic Time (Canada) (UTC-3)

Summary

Hello everyone!

I'm Mateus, a full-stack Data Scientist from Brazil with a background in Digital Signal Processing. I graduated from Brown University in 2018 with a Bachelor of Science in Electrical Engineering and have been working in the Data domain ever since. I have experience with the full scope of the Data Science process, from building data pipelines and developing models to evaluating A/B tests for Data Products.

I'm open to any mentorship opportunities related to data, especially when it comes to Machine Learning, Natural Language Processing, Python, and Elasticsearch.

I am currently a Data Scientist at Microsoft and have previously worked at the largest investment bank in Latin America and at Telefonica.

Throughout my career, I developed skills and projects in various segments, including Customer Segmentation, Next Best Offer models, and detection of rare events.

I have over 4 years of experience in programming with Python, Machine Learning (especially when applied to CRM and Product Analytics), and Natural Language Processing. I also have solid experience implementing data orchestration and Elasticsearch-based analytics.

Finally, I have been a mentor for both for-profit and non-profit organizations since I was 16 years old. I'm extensively trained in mentorship and tutoring and have mentored all kinds of people in many topics, from essay writing to machine learning.

Work Experience

Data Scientist
Microsoft | Sep 2021 - Present
Python
Azure
Machine Learning
Product Design
Data Analytics
Data Scientist working on Analytics and experimentation for the Microsoft Stream product. My work focuses on Scorecard definition and evaluation for new flights and product iteration, as well as in-depth analyses of user behavior to drive product changes.
Data Scientist
BTG Pactual | Sep 2020 - Sep 2021
Python
Amazon S3
Machine Learning
DynamoDB
Apache Spark
Amazon Redshift
AWS Lambda
Apache Airflow
Apache NiFi
- Developed segmentation and scoring aimed at identifying the clients with the highest propensity for acquiring credit products in the BTG+ user base.
- Developed an unsupervised model for customer segmentation based on credit card spending preferences in different merchant categories (MCCs). Results drive targeted campaigns.
- Designed and built numerous ETL pipelines using Apache NiFi, Airflow, and Spark on AWS. Pipelines populate the BTG+ Data Lake for various purposes, including reporting, dashboards, and modeling.
- Designed and implemented the architecture and pipelines for a data quality assessment framework based on Apache Spark, Airflow, and AWS Glue.
- Designed, trained, and deployed a BERT-based Sentiment Analysis model for classifying news related to the Stock Market in Portuguese. The model is part of the BTG Index.

Personal Projects

2020
Python
Heroku
Pandas
Elasticsearch
Streamlit
This project is a proof of concept for exploration of enterprise data by non-technical users through full-text search engine capabilities. It aims to illustrate a way to foster data-driven decision-making without the intervention of technical teams, a concept known as Data Self-Service. It was inspired by Looqbox, a Brazilian startup. Data Pages provides guided access to analyses previously made available by technical teams as specifications in a data directory. The search engine capabilities provide a cleaner interface for finding relevant data about the company, as an alternative to list-based directories and generic dashboards.
Project Atlas - São Paulo
2021
Geospatial Technology
Apache Spark
PostGIS
Apache Sedona
A feature store project aimed at developing geospatially referenced features regarding the city of São Paulo, including features related to crime, real estate, income, shopping activity, and much more. The project has been released on Kaggle and contains over 200 features at different levels of interest for use.