Personal details

Yuriy M. - Remote

Timezone: Pacific Time (US & Canada) (UTC-7)

Summary

Data specialist with over 15 years of experience in data warehousing, data engineering, big data, and business intelligence. Worked on 5 large data warehouses for leading internet, media, and entertainment companies, as well as multiple Big Data systems. Also served as a hands-on big data engineer and architect, ETL developer, and database administrator, providing operational support and SLA compliance.

Work Experience

Principal Consultant, Co-founder
Crowd Consulting LLC | Apr 2016 - Present
Python
SQL
Amazon RDS
Apache Spark
Apache Hadoop
AWS EMR
Snowflake
Hortonworks Data Platform
Apache Hive
Amazon Redshift
Multiple projects in data warehousing and big data engineering. Business development, team augmentation, mentoring, pre- and post-sales solution architecture, engineering, and support.
Big Data Engineer
Boston Consulting Group, GAMMA (via Toptal) | Jun 2018 - Jan 2019
Python
Pandas
Apache Spark
Airflow
HDP
Athena
Apache Hive
Glue
2 contracts (via Toptal), both for major pharmaceutical companies. Subcontracted by BCG GAMMA’s Advanced Analytics and Data Science division to provide engineering support for BCG’s data scientists on DMP and personalization projects. Mostly Feature Engineering and ETL, plus DevOps tasks: Python utilities and scripting, Airflow installation and administration, and Spark-to-Excel Python scripts. Designed and built a dynamic S3-to-S3, RDS-driven (metadata in Postgres) ETL system in Spark/Hive, with AWS Glue as the Hive metastore, Athena for querying, and Airflow for scheduling. Built the ETL system on modern Data Warehousing best practices, documented it, and provided training. Designed and built a Feature Engineering S3 Data Mart and a multi-layered S3 Customer-360 Data Lake. Proposed and enforced development standards; provided documentation, data validation procedures, operational support, and maintenance guidelines.

Personal Projects

Big Data Platform
2016
Python
PostgreSQL
Lambda
RDS
ETL
Amazon Redshift
Luigi
EMR
Presto
12-month contract (via Crowd Consulting). Designed and deployed a full Big Data analytical ecosystem, including a Data Lake, Data Warehouse, Reporting Data Mart, Analytical Data Mart, BI Reporting System, and comprehensive ETL system. Coded one of the two subject areas completely and set up the development methodology and standards. Trained personnel on Big Data. Handled data validation, cleansing, and governance, as well as technology selection and augmentation of the client’s team.