Personal details

Shubham S. - Remote data engineer

Based in: 🇮🇳 India
Timezone: Pacific Time (US & Canada) (UTC-7)

Summary

Experienced Data Engineer, proficient in PySpark, Apache Spark, SQL, Python, and Azure services. Skilled in designing and optimizing ETL pipelines, data storage solutions, and data processing services. Certified Azure professional with a proven track record of delivering impactful solutions.

Work Experience

Data Engineer
Globant India Private Limited, Pune | Jun 2022 - Present
SQL
Apache Spark

Developed a dynamic Databricks workflow as the sole contributor, enabling trigger-based execution by backend microservices. The workflow automatically generates Delta tables and views that backend processes use to create Excel workbooks containing fund information. The source data comes from transactional tables holding fund audit financial data. Built from the ground up, the solution supports parallel processing and is extensively configurable through JSON-controlled settings. Engineered flexible pipelines that execute intricate calculations driven by microservice parameters. Optimized SQL queries, improving system performance by eliminating redundancies.

Data Engineer
Cognizant Technology Solutions India Ltd, Kolkata | May 2021 - Jun 2022
Apache Spark

Designed and ran production ETL pipelines using PySpark and Hive for data extraction, decryption, reconciliation, and transformation. Developed a Databricks framework for downstream predictive analytics, exposing the data as Hive tables for easy access. Enhanced PySpark notebooks to achieve a 40% reduction in runtime by parallelizing data loads. Delivered ad hoc notebooks for real-time data manipulation needs. Established a historical data ingestion workflow and incremental orchestration for daily loads.

Education

Govt College of Engineering & Leather Technology, Kolkata
Bachelor's degree・Information Technology
Jan 2013 - Jan 2017