Personal details

Abhishek A. - Remote data engineer

Abhishek A.

Based in: 🇩🇪 Germany
Timezone: Berlin (UTC+2)

Summary

I'm an Azure Data Engineer with 7+ years of experience and a proven record of delivering short- and long-term projects in data engineering, data warehousing, machine learning, and business intelligence. My passion is partnering with clients to deliver top-notch, scalable data solutions that provide immediate and lasting value. I hold a B.Tech in engineering from NIT Raipur.

I specialize in the following data solutions:

✔️ Building end-to-end ETL pipelines using Azure cloud tools.

✔️ Migrating workloads from Hadoop clusters to Azure Databricks Spark clusters.

✔️ Building data warehouses using modern cloud platforms and technologies.

✔️ Creating and automating data pipelines and ETL processes.

✔️ Building highly intuitive, interactive dashboards.

✔️ Data cleaning, processing, and machine learning models.

✔️ Data strategy advisory & technology selection/recommendation

Technologies I most frequently work with are:

☁️ Cloud: Azure

☁️ Cloud Tools: Azure Data Factory, Azure Synapse Analytics, Azure Databricks, Azure Data Lake, Azure Analysis Services, Azure DevOps, Azure Key Vault, Azure Active Directory.

💬 Languages: SQL, Python, PySpark, Spark SQL, R, SAS, Dash.

👨‍💻 Databases: SQL Server, Azure Synapse, Azure SQL Database

⚙️ Data Integration/ETL: SAP HANA, Dynamics 365, EPM Onyx, QAD

📊 BI/Visualization: Power BI, Excel

🤖 Machine learning: Jupyter Notebook, Python, Pandas, NumPy, Statistics, Probability.

Work Experience

Azure Data Engineer
ALDI SUD Germany | Feb 2022 - Present
SQL
Apache Spark
Azure SQL
Databricks
Azure Data Factory
Azure Data Engineer
ETL Pipeline for End-to-End Specials Use Case Squad
1) Spearheading the migration from a Hadoop cluster to an Azure Databricks Spark cluster.
2) Ensuring optimal code performance on the Spark cluster and constantly seeking ways to improve it.
3) Industrializing the code base to facilitate seamless scaling.
4) Taking the lead in forming the CI/CD process for both Azure Data Factory and Azure Databricks.
5) Collaborating closely with data scientists to provide them with accurate data for actionable business insights.
RETAIL HUB DASHBOARD
1) Developed a dynamic Spark notebook to pivot the web-crawling data efficiently (see the sketch below).
2) Integrated the Azure Synapse warehouse and Data Lake with ADF and created a parameterised pipeline in Data Factory for an end-to-end ETL process.
3) Designed and developed warehouse objects and stored procedures for streamlined data processing.
4) Developed a Logic App to send daily emails.
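
A minimal PySpark sketch of the pivot step mentioned above: long-format crawl rows (one row per product/attribute pair) are turned into one wide row per product with the standard groupBy/pivot/agg pattern. The column names and sample rows are hypothetical, not taken from the actual code base.

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("pivot-crawl-data").getOrCreate()

# Long-format crawl output: one row per (product, attribute) pair.
# Columns and values are illustrative placeholders.
raw = spark.createDataFrame(
    [("p1", "price", "1.99"), ("p1", "brand", "Acme"), ("p2", "price", "2.49")],
    ["product_id", "attribute", "value"],
)

# Pivot to one wide row per product; first() picks the single value per cell.
wide = raw.groupBy("product_id").pivot("attribute").agg(F.first("value"))
wide.show()
```
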
Azure Data Engineer
Smiths Detection | Aug 2021 - Dec 2022
SQL
Apache Spark
Azure SQL
Databricks
Azure blob storage
Azure Data Factory
EPM ONYX BUSINESS REPORTING
1) Gathered requirements from stakeholders for business reporting.
2) Prepared technical & scope analysis documents for the data models, including facts & dimensions mapping.
3) Fetched data from a multidimensional source system (EPM) via API through Azure Data Factory to build the ETL pipeline.
4) Built full-load and delta-load pipelines in Azure Data Factory.
5) Stored the raw data in Azure Data Lake in a date-time folder structure.
6) Wrote transformation logic in PySpark and Spark SQL in Databricks to convert the raw data into facts and dimensions (see the sketch below).
7) Modelled the data in Erwin Studio and implemented the semantic data model in Azure Analysis Services for various Power BI reports/dashboards.
8) Created a technical incident/challenge document for effective communication across the team.
MLT O2C DASHBOARD
1) Added a new report to the existing order-to-cash dashboard to track Market Lead Time (MLT).
2) Read the data uploaded through a Logic App into Databricks from Azure Blob Storage.
3) Made the required transformations in Databricks using SQL and Python and wrote the transformed data into the curated layer of Azure Data Lake Gen2.
4) Read the data into Azure Synapse, modelled it in ER Studio, and implemented the data models in Azure Analysis Services for Power BI visualization.
DISASTER RECOVERY AND CI/CD PIPELINE
1) Took backups of Azure Databricks, Azure Data Factory, and Azure Analysis Services using the Databricks command line (CLI), ARM templates, SSMS, and PowerShell.
2) Designed a CI/CD pipeline using Azure DevOps to automate the build and release processes across environments for Azure Data Factory (ADF), Azure Synapse (data warehouse) & Azure Analysis Services.
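
As a rough illustration of step 6 of the EPM ONYX project, the sketch below reads a raw extract from a date-partitioned lake folder and derives a deduplicated dimension table in the curated layer. The storage account, folder layout, and column names are assumptions for the example, not details of the real pipeline.

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("raw-to-dim").getOrCreate()

# Raw layer assumed to be laid out as .../raw/<entity>/<yyyy>/<MM>/<dd>/.
raw = spark.read.parquet(
    "abfss://raw@examplelake.dfs.core.windows.net/epm/2022/08/01/")

# Derive a conformed dimension: one row per business key, plus an audit column.
dim_account = (
    raw.select("account_code", "account_name", "hierarchy_level")
       .dropDuplicates(["account_code"])
       .withColumn("load_ts", F.current_timestamp())
)

# Persist as Delta in the curated zone for the downstream semantic model.
(dim_account.write
    .format("delta")
    .mode("overwrite")
    .save("abfss://curated@examplelake.dfs.core.windows.net/dim_account/"))
```
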

Education

NIT Raipur
Bachelor's degree, Computer Science
Apr 2012 - Jun 2016

Personal Projects

Dashboarding using Azure Synapse
2023
Azure
Azure SQL
Databricks
Azure blob storage
Azure Data Factory
Azure Data Engineer
1) Orchestrated the secure migration of data from on-premise servers to Azure Data Lake using Synapse Analytics pipelines.
2) Executed data transformations using a Synapse Spark pool, efficiently writing the data to Azure Data Lake (see the sketch below).
3) Moved the data from Azure Data Lake into an Azure Synapse dedicated pool, organizing it through stored procedures and SQL views for streamlined report generation.
4) Integrated the tables into Power BI, enhancing comprehensive reporting capabilities.
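
A minimal sketch of the Spark-pool transformation in step 2, under assumed paths and schema: staged CSVs are typed and cleaned, then written as Parquet to the curated zone for the dedicated pool to ingest. None of the names below come from the actual project.

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("synapse-spark-transform").getOrCreate()

# Staged extracts landed by the Synapse pipeline (path is a placeholder).
sales = (spark.read
         .option("header", True)
         .csv("abfss://staging@examplelake.dfs.core.windows.net/sales/"))

# Apply typing and basic cleaning before the warehouse load.
curated = (
    sales.withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
         .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
         .filter(F.col("amount").isNotNull())  # drop rows that failed the cast
)

# Parquet in the curated zone; the dedicated SQL pool can load it from there.
curated.write.mode("overwrite").parquet(
    "abfss://curated@examplelake.dfs.core.windows.net/sales/")
```
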
Reporting Using Azure Databricks
2022
Azure
Databricks
Azure Data Factory
1) Moved the data from on-premise servers to Azure Data Lake using Azure Data Factory.
2) Processed CSV, JSON, and XML files using PySpark and Spark SQL in Databricks.
3) Wrote the processed files to Azure SQL using the JDBC connector and to Delta Lake within Databricks (see the sketch below).
4) Managed the transition of data to the archive container in Databricks.
5) Developed a comprehensive dashboard on the Delta table and orchestrated the entire ETL process using Azure Data Factory.
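
Step 3 could look roughly like the sketch below: a landed JSON drop is read in Databricks and appended to a staging table in Azure SQL through Spark's JDBC writer. The server, database, table, and secret-scope names are placeholders; dbutils.secrets is the Databricks notebook utility for Key Vault-backed secret scopes.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("load-azure-sql").getOrCreate()

# Landed JSON files (path is a placeholder).
orders = spark.read.json(
    "abfss://landing@examplelake.dfs.core.windows.net/orders/")

# Append into a staging table in Azure SQL via the JDBC connector.
# Credentials come from a Key Vault-backed Databricks secret scope;
# dbutils is available without import inside Databricks notebooks.
(orders.write
    .format("jdbc")
    .option("url", "jdbc:sqlserver://exampleserver.database.windows.net:1433;"
                   "database=exampledb")
    .option("dbtable", "dbo.orders_stg")
    .option("user", dbutils.secrets.get("kv-scope", "sql-user"))
    .option("password", dbutils.secrets.get("kv-scope", "sql-pass"))
    .mode("append")
    .save())
```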