Personal details

Abhishek A. - Remote data engineer

Abhishek A.

Based in: 🇩🇪 Germany
Timezone: Berlin (UTC+2)

Summary

I'm an Azure Data Engineer with 7+ years of experience and a proven record of delivering short- and long-term projects in data engineering, data warehousing, machine learning, and business intelligence. My passion is partnering with clients to deliver top-notch, scalable data solutions that provide immediate and lasting value. I hold a B.Tech in engineering from NIT Raipur.

I specialize in the following data solutions:

✔️ Building end-to-end ETL pipelines using Azure cloud tools.

✔️ Migrating workloads from Hadoop clusters to Azure Databricks Spark clusters.

✔️ Building data warehouses using modern cloud platforms and technologies.

✔️ Creating and automating data pipelines and ETL processes.

✔️ Building highly intuitive, interactive dashboards.

✔️ Data cleaning, processing, and machine learning models.

✔️ Data strategy advisory & technology selection/recommendation

Technologies I most frequently work with are:

☁️ Cloud: Azure

☁️ Cloud Tools: Azure Data Factory, Azure Synapse Analytics, Azure Databricks, Azure Data Lake, Azure Analysis Services, Azure DevOps, Azure Key Vault, Azure Active Directory.

💬 Languages: SQL, Python, PySpark, Spark SQL, R, SAS, Dash.

👨‍💻 Databases: SQL Server, Azure Synapse, Azure SQL Database

⚙️ Data Integration/ETL: SAP HANA, Dynamics 365, EPM Onyx, QAD

📊 BI/Visualization: Power BI, Excel

🤖 Machine learning: Jupyter Notebook, Python, Pandas, NumPy, Statistics, Probability.

Work Experience

Azure Data Engineer
ALDI SUD Germany | Feb 2022 - Present
SQL
Apache Spark
Azure SQL
Databricks
Azure Data Factory
Azure Data Engineer
ETL Pipeline for End-to-End Specials Use Case Squad
1) Spearheading the migration from a Hadoop cluster to an Azure Databricks Spark cluster.
2) Ensuring optimal code performance on the Spark cluster and constantly seeking ways to improve it.
3) Industrializing the code base to facilitate seamless scaling.
4) Taking the lead in forming the CI/CD process for both Azure Data Factory and Azure Databricks.
5) Collaborating closely with data scientists to provide them with accurate data for actionable business insights.
RETAIL HUB DASHBOARD
1) Developed a dynamic Spark notebook to pivot the web-crawling data efficiently (see the sketch below).
2) Integrated the Azure Synapse warehouse and Data Lake with ADF and created a parameterised pipeline in Data Factory for an end-to-end ETL process.
3) Designed and developed warehouse objects and stored procedures for streamlined data processing.
4) Developed a Logic App to send daily emails.
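
A minimal PySpark sketch of the pivot step mentioned above: long-format crawl rows (one row per product/attribute pair) are turned into one wide row per product with the standard groupBy/pivot/agg pattern. The column names and sample rows are hypothetical, not taken from the actual code base.

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("pivot-crawl-data").getOrCreate()

# Long-format crawl output: one row per (product, attribute) pair.
# Columns and values are illustrative placeholders.
raw = spark.createDataFrame(
    [("p1", "price", "1.99"), ("p1", "brand", "Acme"), ("p2", "price", "2.49")],
    ["product_id", "attribute", "value"],
)

# Pivot to one wide row per product; first() picks the single value per cell.
wide = raw.groupBy("product_id").pivot("attribute").agg(F.first("value"))
wide.show()
```
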
Azure Data Engineer
Smiths Detection | Aug 2021 - Dec 2022
SQL
Apache Spark
Azure SQL
Databricks
Azure blob storage
Azure Data Factory
EPM ONYX BUSINESS REPORTING
1) Gathered requirements from stakeholders for business reporting.
2) Prepared technical & scope analysis documents for the data models, including facts & dimensions mapping.
3) Fetched data from a multidimensional source system (EPM) via API through Azure Data Factory to build the ETL pipeline.
4) Built full-load and delta-load pipelines in Azure Data Factory.
5) Stored the raw data in Azure Data Lake in a date-time folder structure.
6) Wrote transformation logic in PySpark and Spark SQL in Databricks to convert the raw data into facts and dimensions (see the sketch below).
7) Modelled the data in Erwin Studio and implemented the semantic data model in Azure Analysis Services for various Power BI reports/dashboards.
8) Created a technical incident/challenge document for effective communication across the team.
MLT O2C DASHBOARD
1) Added a new report to the existing order-to-cash dashboard to track Market Lead Time (MLT).
2) Read the data uploaded through a Logic App into Databricks from Azure Blob Storage.
3) Made the required transformations in Databricks using SQL and Python and wrote the transformed data into the curated layer of Azure Data Lake Gen2.
4) Read the data into Azure Synapse, modelled it in ER Studio, and implemented the data models in Azure Analysis Services for Power BI visualization.
DISASTER RECOVERY AND CI/CD PIPELINE
1) Took backups of Azure Databricks, Azure Data Factory, and Azure Analysis Services using the Databricks command line (CLI), ARM templates, SSMS, and PowerShell.
2) Designed a CI/CD pipeline using Azure DevOps to automate the build and release processes across environments for Azure Data Factory (ADF), Azure Synapse (data warehouse) & Azure Analysis Services.
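
As a rough illustration of step 6 of the EPM ONYX project, the sketch below reads a raw extract from a date-partitioned lake folder and derives a deduplicated dimension table in the curated layer. The storage account, folder layout, and column names are assumptions for the example, not details of the real pipeline.

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("raw-to-dim").getOrCreate()

# Raw layer assumed to be laid out as .../raw/<entity>/<yyyy>/<MM>/<dd>/.
raw = spark.read.parquet(
    "abfss://raw@examplelake.dfs.core.windows.net/epm/2022/08/01/")

# Derive a conformed dimension: one row per business key, plus an audit column.
dim_account = (
    raw.select("account_code", "account_name", "hierarchy_level")
       .dropDuplicates(["account_code"])
       .withColumn("load_ts", F.current_timestamp())
)

# Persist as Delta in the curated zone for the downstream semantic model.
(dim_account.write
    .format("delta")
    .mode("overwrite")
    .save("abfss://curated@examplelake.dfs.core.windows.net/dim_account/"))
```
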

Education

NIT Raipur
Bachelor's degree, Computer Science
Apr 2012 - Jun 2016

Personal Projects

Dashboarding using Azure Synapse
2023
Azure
Azure SQL
Databricks
Azure blob storage
Azure Data Factory
Azure Data Engineer
1) Orchestrated the secure migration of data from on-premise servers to Azure Data Lake using Synapse Analytics pipelines.
2) Executed data transformations using a Synapse Spark pool, efficiently writing the data to Azure Data Lake (see the sketch below).
3) Moved the data from Azure Data Lake into an Azure Synapse dedicated pool, organizing it through stored procedures and SQL views for streamlined report generation.
4) Integrated the tables into Power BI, enhancing comprehensive reporting capabilities.
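
A minimal sketch of the Spark-pool transformation in step 2, under assumed paths and schema: staged CSVs are typed and cleaned, then written as Parquet to the curated zone for the dedicated pool to ingest. None of the names below come from the actual project.

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("synapse-spark-transform").getOrCreate()

# Staged extracts landed by the Synapse pipeline (path is a placeholder).
sales = (spark.read
         .option("header", True)
         .csv("abfss://staging@examplelake.dfs.core.windows.net/sales/"))

# Apply typing and basic cleaning before the warehouse load.
curated = (
    sales.withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
         .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
         .filter(F.col("amount").isNotNull())  # drop rows that failed the cast
)

# Parquet in the curated zone; the dedicated SQL pool can load it from there.
curated.write.mode("overwrite").parquet(
    "abfss://curated@examplelake.dfs.core.windows.net/sales/")
```
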
Reporting Using Azure Databricks
2022
Azure
Databricks
Azure Data Factory
1) Moved the data from on-premise servers to Azure Data Lake using Azure Data Factory.
2) Processed CSV, JSON, and XML files using PySpark and Spark SQL in Databricks.
3) Wrote the processed files to Azure SQL using the JDBC connector and to Delta Lake within Databricks (see the sketch below).
4) Managed the transition of data to the archive container in Databricks.
5) Developed a comprehensive dashboard on the Delta table and orchestrated the entire ETL process using Azure Data Factory.
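
Step 3 could look roughly like the sketch below: a landed JSON drop is read in Databricks and appended to a staging table in Azure SQL through Spark's JDBC writer. The server, database, table, and secret-scope names are placeholders; dbutils.secrets is the Databricks notebook utility for Key Vault-backed secret scopes.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("load-azure-sql").getOrCreate()

# Landed JSON files (path is a placeholder).
orders = spark.read.json(
    "abfss://landing@examplelake.dfs.core.windows.net/orders/")

# Append into a staging table in Azure SQL via the JDBC connector.
# Credentials come from a Key Vault-backed Databricks secret scope;
# dbutils is available without import inside Databricks notebooks.
(orders.write
    .format("jdbc")
    .option("url", "jdbc:sqlserver://exampleserver.database.windows.net:1433;"
                   "database=exampledb")
    .option("dbtable", "dbo.orders_stg")
    .option("user", dbutils.secrets.get("kv-scope", "sql-user"))
    .option("password", dbutils.secrets.get("kv-scope", "sql-pass"))
    .mode("append")
    .save())
```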