Personal details

Nacim A. - Remote data engineer

Based in: 🇫🇷 France
Timezone: Paris (UTC+2)

Summary

I am a Lead Data Engineer and Cloud Architect specializing in Google Cloud Platform, with experience architecting and deploying high-performance Kubernetes clusters, developing batch and real-time ELT/ETL pipelines, and building data collection tools.

I hold Google Cloud certifications and have published scientific papers on deep learning for medical imaging (MICCAI 2020 and ISBI 2020).

My technical skills include Python, SQL, data libraries such as NumPy and Pandas, and cloud and data tooling such as BigQuery, Kubernetes, Terraform, and Airflow.

Work Experience

Cloud Architect | Lead Data Engineer
SFEIR Consulting - Data team | Jul 2020 - Present
Python
SQL
Batch File
NumPy
Pandas
Streaming
Google BigQuery
Docker
Google Cloud Platform
dbt
Apache Spark
Apache Kafka
Kubernetes
Terraform
Airflow
CI/CD
AWS (Amazon Web Services)

Architected and deployed a high-performance Kubernetes cluster to run large-scale AI models that estimate contract risk in the insurance sector. Used cloud-based analytics infrastructure to improve scalability and model processing efficiency.
Developed sales (cash receipts) pipelines to support product performance and margin analysis, handling datasets exceeding 15 TB per table. Applied advanced SQL and optimization techniques for efficient querying with dbt and BigQuery.
Built multi-threaded data collection tools and custom flat-file exporters. Automated data extraction processes, reducing manual effort and improving data accuracy.
Built ESG data ingestion at an asset management firm, developing streamlined pipelines to consolidate diverse environmental, social, and governance metrics. Optimized batch processing for large datasets and implemented rigorous data quality checks to ensure reliable insights for offline analysis.
Engineered real-time pipelines using Kafka (AWS Kinesis) and Spark to capture user events and process them with ML models. Implemented fault-tolerant, scalable solutions to ensure continuous data flow.
Designed and implemented ELT data pipelines to address supply chain challenges, including order delay analysis and out-of-stock analysis. Applied data transformation techniques to prepare data for downstream analytics.
Established a Customer Data Platform for marketing analysis, overseeing the entire lifecycle from data ingestion to data serving. Implemented data governance and quality measures to ensure reliable insights.
Led the end-to-end process of building a team of 6+ data engineers, taking an active part in hiring, onboarding, and upskilling. Fostered a collaborative, knowledge-sharing environment within the team.

Machine Learning Engineer
Siemens Healthineers - Intervention Guided-Therapy team | Mar 2019 - Jul 2020
Python
C++
NumPy
Pandas
TensorFlow
.NET
PyTorch

Goal: detect prostate cancer tissue on multi-parametric MRI images.
Researched and developed the deep learning cancer detection model and the false-positive reduction stage.
Contributed to the C++ software development stage.
Outcome: sensitivity at 2 false positives improved from 79% to 90.6%.
Published 2 papers: one at MICCAI 2020 and one at ISBI 2020.
Invention disclosure on a Dual Recursive Deep Learning architecture for false-positive reduction.
Invention disclosure on an adaptive loss that reduces the impact of wrong annotations.
FDA validation.

Education

EPITA Paris - France
Master's degree・Computer Science
Sep 2017 - Jun 2019
EPITA Paris - France
Bachelor's degree・Computer Science
Sep 2014 - Jun 2017

Personal Projects

S&P 500 Data Manipulation
2023
Python
NumPy
Matplotlib
Pandas
Exploratory analysis of financial market data for the S&P 500.
Algorithmic Quantitative Trading Platform
2023
Python
NumPy
Matplotlib
Pandas
Strategy that generates buy and sell signals from financial market data.