About The Role
We are seeking a motivated and skilled AI Data Scientist to join our growing team. You will own the entire data lifecycle, from acquisition and cleaning through processing and preparation of training data for our AI models. A key focus of the role is developing synthetic data generation techniques that augment existing datasets and improve model performance. You will also build the robust, scalable data pipelines that power our AI-driven automation solutions.
Key Responsibilities
Requirements
Essential Requirements
3+ years of experience in data science, with a focus on data preparation and feature engineering
Strong proficiency in data acquisition techniques, including web scraping, API integration, and database querying (SQL and NoSQL)
Expertise in data cleaning, normalization, and preprocessing techniques for various data types (structured, unstructured, time-series)
Proven experience in synthetic data generation techniques, including generative models (GANs, VAEs) and data augmentation methods
Solid understanding of data pipeline architectures and tools (Apache Airflow, Luigi, Prefect)
Proficiency in Python and experience with data science libraries (Pandas, NumPy, Scikit-learn)
Experience with cloud-based data storage and processing platforms (AWS S3, Google Cloud Storage, Azure Blob Storage)
Strong problem-solving and analytical skills
Preferred Qualifications
Bachelor's degree in Computer Science, Statistics, Mathematics, or a related field
Experience with big data technologies (Spark, Hadoop)
Experience with data quality monitoring and validation techniques
Experience with data governance and data privacy principles
Experience with feature stores and feature engineering platforms
Experience with data visualization tools (Tableau, Power BI)
What We Offer