We are hiring for the following role.
Role: Data Engineer Lead
Location: Remote work / India
Mandatory skills: Databricks and GenAI
Interested candidates can send their CV or any queries to anita.gokul@alphayotta.com
Job Summary
Senior-level Data Engineer / Data Analyst technical lead with experience in data analytics, Databricks, PySpark, and Python.
This is a key role that requires a senior or lead professional with strong communication skills who is proactive in risk and issue management.
Experience and Education Required
10+ years of experience as a Data Analyst, Data Engineer, or Data Scientist, with Databricks on AWS expertise in designing and implementing scalable, secure, and cost-efficient data solutions.
Job Profile:
- Hands-on data analytics experience with Databricks on AWS, PySpark, and Python
- Must have prior experience migrating a data asset to the cloud using a GenAI automation option
- Experience in migrating data from on-premises to AWS
- Expertise in developing data models and delivering data-driven insights for business solutions
- Experience in pretraining, fine-tuning, augmenting and optimizing large language models (LLMs)
- Experience in designing and implementing database solutions and developing PySpark applications that extract, transform, and aggregate data to generate insights
- Data Collection & Integration: Identify, gather, and consolidate data from diverse sources, including internal databases and spreadsheets, ensuring data integrity and relevance.
- Data Cleaning & Transformation: Apply thorough data quality checks, cleaning processes, and transformations using Python (Pandas) and SQL to prepare datasets.
- Automation & Scalability: Develop and maintain scripts that automate repetitive data preparation tasks.
- Autonomy & Proactivity: Operate with minimal supervision, demonstrating initiative in problem-solving, prioritizing tasks, and continuously improving the quality and impact of your work.
Technical Skills:
- Minimum of 10 years of experience as a Data Analyst, Data Engineer, or related role, ideally with a bachelor’s degree or higher in a relevant field.
- Strong proficiency in Python (Pandas, Scikit-learn, Matplotlib) and SQL, with experience working across various data formats and sources.
- Proven ability to automate data workflows, implement code-based best practices, and maintain documentation to ensure reproducibility and scalability.