Experience and Education Required
Strong experience as a Data Analyst, Data Engineer, or Data Scientist, with Databricks on AWS expertise in designing and implementing scalable, secure, and cost-efficient data solutions.
Job Profile:
· Hands-on data analytics experience with Databricks on AWS, PySpark, and Python
· Must have prior experience migrating a data asset to the cloud using a GenAI-based automation approach
· Experience in migrating data from on-premises to AWS
· Expertise in developing data models and delivering data-driven insights for business solutions
· Experience in pretraining, fine-tuning, augmenting and optimizing large language models (LLMs)
· Experience in designing and implementing database solutions, and in developing PySpark applications that extract, transform, and aggregate data to generate insights
· Data Collection & Integration: Identify, gather, and consolidate data from diverse sources, including internal databases and spreadsheets, ensuring data integrity and relevance.
· Data Cleaning & Transformation: Apply thorough data quality checks, cleaning processes, and transformations using Python (Pandas) and SQL to prepare datasets.
· Automation & Scalability: Develop and maintain scripts that automate repetitive data preparation tasks.
· Autonomy & Proactivity: Operate with minimal supervision, demonstrating initiative in problem-solving, prioritizing tasks, and continuously improving the quality and impact of your work.
Technical Skills:
· Hands-on experience as a Data Analyst, Data Engineer, or related role, ideally with a bachelor’s degree or higher in a relevant field.
· Strong proficiency in Python (Pandas, Scikit-learn, Matplotlib) and SQL, with experience working across various data formats and sources.
· Proven ability to automate data workflows, implement code-based best practices, and maintain documentation to ensure reproducibility and scalability.
Behavioral Skills:
· Ability to perform under tight constraints and to manage risks and issues proactively
· Requirement Clarification & Communication: Interact directly with colleagues to clarify objectives and challenge assumptions.
· Documentation & Best Practices: Maintain clear, concise documentation of data workflows, coding standards, and analytical methodologies to support knowledge transfer and scalability.
· Collaboration & Stakeholder Engagement: Work closely with colleagues who provide data, raising questions about data validity, sharing insights, and co-creating solutions that address evolving needs.
· Excellent communication skills for engaging with colleagues, clarifying requirements, and conveying analytical results in a meaningful, non-technical manner.
· Demonstrated critical thinking skills, including the willingness to question assumptions, evaluate data quality, and recommend alternative approaches when necessary.
· A self-directed, resourceful problem-solver who collaborates well with others while confidently managing tasks and priorities independently.