Looking for candidates in Mexico.
Senior Data Engineer with tech stack: AWS, Python, SQL, and dbt
Job Summary
The Senior Data Engineer will serve as the technical liaison between multiple groups, including the data science team, the engineering team, product management, and business stakeholders. No prior insurance knowledge is required; however, you must quickly dive deep into the insurance world and ask questions to become a subject matter expert. You will be responsible for building a data platform that supports the data science team, and you must be a self-starter who can build out features such as a data pipeline from scratch. There will be support from both engineering and data science for any build-out. This is a senior-level position.
Position Responsibilities
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater flexibility, etc.
- Build and maintain the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using AWS technologies, SQL, Python, Docker, and Airflow.
- Work with stakeholders including the Executive, Product, Data Science, and Engineering teams to assist with data-related technical issues and support their data infrastructure needs.
- Work with data science and analytics teams to extend data systems with greater functionality using existing infrastructure and tooling.
- Take ownership of technical project implementations, from requirements gathering through initial release and maintenance, using a Kanban approach to track key milestones.
Minimum Qualifications
- 5+ years of data engineering experience, from the requirements stage through production and maintenance
- Bachelor's degree in Computer Science or a related field, or equivalent experience
- Strong experience building integrations between external systems and a Snowflake data warehouse, preferably using custom Python code to wrangle messy data sources
- 5+ years of experience writing software in a cloud-native production environment using Python
- Experience building and maintaining cloud infrastructure, preferably with AWS cloud services: EC2, ECS, Batch, S3
- Experience with version control: git
- Experience with container technologies: Docker
- A successful history of transforming, processing, and extracting value from large, disconnected datasets from a variety of data sources (flat files, Excel, databases, APIs, etc.)
- Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management
- Experience taking hands-on technical ownership of projects ranging from small to enterprise-impacting, and leading communication with stakeholders
- Experience working in a complex, fast-moving environment, collaborating dynamically within a small team
- Strong ability to mentor, collaborate, and communicate with other team members and cross-functional stakeholders
Preferred Qualifications
- Experience working with Python packages such as SQLAlchemy and Pydantic, and writing test code with pytest
- Strong experience building, optimizing, and debugging data models, pipelines, and data warehouses using dbt
- Insurance industry systems and technology experience
- Experience with data pipeline and workflow management tools: Airflow, Jenkins, AWS Glue, Azkaban, Luigi, etc.
- Experience working with relational databases and strong query authoring (SQL), as well as working familiarity with a variety of databases (Redshift, MySQL, MSSQL, etc.)
- Strong analytical skills related to working with unstructured datasets
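
To give candidates a flavor of the day-to-day work, here is a minimal, illustrative sketch of the kind of ingestion step this role owns: validating messy external records with Pydantic before loading them into the warehouse. This is not production code from our platform, and the names (SourcePolicy, load_policies) and fields are hypothetical.

```python
# Illustrative only: a hypothetical ingestion step; names and fields are made up.
from datetime import date
from pydantic import BaseModel, ValidationError, field_validator


class SourcePolicy(BaseModel):
    """A record from a messy external feed, validated before warehouse load."""
    policy_id: str
    premium_usd: float
    effective_date: date

    @field_validator("policy_id")
    @classmethod
    def strip_id(cls, v: str) -> str:
        # External feeds often pad identifiers with whitespace.
        return v.strip()


def load_policies(raw_rows: list[dict]) -> tuple[list[SourcePolicy], list[dict]]:
    """Split raw rows into validated records and rejects kept for review."""
    valid, rejected = [], []
    for row in raw_rows:
        try:
            valid.append(SourcePolicy(**row))
        except ValidationError:
            rejected.append(row)
    return valid, rejected


if __name__ == "__main__":
    rows = [
        {"policy_id": " P-1001 ", "premium_usd": "1250.50", "effective_date": "2024-01-01"},
        {"policy_id": "P-1002", "premium_usd": "not-a-number", "effective_date": "2024-02-01"},
    ]
    ok, bad = load_policies(rows)
    print(f"{len(ok)} valid, {len(bad)} rejected")
```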