Data Engineering
The primary goal of the data engineering team is to identify the company's needs related to data analysis, science, and engineering in order to build a high-performance, self-service data platform for Analytics Engineers, Data Analysts, and Data Scientists. This platform enables them to easily build outstanding large-scale data pipelines with ready-made abstractions for reliability, scalability, performance, security, quality, profiling, documentation, and lineage, among other features. You will work within a unique and challenging big data ecosystem, focusing on storage efficiency, scalable and high-performance queries, extensibility, and flexibility, with the objective of helping to better measure data quality.
Responsibilities:
• Build and maintain a high-performance data platform that meets company needs, integrates with product solutions, and drives analytical innovation through exceptional engineering and efficient platform design.
• Align with stakeholders to understand their primary needs, while maintaining a holistic view of the problem and proposing extensible, scalable, and incremental solutions.
• Contribute to defining the strategic vision, crossing team and service boundaries to solve complex problems.
• Design, develop, and optimize frameworks and utilities for data ingestion, processing, and transformation.
• Create comprehensive documentation of platform features and operational processes.
What we are looking for:
• Understanding of Big Data technologies, solutions, and concepts (Spark, Trino, Hive, Iceberg, Delta Lake, Hudi) and related languages and formats (Python, YAML), including knowledge of how to integrate them effectively.
• Proficiency with Databricks.
• Familiarity with Data Processing Architectures (Lambda, Kappa, Event Sourcing).
• Proficiency in Python or another major programming language, with a passion for writing clean and maintainable code. You might even be a Software Engineer with a focus or passion for data-driven solutions.
• Experience with data orchestration tools (Airflow, Dagster, Prefect).
• Understanding of the data lifecycle and related concepts such as lineage, governance, privacy, retention, anonymization, etc.
• Excellent communication skills, proactively sharing information and seeking context to effectively collaborate with various teams.
• Ability to conduct code reviews and evaluations, advocating for software development best practices focused on engineering excellence and operational quality.
• Ability to provide Level 2 and Level 3 support for libraries, tools, and services.
Languages
• Fluent English communication skills, both written and spoken.
Location: Remote
Model: Service Contract (PJ - Independent Contractor)