Summary
As an Ops Data Engineer, you will play a crucial role in ensuring the reliability and integrity of our data pipelines and systems on GCP. You'll be responsible for triaging, researching, and resolving data-related incidents, performing root cause analysis, and collaborating with the data engineering team to implement sustainable solutions. This role requires a strong understanding of data processes, excellent problem-solving abilities, and effective communication and collaboration skills.
Essential Duties and Responsibilities
- Triage and prioritize incoming data incident tickets, communicating effectively with the Product Owner and other stakeholders to gather necessary information.
- Research and analyze data issues across various GCP services and data pipelines, identifying the root cause of discrepancies and failures.
- Resolve data incidents by implementing corrective actions, performing data corrections, and restarting/reprocessing data flows as needed.
- Collaborate with Senior Data Engineers and Data Architects to escalate complex issues and contribute to long-term solutions.
- Monitor data pipeline health and performance, proactively identifying potential issues and trends.
- Document incident details and resolutions, and contribute to a knowledge base of common data issues.
- Participate in on-call rotations to provide timely support for critical data incidents.
- Assist in the testing and validation of data pipeline changes and new deployments.
- Contribute to the continuous improvement of our data operations processes and tools.
- Perform light SQL and Python development as needed (a brief example of this kind of task follows this list).
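To make that concrete, here is a minimal, hypothetical sketch of a routine validation script, assuming the google-cloud-bigquery client library; the project, dataset, and table names are placeholders, not references to an actual pipeline here.

```python
# Hypothetical validation script: check that a daily BigQuery partition loaded.
# `example-project.analytics.daily_events` is a placeholder table name.
import datetime

from google.cloud import bigquery


def count_rows_for_date(client: bigquery.Client, run_date: datetime.date) -> int:
    """Return the row count for one day's slice of the (placeholder) events table."""
    query = """
        SELECT COUNT(*) AS row_count
        FROM `example-project.analytics.daily_events`
        WHERE event_date = @run_date
    """
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("run_date", "DATE", run_date)
        ]
    )
    rows = client.query(query, job_config=job_config).result()
    return next(iter(rows)).row_count


if __name__ == "__main__":
    client = bigquery.Client()  # authenticates via Application Default Credentials
    count = count_rows_for_date(client, datetime.date(2024, 1, 1))
    print(f"row_count={count}")
    if count == 0:
        raise SystemExit("No rows loaded for this date -- escalate the incident.")
```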
Qualifications
Required Education
- Bachelor’s Degree or 4 years of equivalent professional experience.
Required Experience
- 3-4 years of professional experience in data operations, data support, data analysis, or a similar role with exposure to data systems.
- Proficiency in SQL, with a good understanding of relational databases.
- Experience with Python for scripting and data manipulation.
- Familiarity with Google Cloud Platform (GCP) services related to data (e.g., BigQuery, Cloud Storage, Pub/Sub).
- Understanding of ETL/ELT processes and data flow concepts.
- Strong problem-solving and analytical skills, with the ability to diagnose data discrepancies (a brief illustration follows this list).
- Ability to work independently and as part of a collaborative team in a fast-paced environment.
- Good communication and interpersonal skills, with the ability to explain technical issues clearly.
- Experience with incident management and ticketing systems (e.g., Jira, ServiceNow).
- Proficiency with version control systems (e.g., Git, GitHub).
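As a self-contained illustration of the kind of SQL-based discrepancy diagnosis described above, the sketch below uses SQLite purely so it runs anywhere without credentials; the tables and rows are invented, and a production check would target the warehouse instead.

```python
# Illustrative discrepancy check: an anti-join listing records present in a
# source table but missing from its target. All data here is invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE target_orders (order_id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO source_orders VALUES (1, 10.0), (2, 25.5), (3, 7.25);
    INSERT INTO target_orders VALUES (1, 10.0), (3, 7.25);  -- order 2 never landed
""")

missing = conn.execute("""
    SELECT s.order_id, s.amount
    FROM source_orders AS s
    LEFT JOIN target_orders AS t USING (order_id)
    WHERE t.order_id IS NULL
""").fetchall()

print(f"{len(missing)} record(s) missing from target: {missing}")
```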
Preferred Experience
- Experience with data monitoring and alerting tools.
- Familiarity with data quality frameworks and best practices.
- Exposure to Agile development methodologies.
Other Skills Required
- Strong oral and written communication skills, including the ability to document technical issues clearly.
- Strong problem-solving and troubleshooting skills with the ability to exercise mature judgment.