Must-haves:
Python
Java or Scala
GCP (hands-on experience within the last 6-12 months)
Data structures and algorithms ability (interviews include live coding)
What you’ll do
Develop and enhance Python frameworks and libraries to support data processing, quality, lineage, governance, analysis, and machine learning operations.
Design, build, and maintain scalable and efficient data pipelines on GCP.
Implement robust monitoring, logging, and alerting systems to ensure the reliability and stability of data infrastructure.
Build scalable batch pipelines leveraging BigQuery, Dataflow, and the Airflow/Composer scheduler/executor framework on Google Cloud Platform.
Build data pipelines leveraging Scala, Pub/Sub, Akka, and Dataflow on Google Cloud Platform.
Design our data models for optimal storage and retrieval and to meet machine learning modeling needs, using technologies like Bigtable and Vertex AI Feature Store.
Contribute to shared Data Engineering tooling and standards to improve the productivity and quality of output for Data Engineers across the company.
Minimum Basic Requirements
Python Expertise: Write and maintain Python frameworks and libraries to support data processing and integration tasks.
Code Management: Use Git and GitHub for source control, code reviews, and version management.
GCP Proficiency: Extensive experience working with GCP services (e.g., BigQuery, Cloud Dataflow, Pub/Sub, Cloud Storage).
Python Mastery: Proficient in Python with experience in writing, maintaining, and optimizing data processing frameworks and libraries.
Software Engineering: Strong understanding of software engineering best practices, including version control (Git), collaborative development (GitHub), code reviews, and CI/CD.
Data Management: Deep knowledge of data modeling, ETL/ELT, and data warehousing concepts.
Problem-Solving: Excellent problem-solving skills with the ability to tackle complex data engineering challenges.
Communication: Strong communication skills, including the ability to explain complex technical details to non-technical stakeholders.
Data Science Stack: Proficiency in data analysis and familiarity with tools such as Jupyter Notebook, pandas, NumPy, and other Python data analysis libraries.
Frameworks/Tools: Familiarity with machine learning and data processing tools and frameworks such as TensorFlow, Apache Spark, and scikit-learn.
Bachelor’s or Master’s degree in Computer Science, Engineering, Computer Information Systems, Mathematics, Physics, or a related field, or completion of a software development training program.
Preferred Qualifications
Experience in Scala, Java, and/or any functional language. We code primarily in Scala, so you should be excited to either ramp up on it or continue working with it.
Experience in microservices architecture, messaging patterns, and deployment models
Experience in API design and building robust and extendable client/server contracts
Job Type: Contract
Pay: From $60.00 per hour
Work Location: Remote