Job Title: Cloud Tech Lead – Databricks
Location: Hybrid – 2 days/week at the local client office
Contract: 4+ Months
As a Tech Lead, you will work closely with business stakeholders, solution architects, product owners, scrum masters, and subject matter experts (SMEs) to understand requirements and deliver high-quality, scalable solutions on Azure Databricks.
Responsibilities Summary:
- Lead solution design and delivery for Databricks-based platforms and products (batch/streaming, BI-ready models, ML workflows).
- Translate business requirements into reference architectures, implementation plans, and sprint-ready technical stories.
- Serve as technical decision-maker for tradeoffs (cost, performance, latency, reliability, maintainability).
- Design and implement Lakehouse medallion architecture (Bronze/Silver/Gold layers) and domain-oriented data products where applicable.
- Build and optimize ETL/ELT using Apache Spark, Databricks SQL, and orchestrators (e.g., Workflows, ADF, Airflow).
- Implement streaming use cases (e.g., Spark Structured Streaming, Delta Live Tables where appropriate).
- Establish data modeling standards (star/snowflake, Data Vault where relevant) and performance tuning practices.
- Implement access controls, auditing, and governance with Unity Catalog (RBAC/ABAC patterns, lineage, data sharing policies).
- Ensure production readiness: CI/CD, monitoring/alerting, runbooks, incident response, and SLAs/SLOs.
- Drive data quality practices (tests, expectations, reconciliation, observability).
- Define MLOps standards (experiment tracking, reproducibility, champion/challenger, drift monitoring).
- Mentor engineers; conduct design reviews, code reviews, and set engineering standards.
- Partner with product owners, data owners, security, and platform teams; communicate status, risks, and options clearly.
- Contribute to hiring, onboarding, and capability building.
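The data-quality practices named above (tests, expectations, reconciliation) can be illustrated with a minimal, framework-agnostic sketch. This is not any specific library's API (real pipelines would typically use Delta Live Tables expectations or a similar framework); the function and field names here are purely hypothetical.

```python
# Minimal, framework-agnostic data-quality expectation check.
# Illustrative sketch only; all names (check_expectations, the order
# fields) are hypothetical, not part of any Databricks API.

def check_expectations(rows, expectations):
    """Return a dict mapping expectation name -> count of failing rows."""
    failures = {name: 0 for name in expectations}
    for row in rows:
        for name, predicate in expectations.items():
            if not predicate(row):
                failures[name] += 1
    return failures

# Example: two simple expectations over order records.
orders = [
    {"order_id": 1, "amount": 120.0},
    {"order_id": 2, "amount": -5.0},    # violates non-negative amount
    {"order_id": None, "amount": 30.0}, # violates non-null key
]
expectations = {
    "order_id_not_null": lambda r: r["order_id"] is not None,
    "amount_non_negative": lambda r: r["amount"] >= 0,
}
result = check_expectations(orders, expectations)
print(result)  # → {'order_id_not_null': 1, 'amount_non_negative': 1}
```

In production the failure counts would feed alerting and quarantine logic rather than a `print`, but the pattern (named predicates evaluated per row, with observable failure metrics) is the same.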
Position Requirements:
- 7+ years in data/platform/analytics engineering, including 2+ years leading technical teams or workstreams.
- Proven production delivery on Databricks (with strong Azure Databricks experience preferred).
- Strong Apache Spark expertise (PySpark/Scala): distributed processing, troubleshooting, and performance tuning.
- Deep Delta Lake knowledge: ACID tables, compaction, Z-Ordering, schema evolution, and batch/streaming patterns.
- Experience building scalable batch and streaming pipelines, including orchestration/operationalization (scheduling, dependencies, retries, idempotency).
- Strong Azure data platform background, including Azure Data Lake Storage (ADLS) architecture and best practices; familiarity with Azure Data Factory (ADF), Azure Synapse Analytics, and related services.
- Advanced programming skills in Python and SQL, plus hands-on Java experience (e.g., integrations/services or Spark/platform utilities).
- Cloud fundamentals in at least one major platform (Azure, AWS, or Google Cloud Platform (GCP)): identity and access management (IAM), storage, networking basics, and cost controls.
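The orchestration qualities the requirements call out (retries and idempotency) can be sketched in a few lines of plain Python. This is a hedged illustration, not a real orchestrator API: `run_step`, the completion set, and `flaky_load` are all hypothetical stand-ins for durable markers and scheduler retry policies.

```python
# Sketch of an idempotent, retryable pipeline step (all names hypothetical).
# Idempotency: re-running a step that already succeeded is a no-op, so an
# orchestrator retry cannot double-write the same partition.
import time

def run_step(step_name, completed, work, max_retries=3, backoff_s=0.0):
    """Run `work()` up to `max_retries` times; skip if already completed."""
    if step_name in completed:           # idempotent: already done
        return "skipped"
    for attempt in range(1, max_retries + 1):
        try:
            work()
            completed.add(step_name)     # a durable marker in a real system
            return "succeeded"
        except Exception:
            if attempt == max_retries:
                raise                    # surface the failure to the scheduler
            time.sleep(backoff_s)        # exponential backoff in practice

# Example: a flaky step that fails once, then succeeds.
state = {"calls": 0}
def flaky_load():
    state["calls"] += 1
    if state["calls"] == 1:
        raise RuntimeError("transient failure")

completed = set()
first = run_step("load_bronze", completed, flaky_load)   # retried, then succeeds
second = run_step("load_bronze", completed, flaky_load)  # no-op on re-run
print(first, second)  # → succeeded skipped
```

Production systems persist the completion marker transactionally (e.g., alongside the data write) rather than in memory, which is what makes scheduler-driven retries safe.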