What you will be doing:
- Design, build, and operate reliable ETL/ELT and ingestion pipelines moving data from transactional systems into analytics/AI-ready platforms.
- Improve end-to-end data-lake foundations: storage layout, partitioning, schema evolution/versioning, lineage, cataloging, and delta synchronization.
- Build and operate event-driven data flows that power real-time integrations and AI agent orchestration.
- Help scale retrieval workflows (vector storage/indexing, embedding pipelines, RAG-adjacent data flows) that support production-grade AI capabilities.
- Strengthen reliability across services and pipelines: retries, backoff, DLQs, idempotency, reconciliation, and operational observability.
- Lead pragmatic modernization: reduce accidental coupling between business logic and infrastructure, improve contracts, and make systems easier to run locally and operate in production.
- Partner across Platform/AI/DevOps/Product; lead proof-of-concepts and translate results into durable platform capabilities.
- Participate in an on-call rotation and drive post-incident improvements (post-mortems, root cause analysis, and prevention).
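The reliability patterns named above (retries, backoff, DLQs, idempotency) can be sketched in a few lines. This is a toy illustration only: `DeadLetterQueue`, `process_with_retries`, and the in-memory `seen_ids` set are hypothetical stand-ins for queue- and store-backed equivalents (e.g. SQS + DynamoDB), not this team's implementation.

```python
import time

class DeadLetterQueue:
    """In-memory stand-in for a real DLQ (e.g. an SQS dead-letter queue)."""
    def __init__(self):
        self.messages = []

    def publish(self, msg):
        self.messages.append(msg)

def process_with_retries(message, handler, dlq, seen_ids,
                         max_attempts=3, base_delay=0.01):
    """Idempotent consumer: skip duplicates, retry with exponential
    backoff on failure, and route poison messages to the DLQ."""
    if message["id"] in seen_ids:                 # idempotency: drop replays
        return "duplicate"
    for attempt in range(max_attempts):
        try:
            handler(message)
            seen_ids.add(message["id"])           # record success only
            return "ok"
        except Exception:
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    dlq.publish(message)                          # retries exhausted
    return "dead-lettered"
```

A failed message never lands in `seen_ids`, so a later replay (after a fix) is still processed rather than silently skipped.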
You’re a great fit if you match one of these profiles:

Profile A (most common): Backend/platform engineer who is strong in JVM distributed systems and has shipped real data workflows (Spark/EMR/Glue exposure).
Profile B: Data engineer who has built pipelines and is comfortable owning services, async messaging semantics, and production operations—not only transformations.
What you’ll bring:
- Strong software engineering fundamentals: designing maintainable, testable systems and owning features end-to-end.
- Production experience with distributed systems: async workflows, failure modes, retries, and eventual consistency.
- Hands-on experience building and owning ETL/ELT pipelines, including ingestion from OLTP sources into a data lake.
- Experience operating data systems in production: monitoring, incident response, and continuous improvement.
- Cloud experience on AWS building production systems (not just using services): storage + messaging + orchestration.
- Strong collaboration and communication; ability to mentor and raise engineering maturity through reviews and design discussions.
- Strong English-language communication and collaboration skills.
Strongly preferred (high-signal):
- JVM-first data processing experience (Java/Kotlin/Scala) with Spark-based workloads.
- Experience with schema evolution and data contracts (versioning strategies, backfills, compatibility).
- Operational ownership of pipeline reliability: replay safety, DLQ patterns, reconciliation, lineage thinking.
- IaC experience (CDK preferred; CloudFormation/Terraform acceptable).
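The schema-evolution and data-contract experience called out above often boils down to compatibility checks of the kind a schema registry enforces. A minimal sketch, assuming a simplified dict-based schema shape; `is_backward_compatible` is a hypothetical toy, not a real registry API:

```python
def is_backward_compatible(old_schema, new_schema):
    """Toy compatibility check: data written under old_schema must remain
    readable under new_schema. Fields may not be removed or change type,
    and any newly added field must be optional."""
    for name, spec in old_schema.items():
        if name not in new_schema:
            return False                    # field removed
        if new_schema[name]["type"] != spec["type"]:
            return False                    # type changed
    for name, spec in new_schema.items():
        if name not in old_schema and spec.get("required", False):
            return False                    # new required field, no safe default
    return True
```

Real systems (Avro/Glue schema registries, for example) distinguish backward, forward, and full compatibility; this sketch covers only the first, and real checks also handle defaults, unions, and nested types.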
The Tech Stack You’ll Work With:
- JVM services (Java 21+ / Spring microservices) and some Python.
- AWS: EKS, Lambda; storage/messaging/catalog primitives (S3, DynamoDB, SNS/SQS, Lake Formation, Glue Catalog).
- Search/retrieval: OpenSearch Serverless and related vector storage/retrieval components.
- Tooling: GitHub/GitHub Actions, Nx monorepo, Jira/Confluence.
Why this role exists
Caseware is evolving Caseware Cloud to deliver intelligent, data-driven experiences—powering analytics, automation, and AI/agentic capabilities on top of a modern data platform.
This role is for someone who can bridge transactional backend systems and data-intensive distributed workflows. You’ll work on systems that combine:
- APIs and domain services (microservices, relational modeling, service boundaries)
- Asynchronous workflows (messaging, retries, idempotency, replay safety)
- Distributed/batch data processing (Spark-based processing and lake patterns)
- Cloud platform primitives (AWS orchestration and managed services)
- AI-ready retrieval workflows (embedding + vector retrieval pipelines)
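The last item above, embedding plus vector retrieval, reduces to "embed documents, embed the query, rank by similarity." A self-contained toy sketch: the `embed` function here is a deterministic hash-based stand-in for a real embedding model, used purely so the retrieval step is runnable.

```python
import math

def embed(text, dim=8):
    """Toy 'embedding': hash character bigrams into a fixed-size vector.
    A stand-in for a real embedding model, for illustration only."""
    vec = [0.0] * dim
    for a, b in zip(text, text[1:]):
        vec[(ord(a) * 31 + ord(b)) % dim] += 1.0
    return vec

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def retrieve(query, corpus, top_k=2):
    """Rank corpus documents by cosine similarity to the query embedding."""
    q = embed(query)
    return sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:top_k]
```

In production the sorted scan is replaced by an approximate nearest-neighbor index (the role mentions OpenSearch Serverless for this), but the data flow is the same.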
What success looks like (first 6–12 months):
- Improved reliability and operability of ingestion + async workflows (clearer idempotency/replay patterns, fewer recurring incidents).
- Cleaner boundaries between orchestration/control-plane concerns and data-processing execution concerns.
- Better observability across APIs, queues, workflows, and distributed jobs.
- Clearer data contracts and more predictable schema evolution practices.
- Tangible improvements in developer experience (local run, testing, reduced “environment-only” hacks).
Perks & Benefits:
- Indefinite-term contract (“contrato a término indefinido”) with all legal benefits
- Prepaid Medicine
- Life insurance and funeral assistance
- Internet allowance
- Home office stipend
- Competitive compensation — above the market average
- 100% remote work environment and an excellent work-life balance
- Opportunity to work for a growing global SaaS leader
- A culture that promotes independence, innovation, trust, and accountability
- Room to be creative, innovate, and strategize for the future
- Mentorship by highly experienced professionals
- Training budget: we want you to grow
- 5 Personal Time Off days per year
- Sick-leave top-up to 100% of salary, paid by the employer from day 3 to day 90
- Recognition award: additional paid time off marking each year of service
- Vacation upgrade starting at 5 years of service