Public Job Description
Role: Senior Reliability & Integrations Engineer
Location: Remote (Global)
Type: Full-time
The Mission
Aura is moving beyond being a standalone tool; we are building the central intelligence layer for the sales stack. We are looking for a Systems Architect to own two critical domains:
- The Golden Revenue Loop: Ensuring every booking, recording, and transcript is captured and attributed to revenue without fail.
- The Integration Ecosystem: Architecting the "Universal Adapter" that allows Aura to rapidly integrate with CRMs, Dialers, and Partner tools.
You won't just be scripting API connections one by one. You will build the Integration Engine—the standardized infrastructure for OAuth management, data normalization, and rate limiting—that allows our partnership strategy to scale from 5 integrations to 50.
The Stack
- Core: Next.js 16 (Server Actions), TypeScript (Strict), Node.js.
- Data: Supabase (PostgreSQL + RLS), Upstash Redis (Distributed State).
- Async/Queue: Inngest (Serverless background jobs for throttling & reliability).
- Observability: Sentry (Distributed Tracing, OpenTelemetry).
- Architecture: Adapter Patterns, Repository Pattern, Monorepo.
What You Will Do
- Architect the Integration Engine: Move us away from bespoke, hardcoded integrations. You will design a standardized Adapter Pattern to normalize data from Salesforce, HubSpot, and Pipedrive into Aura’s core schema.
- Partner API & Webhooks: As we launch partnerships, you will design our outbound webhooks and API surface, ensuring our partners have a reliable, secure way to consume Aura’s intelligence.
- Reliability & Resilience: Implement Circuit Breakers and Distributed Rate Limiting (via Upstash/Inngest) to ensure that if a Partner API goes down or throttles us, it never crashes our core application.
- The "Golden Path": Own the pipeline from Booking (Nylas) → Recording → Transcript → Payment (Stripe). You will ensure that async events are never lost or double-processed, using idempotency keys to make retries safe and dead-letter queues to capture events that still fail.
- Observability: Instrument the integration layer with OpenTelemetry. You should be able to look at a dashboard and see exactly which CRM integration is experiencing high latency or auth failures.
Who You Are
- An Abstractionist: You hate repeating code. When asked to build an integration, you don't just write a script; you build a framework that makes the next integration 10x faster to build.
- Async Native: You understand that third-party APIs are unreliable. You architect systems that assume failure, implementing retries, backoffs, and eventual consistency by default.
- Security-First: You understand the complexities of OAuth token management (refresh rotation, scope validation) and multi-tenant data isolation (RLS).
Internal Interview Canvas
Topic: Integration Architecture (The "Ecosystem" Question)
The Question:
"We need to integrate with 5 different CRMs (Salesforce, HubSpot, etc.). Each has different rate limits, data shapes, and auth flows. How do you architect a system to handle this without writing 5 completely separate codebases?"
- ✅ Green Flags:
  - Adapter Pattern: Mentions creating a standard CrmProvider interface that all integrations must implement.
  - Data Normalization: Suggests converting external data to an internal 'Aura' format immediately upon ingestion (ETL).
  - Unified Queuing: Discusses using a shared Job Queue (Inngest) that respects per-provider/per-tenant rate limits.
- 🚩 Red Flags:
  - Suggests writing a separate folder/service for each integration with no shared logic.
  - "Just hit their APIs directly from the frontend."
  - Ignores data normalization (tightly coupling our DB to Salesforce's schema).
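For interviewer reference, a strong answer might look roughly like the sketch below. It is only an illustration: the AuraContact shape, the adapter objects, and the ingest helper are hypothetical names, and the Salesforce and HubSpot payloads are simplified to a few fields.

```typescript
// Hypothetical internal format; field names are assumptions, not Aura's real schema.
interface AuraContact {
  provider: string;
  externalId: string;
  email: string;
  fullName: string;
}

// The single interface every CRM integration implements.
interface CrmProvider<Raw> {
  readonly name: string;
  // Convert one provider-specific record into the internal format.
  normalizeContact(raw: Raw): AuraContact;
}

// Simplified payload shapes, for illustration only.
interface SalesforceContact {
  Id: string;
  Email: string;
  FirstName: string;
  LastName: string;
}

interface HubSpotContact {
  id: string;
  properties: { email: string; firstname: string; lastname: string };
}

const salesforceAdapter: CrmProvider<SalesforceContact> = {
  name: "salesforce",
  normalizeContact(raw) {
    return {
      provider: "salesforce",
      externalId: raw.Id,
      email: raw.Email.toLowerCase(),
      fullName: `${raw.FirstName} ${raw.LastName}`.trim(),
    };
  },
};

const hubspotAdapter: CrmProvider<HubSpotContact> = {
  name: "hubspot",
  normalizeContact(raw) {
    return {
      provider: "hubspot",
      externalId: raw.id,
      email: raw.properties.email.toLowerCase(),
      fullName: `${raw.properties.firstname} ${raw.properties.lastname}`.trim(),
    };
  },
};

// Shared ingestion code depends only on the interface, never on a specific CRM.
function ingest<Raw>(provider: CrmProvider<Raw>, records: Raw[]): AuraContact[] {
  return records.map((record) => provider.normalizeContact(record));
}
```

The point to listen for is that everything downstream of ingest sees only AuraContact, so adding a sixth CRM means writing one more adapter rather than touching shared code.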
Topic: Reliability & Event Correlation
The Question:
"A Stripe 'payment_success' webhook arrives 4 days after a Zoom call took place. How do we architect the DB and logic to map that payment to the specific sales transcript automatically?"
- ✅ Green Flags: Mentions Idempotency Keys. Suggests a "pending" state or fuzzy matching via email/customer ID. Uses a background job to reconcile orphaned records.
- 🚩 Red Flags: "Just query the DB for the email immediately." (Fails to account for duplicates or timing). Thinks this must happen synchronously.
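A minimal sketch of the shape a good answer tends to take, for interviewer reference. Everything here is an assumption for illustration: the PaymentCorrelator class, the record fields, the 30-day lookback window, and the in-memory storage (a real system would persist this state in the database and run reconcile as a background job).

```typescript
interface CallRecord {
  transcriptId: string;
  customerEmail: string;
  endedAt: Date;
}

interface PaymentEvent {
  eventId: string; // the webhook's unique event ID, used as the idempotency key
  customerEmail: string;
  paidAt: Date;
}

type MatchResult =
  | { status: "duplicate" }
  | { status: "matched"; transcriptId: string }
  | { status: "pending" };

const LOOKBACK_DAYS = 30; // assumed matching window, not a value from the source

class PaymentCorrelator {
  private readonly calls: CallRecord[] = [];
  private readonly seenEventIds = new Set<string>();
  private pending: PaymentEvent[] = [];

  addCall(call: CallRecord): void {
    this.calls.push(call);
  }

  // Webhook handler: deduplicate first, then match or park the payment as pending.
  handlePayment(event: PaymentEvent): MatchResult {
    if (this.seenEventIds.has(event.eventId)) return { status: "duplicate" };
    this.seenEventIds.add(event.eventId);
    const call = this.findCall(event);
    if (call) return { status: "matched", transcriptId: call.transcriptId };
    this.pending.push(event);
    return { status: "pending" };
  }

  // Background job: retry matching for payments that had no call when they arrived.
  reconcile(): Array<{ eventId: string; transcriptId: string }> {
    const matches: Array<{ eventId: string; transcriptId: string }> = [];
    const stillPending: PaymentEvent[] = [];
    for (const event of this.pending) {
      const call = this.findCall(event);
      if (call) matches.push({ eventId: event.eventId, transcriptId: call.transcriptId });
      else stillPending.push(event);
    }
    this.pending = stillPending;
    return matches;
  }

  // Most recent call for the same email that ended before the payment, within the window.
  private findCall(event: PaymentEvent): CallRecord | undefined {
    const windowMs = LOOKBACK_DAYS * 24 * 60 * 60 * 1000;
    const email = event.customerEmail.toLowerCase();
    const paidAt = event.paidAt.getTime();
    return this.calls
      .filter((c) => {
        const endedAt = c.endedAt.getTime();
        return (
          c.customerEmail.toLowerCase() === email &&
          endedAt <= paidAt &&
          paidAt - endedAt <= windowMs
        );
      })
      .sort((a, b) => b.endedAt.getTime() - a.endedAt.getTime())[0];
  }
}
```

Matching on email alone is deliberately naive here; candidates who raise a stable customer ID, or a booking reference carried through checkout metadata, are describing a stronger correlation key.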
Topic: Resilience (Circuit Breaking)
The Question:
"HubSpot's API starts returning 500 errors for 10% of our requests. How does your system handle this to prevent crashing our own background jobs?"
- ✅ Green Flags:
  - Circuit Breaker: Mentions "opening the circuit" (stop calling HubSpot) for a set time after X failures.
  - Exponential Backoff: Retrying with increasing delays.
  - Distributed State: Using Redis to track failure counts across serverless functions.
- 🚩 Red Flags: "I'll just catch the error and log it." (Doesn't prevent resource exhaustion). "I'll increase the timeout."
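As a reference point for this topic, the core mechanics can be sketched as below. This is a simplified, in-memory illustration under stated assumptions: the CircuitBreaker class, its thresholds, and backoffDelayMs are hypothetical, it counts consecutive failures rather than a failure rate, and in a serverless deployment the counters would have to live in shared state such as Redis rather than in process memory.

```typescript
type CircuitState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private consecutiveFailures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  // `now` is injectable so the cooldown can be tested without real waiting.
  constructor(
    failureThreshold: number,
    cooldownMs: number,
    now: () => number = () => Date.now(),
  ) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  get state(): CircuitState {
    if (this.openedAt === null) return "closed";
    // After the cooldown, allow a trial request through ("half-open").
    return this.now() - this.openedAt >= this.cooldownMs ? "half-open" : "open";
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.openedAt = null;
  }

  recordFailure(): void {
    this.consecutiveFailures += 1;
    // A failed half-open probe, or too many consecutive failures, (re)opens the circuit.
    if (this.state === "half-open" || this.consecutiveFailures >= this.failureThreshold) {
      this.openedAt = this.now();
    }
  }

  // Wrap a provider call: fail fast while open, otherwise record the outcome.
  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      throw new Error("circuit open: provider call skipped");
    }
    try {
      const result = await fn();
      this.recordSuccess();
      return result;
    } catch (err) {
      this.recordFailure();
      throw err;
    }
  }
}

// Exponential backoff: the delay doubles per attempt (attempt 0 = base) up to a cap.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

Note that with a 10% error rate, a consecutive-failure threshold rarely trips; a candidate who points this out and proposes a rolling failure-rate window, plus jitter on the backoff, is showing the depth this question is looking for.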
Topic: Security (Multi-Tenant Integrations)
The Question:
"We are processing a webhook from Salesforce for Org A. How do you ensure that data doesn't accidentally get written to Org B, even if there is a bug in the application logic?"
- ✅ Green Flags: Mentions Row Level Security (RLS). Even if the code tries to write to the wrong org, the Database Policy should reject it based on the execution context.
- 🚩 Red Flags: Relies entirely on WHERE organization_id = x in the application code. Does not mention database-level protection.