BuzzTrail.ai

DevOps Engineer — Azure AWS

Location

Remote anywhere

Salary Estimate

N/A

Seniority

N/A

Tech stacks

Azure
Amazon
AI
+28

Contract role
4 days ago

Note: You must have direct, hands-on DevOps/MLOps experience; everything else is optional.

BuzzTrail is an AI sales companion: realistic video avatars conduct live product demos with real-time voice conversations. We're early, profitable, and building the foundation to scale to hundreds of concurrent meetings. We run two Nx monorepos: a main platform (React, Express, Python/FastAPI) and a RAG knowledge base. We have startup credits on AWS, Azure, and GCP. Azure is the primary cloud candidate for infrastructure, but we route AI services across all three to burn credits instead of cash.

You're not inheriting someone else's infrastructure — you're designing it from scratch. The decisions you make now become the platform a unicorn runs on.

How This Works

  • ~10 hours/week, async-first. You own the plan and the pace. No standups, no ticket grooming, no process overhead eating your hours. One 30-minute weekly sync with the CTO, everything else via async comms. You decide what to work on each week within the agreed quarterly priorities.
  • Equity conversation welcome. We want you invested in the outcome, not just billing hours.
  • Path to grow. Fractional now, with expanded scope as we scale. This can become whatever makes sense for both sides.

The Job

Phase 1: SOC 2 Foundation (Immediate Priority)

We're pursuing SOC 2 Type II certification using Vanta's Workstreet Sprint — the compliance platform and audit framework are already chosen. Your job is implementing the technical controls, not picking the tool. The infrastructure question remains: migrate to Azure, harden what we have on Railway/Supabase/Cloudflare, or some combination.

  • Infrastructure strategy: 7 production services run on Railway today. Azure is the primary candidate if we consolidate (AKS/Container Apps, Azure DB for PostgreSQL, Functions, Front Door, Key Vault), but staying on current PaaS is on the table if it meets compliance. You own the recommendation.
  • IaC: whatever runs on Azure gets Terraform/Bicep. PaaS stays managed through its own tooling.
  • Audit controls: centralized logging, immutable audit trails, change management, evidence collection automation.
  • Access & encryption: least-privilege policies, MFA enforcement, key rotation for 19+ vendor API keys, secrets management (Key Vault), encryption at rest and in transit.
  • Cost optimization: maximizing credits across Azure (infra + AI), AWS (AI services), and GCP (AI services).
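
To give a flavor of the key-rotation item above, here is a minimal Python sketch of a staleness check that could feed evidence collection. It is an illustration only, not our actual tooling: the `VendorKey` record, the `keys_due_for_rotation` helper, the 90-day window, and the sample dates are all hypothetical assumptions. In practice the inventory would come from wherever secrets live (Key Vault, for example).

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class VendorKey:
    """Metadata for one vendor API key (hypothetical record shape)."""
    vendor: str
    last_rotated: date


def keys_due_for_rotation(keys, today, max_age_days=90):
    """Return keys rotated before the cutoff date, oldest first.

    The 90-day default is an assumed policy, not a SOC 2 requirement.
    """
    cutoff = today - timedelta(days=max_age_days)
    overdue = [k for k in keys if k.last_rotated < cutoff]
    return sorted(overdue, key=lambda k: k.last_rotated)


if __name__ == "__main__":
    # Made-up inventory for illustration only.
    inventory = [
        VendorKey("deepgram", date(2025, 1, 10)),
        VendorKey("elevenlabs", date(2025, 5, 1)),
        VendorKey("pinecone", date(2024, 11, 3)),
    ]
    for key in keys_due_for_rotation(inventory, today=date(2025, 6, 1)):
        print(f"{key.vendor}: last rotated {key.last_rotated.isoformat()}")
```

A scheduled job could run a check like this and store the result as audit evidence, though how evidence actually gets collected is part of what you would design.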

Phase 2: ML Data Infrastructure (Next Quarter)

Building the datasets and pipelines a future data scientist needs to fine-tune models and improve conversation quality.

  • Data collection: every voice conversation generates transcripts, RAG queries, LLM inputs/outputs, embeddings, STT/TTS events, and tool calls. Capture, structure, and store for ML training and evaluation.
  • LLM/AI service management: cost tracking, failover, and routing across providers. Credits on Azure (Azure OpenAI, Azure AI Speech, Azure AI Search), AWS (Bedrock, Transcribe, Polly), and GCP (Vertex AI, Cloud Speech/TTS), with automatic failover between clouds.
  • RAG pipeline: web scraping → chunking → embeddings → vector search. Multiple ingestion sources, namespace partitioning, hybrid search, dedup.
  • Dataset management: currently Langfuse. Whether we stay, move to Hugging Face datasets, Azure ML, or a combination is your call.
  • PII handling: anonymization, redaction, and access controls baked into the pipeline.
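
As a rough illustration of the chunking and dedup steps in the RAG pipeline item, here is a minimal Python sketch. It is not our implementation: the `chunk_text` and `dedup_chunks` helpers, the word-window sizes, and the exact-match hashing strategy are assumptions chosen for brevity. A production pipeline would more likely chunk by tokens or document structure and might use near-duplicate detection.

```python
import hashlib


def chunk_text(text, max_words=200, overlap=40):
    """Split text into overlapping word windows (assumed sizes)."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def dedup_chunks(chunks):
    """Drop chunks that match an earlier one after normalizing case and whitespace."""
    seen = set()
    unique = []
    for chunk in chunks:
        normalized = " ".join(chunk.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(chunk)
    return unique


if __name__ == "__main__":
    page = " ".join(f"word{i}" for i in range(10))
    pieces = chunk_text(page, max_words=4, overlap=1)
    print(dedup_chunks(pieces + pieces[:1]))
```

The surviving chunks would then go on to embedding and vector indexing. Hashing normalized text only catches exact repeats, which is often enough when the same page is scraped from several ingestion sources.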

Phase 3: Observability & CI/CD Hardening (Ongoing)

  • CI/CD: GitHub Actions or Azure DevOps across two Nx monorepos (~40 projects): lint, typecheck, test, build, deploy. Nx affected builds, caching, deployment gates and rollback.
  • Real-time: voice agent autoscaling, video avatar lifecycle, Supabase Realtime for slide control, meeting bot recording archival.
  • Monitoring: Langfuse (LLM tracing), Grafana (operational dashboards), Sentry (error tracking). May consolidate or move to Azure Monitor depending on your infra recommendations.

You Have

  • Strong Azure experience — you know what to use and what to skip. Multi-cloud AI service routing (Bedrock, Vertex AI, Azure OpenAI) is a plus.
  • IaC (Terraform or Bicep).
  • SOC 2 Type II — at least one audit cycle, ideally built controls from scratch.
  • ML dataset infrastructure — data collection pipelines for model training, evaluation, or fine-tuning.
  • LLM/AI service management — multiple providers, cost tracking, failover, model routing.
  • CI/CD pipelines for monorepos or complex build systems.

Bonus

  • Cloudflare Workers, KV, Durable Objects.
  • Real-time voice/video infrastructure (LiveKit, WebRTC).
  • Multi-cloud AI service management (routing across Azure, AWS, GCP).
  • Azure OpenAI Service (managed endpoints, PTU provisioning, content filtering).

Roadmap (What You're Building Toward)

  • Staging environment with separate database and vendor staging keys.
  • LLM abstraction layer with failover and cost-optimized routing.
  • DSPy per-client compiled models with composite evaluation metrics.
  • Multi-framework compliance (SOC 2, ISO 27001, ISO 42001, GDPR, CCPA, EU AI Act).
  • Load testing for 100+ concurrent meetings.
  • Public REST API.
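
For the LLM abstraction layer item, one possible shape is sketched below in Python. Treat it as a thought-starter under stated assumptions rather than a design: the `Provider` record, `complete_with_failover`, and the per-1k-token cost field are hypothetical, and the `call` functions stand in for real SDK clients (Azure OpenAI, Bedrock, Vertex AI) that are not shown.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Provider:
    """One LLM backend (hypothetical shape; `call` wraps a real SDK client)."""
    name: str
    cost_per_1k_tokens: float  # assumed effective cost after credits
    call: Callable[[str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider raised an error."""


def complete_with_failover(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try providers from cheapest to most expensive; return (provider name, reply)."""
    errors = []
    for provider in sorted(providers, key=lambda p: p.cost_per_1k_tokens):
        try:
            return provider.name, provider.call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers failover
            errors.append(f"{provider.name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


if __name__ == "__main__":
    def flaky(prompt: str) -> str:
        raise TimeoutError("simulated outage")

    def healthy(prompt: str) -> str:
        return f"echo: {prompt}"

    # Stub providers with made-up costs, for illustration only.
    pool = [Provider("stub-b", 0.5, healthy), Provider("stub-a", 0.1, flaky)]
    print(complete_with_failover("hello", pool))
```

Sorting by cost and falling through on errors is the simplest policy that covers both "failover" and "cost-optimized"; a real layer would also need timeouts, retries, per-provider usage tracking, and streaming support.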

Current Stack

  • Frontend: React 19, Vite 8, TailwindCSS, HeroUI
  • Backend: Express 5, FastAPI, Hono (Cloudflare Workers)
  • Voice/Video: LiveKit (evaluating alternatives), Deepgram, ElevenLabs
  • LLM: Multiple providers (model-agnostic), DSPy, LangChain
  • RAG: Pinecone, Firecrawl, OpenAI embeddings
  • Database: Supabase (PostgreSQL + Realtime + Auth + Storage)
  • Hosting: Railway, Cloudflare, GitHub Pages
  • Monitoring: Sentry, Langfuse, Grafana
  • Build: Nx, pnpm, Vitest, Playwright, Husky

© Copyright 2026 Arc