Arc Exclusive

Tech Lead Architect (AI Infrastructure & Distributed Systems) - Perm - US/EU/UK

Location

Remote restrictions apply

Salary

US$120K - 180K

Min. experience

5+ years

Required skills

Software architecture · Distributed Systems Engineering · GPU · CUDA · Microservices · Cloud

Full-time role
Posted 5 hours ago

Tech Lead Architect - AI Infrastructure & Distributed Systems
Full-Time · Remote or Hybrid · Founding Team

About Us

We are building next-generation AI infrastructure of ultra-fast model inference, scalable LLM hosting, evals, model routing, observability, and developer-friendly APIs.
We are looking for a Tech Lead Architect with deep experience in distributed systems, ML serving, high-performance compute, GPU cluster architecture, and cloud-scale engineering. Someone capable of defining our technical vision, designing core systems from scratch, and leading engineering as we scale.
If your experience resembles strong technical architecture, cloud infra, AI systems, large-scale compute — this role is for you.

Role Overview

As the Tech Lead Architect, you will be responsible for the foundational architecture of our AI platform. You will work hands-on while also guiding long-term technical direction and building key systems that power our platform.

This is a founding-level role with extremely high ownership.
You will lead architecture for:
● LLM inference + serving stack
● Multi-GPU orchestration, scheduling, routing
● Distributed systems for large-scale model hosting
● High-throughput, low-latency developer APIs
● Observability, logging, monitoring, evals
● Cloud infra automation and cost-efficient scaling

What You’ll Do

  1. Systems Architecture (0 → 1)
    ● Architect the overall AI serving platform (model execution engine, routing, safety, observability).
    ● Design multi-node LLM inference pipelines optimized for throughput, latency, and cost.
    ● Implement architectural frameworks that support thousands of concurrent model requests.
    ● Establish core engineering principles and technical direction.
  2. GPU / High-Performance Compute Architecture
    ● Define GPU cluster layout, scheduling strategies, sharding, and resource isolation.
    ● Optimize performance across heterogeneous GPU fleets (A100, H100, L40, 4090, etc.).
    ● Lead decisions around vLLM, TensorRT-LLM, DeepSpeed-Inference, or custom kernels.
  3. Distributed Systems & Platform Engineering
    ● Architect distributed compute layers, RPC frameworks, autoscaling, and fault tolerance.
    ● Lead decisions across data plane, control plane, orchestration, and microservices structure.
    ● Build high-availability systems that serve AI workloads reliably at scale.
  4. API Platform Architecture
    ● Design clean, developer-first APIs for inference, embeddings, fine-tuning, and model management.
    ● Work with product teams to define how developers interact with the platform.
    ● Architect logging, token accounting, rate limiting, and streaming protocols (see the sketch after this list).
  5. Technical Leadership
    ● Make key architectural decisions that define the company’s long-term technical roadmap.
    ● Mentor and guide engineers (backend, infra, ML, frontend).
    ● Interview, hire, and help scale the engineering team.
    ● Work directly with the founders on strategy, vision, and roadmap.
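
To make item 4 a little more concrete, here is a minimal, illustrative sketch of per-API-key rate limiting and token accounting in Python. Every class, name, and number here is hypothetical and not taken from an existing codebase; a production version would sit behind the API gateway and persist usage durably.

```python
import time
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """Token-bucket rate limiter: refills at `rate` requests/sec, bursts up to `capacity`."""
    rate: float
    capacity: float
    tokens: float = 0.0
    updated_at: float = field(default_factory=time.monotonic)

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated_at) * self.rate)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class UsageLedger:
    """Tracks total token usage per API key for billing and quota enforcement."""

    def __init__(self) -> None:
        self._usage: dict[str, int] = {}

    def record(self, api_key: str, prompt_tokens: int, completion_tokens: int) -> None:
        self._usage[api_key] = self._usage.get(api_key, 0) + prompt_tokens + completion_tokens

    def total(self, api_key: str) -> int:
        return self._usage.get(api_key, 0)


# Usage sketch: gate a request, then account for the tokens it consumed.
bucket = TokenBucket(rate=5.0, capacity=10.0, tokens=10.0)
ledger = UsageLedger()
if bucket.allow():
    ledger.record("key-123", prompt_tokens=42, completion_tokens=128)
print(ledger.total("key-123"))  # -> 170
```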

What We’re Looking For

Must-Have
● 7+ years of experience in software engineering, infrastructure, or systems architecture
● Strong experience with:
  ○ Distributed systems + microservices
  ○ GPU programming, CUDA, or model inference
  ○ Cloud infrastructure (AWS/GCP/Azure)
  ○ Kubernetes, Ray, or container orchestration
  ○ High-scale backend systems and APIs
● Proven ability to architect large systems end to end
● Experience with high-performance systems (latency, throughput, batching, caching)
● Strong instincts around reliability, scalability, and cost-efficiency

Nice-to-Have
● Experience building or contributing to inference frameworks (vLLM, TensorRT-LLM, TGI)
● Deep understanding of LLM internals, KV cache, quantization, tensor parallelism
● Experience with data streaming, tracing, profiling, or log-based architectures
● Experience with ML training, fine-tuning pipelines, or HF ecosystem
● Startup/founding experience or appetite for zero-to-one environments
● Background in cloud cost optimization or infra financial modeling

Example Challenges You Might Work On

● Architect an entire LLM serving platform that rivals Fireworks.ai throughput
● Build distributed multi-node inference with near-linear scaling
● Design a routing layer that chooses the best model based on latency/cost/accuracy (see the sketch after this list)
● Optimize GPU clusters for maximum tokens per dollar
● Create a unified logging + observability system for AI workloads
● Architect a fine-tuning and evals platform integrated with the serving layer
● Build a blueprint for expanding globally across multi-region data centers
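
As a purely illustrative example of the routing challenge above, the sketch below scores candidate models by latency, cost, and accuracy; the model names, metrics, and weights are invented for illustration only, not measurements from our platform.

```python
from dataclasses import dataclass


@dataclass
class ModelProfile:
    """Observed serving characteristics for one deployed model (illustrative numbers)."""
    name: str
    p95_latency_ms: float      # tail latency from live metrics
    cost_per_1k_tokens: float  # blended USD cost
    accuracy: float            # 0..1 score from offline evals


def route(models: list[ModelProfile],
          latency_budget_ms: float,
          w_latency: float = 0.3,
          w_cost: float = 0.3,
          w_accuracy: float = 0.4) -> ModelProfile:
    """Pick the model with the best weighted latency/cost/accuracy trade-off.

    Models that exceed the latency budget are filtered out first; the rest are
    scored on normalized metrics (lower latency and cost score higher).
    """
    candidates = [m for m in models if m.p95_latency_ms <= latency_budget_ms]
    if not candidates:
        candidates = models  # degrade gracefully rather than failing the request

    max_latency = max(m.p95_latency_ms for m in candidates)
    max_cost = max(m.cost_per_1k_tokens for m in candidates)

    def score(m: ModelProfile) -> float:
        return (w_latency * (1 - m.p95_latency_ms / max_latency)
                + w_cost * (1 - m.cost_per_1k_tokens / max_cost)
                + w_accuracy * m.accuracy)

    return max(candidates, key=score)


# Illustrative profiles only; a real router would pull these from live telemetry and evals.
fleet = [
    ModelProfile("small-model", p95_latency_ms=120, cost_per_1k_tokens=0.10, accuracy=0.78),
    ModelProfile("large-model", p95_latency_ms=450, cost_per_1k_tokens=0.90, accuracy=0.92),
]
print(route(fleet, latency_budget_ms=300).name)  # -> "small-model"
```

A production router would also need to weigh queue depth, context-length limits, and per-tenant policies, but the core trade-off looks like the scoring function above.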

Why Join Us

● Build foundational systems for a new AI infra company
● Solve extremely hard technical problems with massive impact
● Work directly with founders who understand the tech deeply
● Define the engineering culture and architecture
● Fast execution environment with ownership over entire systems
● Competitive salary + founder-level equity

How to Apply

Please include any examples of:
● Distributed systems architecture
● LLM inference or GPU-related work
● Large-scale backend or cloud systems design
● Leadership roles or architecture documents you’ve written
