Arc Exclusive

Mid-level Software Engineer (Generative AI Cloud Infrastructure) - Perm - US/UK/Europe

Location

Remote restrictions apply

Salary

US$60K - 130K

Min. experience

3 - 5 years

Required skills

Golang, Microservices, Kubernetes, Terraform, Ansible

Full-time role
Posted 8 hours ago
Actively recruiting / 8 applicants

We’re here to help you

Sole is in direct contact with the company and can answer any questions you may have.

Sole, Recruiter

Mid-Level Software Engineer – AI Cloud & LLM Infrastructure
Full-Time · Remote or Hybrid · Founding Team Opportunity

About Us

We are building a Gen AI Acceleration Cloud, an end-to-end platform for the full generative AI lifecycle. Our focus is to deliver fast LLM inference, scalable fine-tuning, and modern AI cloud infrastructure built on GPUs, SmartNICs/DPUs, and ultra-fast networking fabrics.

Our platform powers mission-critical workloads with:
● On-demand & managed Kubernetes clusters
● Slurm-based training clusters
● High-performance inference services
● Distributed fine-tuning and eval pipelines
● Global data centers & heterogeneous GPU fleets

We are looking for a junior-to-mid-level Software Engineer to design, build, and scale the core systems behind our AI cloud.

What You’ll Work On

AI Cloud Infrastructure

  • Develop and maintain reliable backend services running across cloud data centers.
  • Assist in building automation for GPU management, VM provisioning, and high-throughput storage systems.
  • Contribute to distributed systems and pipelines that support AI workloads.

LLM & GPU Virtualization Platform

  • Help build the software layer for GPU clusters with modern accelerators (H100, GB200, GB300).
  • Work on GPU virtualization and management (PCIe passthrough, MIG, SR-IOV) under guidance.
  • Support scaling and optimization of storage and data systems for AI training datasets.

Observability, Reliability & Automation

  • Contribute to monitoring and observability stacks (Prometheus, Grafana, OpenTelemetry).
  • Help implement automated node lifecycle management for distributed training and inference.
  • Assist in building testing frameworks for resiliency and fault tolerance.

Core Platform Engineering

  • Contribute to internal and open-source platform components.
  • Build developer tooling, SDKs, and documentation for platform services.
  • Support research and implementation for decentralized AI workloads under senior guidance.

Requirements

  • 2–5 years of production software engineering experience.
  • Proficiency in at least one backend language (Golang preferred; Python or Rust also valued).
  • Experience contributing to distributed systems or high-performance services.

Cloud & Systems Knowledge

  • Familiarity with cloud platforms (AWS, GCP, or Azure) and distributed microservices.
  • Understanding of concurrency, memory management, and high-performance I/O.
  • Exposure to system design and reliability concepts.

Infrastructure / DevOps Skills (Plus)

  • Experience with Kubernetes, Docker, or similar container orchestration.
  • Familiarity with Terraform, Ansible, CI/CD pipelines, and monitoring tools.

Virtualization & Compute (Optional / Nice to Have)

  • Exposure to GPU virtualization, CUDA, or distributed ML training stacks.
  • Basic understanding of hypervisors or PCIe passthrough.

Networking (Optional / Nice to Have)

  • Familiarity with VLAN/VXLAN, RDMA/InfiniBand, or high-performance networking concepts.

Responsibilities

  • Build and maintain backend and infrastructure components for AI workloads.
  • Collaborate with senior engineers on GPU clusters, storage systems, and virtualization platforms.
  • Assist in end-to-end service delivery from design to operation.
  • Contribute to testing frameworks and automation for reliability.
  • Work closely with cross-functional teams including ML engineers, product, and hardware teams.

Who You Are

  • A technically curious engineer who enjoys complex systems work.
  • Able to communicate ideas clearly and document work for others.
  • Motivated by building infrastructure that supports cutting-edge AI.
  • Collaborative, adaptable, and comfortable in a fast-moving startup environment.

© Copyright 2026 Arc