Senior/Staff Software Engineer - LLM Inference
Salary: €70K – €120K
Fully Remote Globally
Apollo Solutions has proudly partnered with an early-stage AI start-up backed by top venture capital. The company is building AI that's predictable and production-ready. Its remote-first team ships open-source tools for novel technology with strong community adoption, and it is well funded to move fast.
The role
Own step-function improvements in LLM inference for structured outputs. This is hands-on systems work where millisecond wins matter: cut latency, raise throughput, and drive down cost across real workloads.
What you’ll tackle
- Push and tune inference stacks (e.g., vLLM, SGLang, TensorRT) to unlock meaningful performance gains.
- Build single-node, multi-GPU pipelines; optimize communication with NCCL.
- Profile kernels and memory to remove bottlenecks and variance.
- Make structured generation fast, reliable, and easy to integrate across services and OSS.
- Harden deployments: observability, auto-scaling, fault tolerance, safe rollouts.
- Share learnings through docs, examples, and upstream contributions.
You’ll thrive here if you have
- Proven experience operating or extending inference engines (vLLM/SGLang/TensorRT).
- Distributed inference chops (multi-GPU on one host) and low-latency comms (NCCL).
- Hands-on NVIDIA GPU knowledge (CUDA, SMs, memory hierarchy).
- A record of measurable wins (e.g., 20%+ throughput gains from kernel/runtime optimizations).
- LLM MLOps background (monitoring, scaling, resilience for inference services).
- Strong Python; Rust curiosity or experience.
- Comfort with Docker, Kubernetes, and Linux internals.
Why this team
- Real frontier work: Structured generation is early; innovation is the default.
- Remote-first: Work from anywhere. Clear writing, intentional meetings.
- Fair package: Market-aligned comp for an early-stage startup + equity, health benefits, retirement plan (where applicable), and the hardware you need (GPUs included).
If you're interested, apply now!