Backend Engineer (APIs, Load Balancer, Scheduler)
Full-Time · Remote or Hybrid · High-Impact Role
Odyn builds transformative AI solutions on cutting-edge, high-performance infrastructure. We're seeking a Backend Engineer to architect the APIs, load balancers, and workload schedulers at the core of our GPU infrastructure platform.
Build Core Backend Systems & Load Balancing
● Architect RESTful and GraphQL APIs for model inference, fine-tuning, and resource
management.
● Build API gateways handling authentication, rate limiting, routing, and multi-tenant isolation.
● Develop intelligent load balancers distributing inference requests across GPU clusters based on latency, cost, and availability, with health checking and circuit breakers (a minimal sketch follows this list).
● Implement token streaming, batching, and request queuing for LLM inference workloads.
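To make the load-balancing bullet concrete, here is a minimal Go sketch of latency-aware backend selection with a per-backend circuit breaker. All names and numbers in it (the Backend type, the three-failure trip threshold, the 30-second cool-off) are illustrative assumptions, not Odyn's actual design.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sync"
	"time"
)

// Backend tracks a latency estimate and recent failures for one GPU
// inference endpoint. (Hypothetical type for illustration.)
type Backend struct {
	URL       string
	mu        sync.Mutex
	ewmaMs    float64   // exponentially weighted moving average latency
	failures  int       // consecutive failures seen
	openUntil time.Time // circuit open (backend excluded) until this time
}

// Record folds one request's outcome into the backend's state; three
// consecutive failures trip the circuit for a 30-second cool-off.
func (b *Backend) Record(latency time.Duration, err error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if err != nil {
		b.failures++
		if b.failures >= 3 {
			b.openUntil = time.Now().Add(30 * time.Second)
		}
		return
	}
	b.failures = 0
	const alpha = 0.2 // smoothing factor for the latency EWMA
	b.ewmaMs = alpha*float64(latency.Milliseconds()) + (1-alpha)*b.ewmaMs
}

// Pick returns the healthy backend with the lowest estimated latency,
// skipping any backend whose circuit is currently open.
func Pick(backends []*Backend) (*Backend, error) {
	var best *Backend
	bestMs := math.MaxFloat64
	now := time.Now()
	for _, b := range backends {
		b.mu.Lock()
		open := now.Before(b.openUntil)
		ms := b.ewmaMs
		b.mu.Unlock()
		if open {
			continue
		}
		if ms < bestMs {
			bestMs, best = ms, b
		}
	}
	if best == nil {
		return nil, errors.New("no healthy backends")
	}
	return best, nil
}

func main() {
	a := &Backend{URL: "http://gpu-a:8000", ewmaMs: 12}
	c := &Backend{URL: "http://gpu-b:8000", ewmaMs: 40}
	if best, err := Pick([]*Backend{a, c}); err == nil {
		fmt.Println("routing to", best.URL) // -> http://gpu-a:8000
	}
}
```

A production balancer would layer on active health-check probes, weighted scoring across cost and availability, and a half-open retry state; the EWMA-plus-trip loop above is just the core idea.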
Design Workload Schedulers & Resource Orchestration
● Design GPU resource allocation schedulers with bin-packing algorithms optimizing for topology (NVLink, PCIe, NUMA), workload type, and cost (a toy packer is sketched after this list).
● Implement preemption, checkpointing, and graceful eviction for multi-tenant environments.
● Integrate with Kubernetes schedulers, custom operators, or standalone orchestration
systems.
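As a sketch of the bin-packing bullet above: a best-fit-decreasing packer in Go that assigns each job to the feasible node with the least leftover GPUs, a crude stand-in for real topology scoring. Node, Job, and Pack are hypothetical names, not an existing API.

```go
package main

import (
	"fmt"
	"sort"
)

// Node and Job are deliberately minimal: a real scheduler would carry
// NVLink/PCIe/NUMA topology, workload type, and cost per node.
type Node struct {
	Name     string
	FreeGPUs int
}

type Job struct {
	ID   string
	GPUs int
}

// Pack places jobs largest-first (best-fit decreasing), choosing the
// feasible node with the fewest free GPUs so large topology-sensitive
// jobs keep whole nodes available.
func Pack(jobs []Job, nodes []*Node) map[string]string {
	placement := map[string]string{}
	sort.Slice(jobs, func(i, j int) bool { return jobs[i].GPUs > jobs[j].GPUs })
	for _, job := range jobs {
		var best *Node
		for _, n := range nodes {
			if n.FreeGPUs < job.GPUs {
				continue // not enough capacity on this node
			}
			if best == nil || n.FreeGPUs < best.FreeGPUs {
				best = n // tightest fit so far
			}
		}
		if best != nil {
			best.FreeGPUs -= job.GPUs
			placement[job.ID] = best.Name
		}
	}
	return placement
}

func main() {
	nodes := []*Node{{"node-a", 8}, {"node-b", 4}}
	jobs := []Job{{"train-1", 4}, {"infer-1", 1}, {"infer-2", 2}}
	fmt.Println(Pack(jobs, nodes))
}
```

A real scheduler would score NVLink domains and NUMA locality rather than raw free-GPU counts, and would route unplaceable jobs into a queueing or preemption path instead of silently skipping them.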
Ensure Reliability, Performance & Collaboration
● Build observability systems (Prometheus, Grafana, OpenTelemetry) tracking API
performance, load balancer health, and scheduler efficiency.
● Define and monitor SLOs for latency, throughput, and availability (see the instrumentation sketch after this list).
● Partner with infrastructure, ML, and product teams to optimize systems and shape developer-facing features.
● Participate in on-call rotation supporting production systems.
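For the SLO bullet, here is a minimal sketch of what "define and monitor" can look like in code, using the real prometheus/client_golang library; the metric name, labels, and /infer handler are made up for illustration.

```go
package main

import (
	"math/rand"
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Histogram of end-to-end inference latency; metric and label names are
// illustrative, not an actual schema.
var inferenceLatency = prometheus.NewHistogramVec(
	prometheus.HistogramOpts{
		Name:    "inference_request_duration_seconds",
		Help:    "End-to-end inference request latency in seconds.",
		Buckets: prometheus.DefBuckets,
	},
	[]string{"model", "status"},
)

func main() {
	prometheus.MustRegister(inferenceLatency)

	http.HandleFunc("/infer", func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		// A real handler would route to the load balancer / GPU backend;
		// here we simulate work so the histogram has data.
		time.Sleep(time.Duration(rand.Intn(50)) * time.Millisecond)
		inferenceLatency.WithLabelValues("llama-70b", "ok").
			Observe(time.Since(start).Seconds())
	})

	// Prometheus scrapes this endpoint; SLO alerts query the result.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}
```

A latency SLO can then be tracked with a query like histogram_quantile(0.99, sum(rate(inference_request_duration_seconds_bucket[5m])) by (le)) and alerted on via burn-rate rules.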
What We're Looking For
● 4–7+ years of backend, distributed systems, or infrastructure engineering experience.
● Strong programming skills in Python, Go, or Rust; deep API development experience (REST/GraphQL).
● Proven distributed systems expertise: load balancing, service discovery, failover,
microservices, message queues.
● Production cloud experience (AWS/GCP/Azure); Kubernetes and Docker proficiency.
● Strong networking fundamentals (TCP/IP, DNS, TLS, HTTP); SQL/NoSQL database skills.
Nice to Have
● GPU-accelerated workloads or HPC scheduling experience.
● LLM inference frameworks (vLLM, TensorRT-LLM) or Kubernetes scheduling internals.
● API gateways (Kong, NGINX) or service meshes (Istio); infrastructure-as-code (Terraform,
Helm).
● Observability tools (Jaeger, Datadog); high-throughput systems (100k+ req/sec); AI
infrastructure startup experience.
Why Join Odyn
● Work at the frontier of AI infrastructure, building systems that power next-generation AI applications.
● Own critical components from the ground up.
● Collaborate with world-class teams.
● Competitive compensation + remote flexibility.
If you have experience building high-performance APIs, distributed schedulers, or load balancers for
cloud infrastructure, we strongly encourage you to apply.