At Nucleus, performance is not a nice-to-have. It shapes product quality, infrastructure efficiency, and what is possible at scale.
We’re hiring a Software Engineer, Caching & Performance to design and operate the systems that make our APIs and products fast, efficient, and cost-effective. This role is focused on the deep infrastructure work behind latency, throughput, and platform responsiveness—from distributed caching layers to request optimization and performance tooling.
You will help define how Nucleus serves high-scale AI workloads with speed and discipline, ensuring that performance improvements are not one-off wins, but part of the architecture itself.
What you’ll do
- Design and build caching layers and performance infrastructure that improve latency, throughput, and cost efficiency across Nucleus products and APIs.
- Develop systems for intelligent caching of model responses, intermediate computations, metadata, and responses from frequently accessed services.
- Analyze performance bottlenecks across backend services, storage layers, and request paths, then implement durable improvements.
- Partner with product, backend, and platform teams to optimize critical user-facing and developer-facing workflows.
- Build tooling, benchmarks, and observability systems to detect performance regressions and guide optimization work.
- Improve API responsiveness and infrastructure efficiency through better request routing, batching, concurrency, and data access patterns.
- Help define strategies for cache consistency, invalidation, precomputation, and multi-layer performance architecture.
- Contribute to reliability and incident response for high-scale infrastructure where performance and availability are tightly coupled.
What we’re looking for
- Strong experience building backend systems, platform services, or infrastructure with meaningful scale and performance requirements.
- Deep understanding of caching concepts, performance tradeoffs, and distributed systems behavior.
- Experience with technologies such as Redis, Memcached, CDNs, edge caching, or custom caching frameworks.
- Familiarity with profiling, benchmarking, and production performance analysis across services and APIs.
- Strong programming skills in languages such as Go, Java, Rust, Python, or C++.
- Experience optimizing data-intensive or latency-sensitive systems in production.
- A thoughtful approach to system design, especially where speed, correctness, and cost must all be balanced.
- Curiosity about AI systems and the unique performance challenges that emerge in model-driven products.
Why Nucleus
Nucleus is building AI products and platforms that must perform well under real-world complexity and scale. That means engineering for speed and efficiency at every layer—not as an afterthought, but as a core product capability.
In this role, you’ll work on the invisible systems that make everything feel faster, smoother, and more scalable. Your work will directly shape user experience, infrastructure economics, and the operating profile of our products.
If you’re energized by hard performance problems and want your work to matter across an entire platform, we’d love to hear from you.