Software Engineer - Distributed Systems
Rilla Network
w: rilla.network
Remote (NZ)
TL;DR
Rilla Network is seeking a Distributed Systems Engineer to build and operate our large-scale, resilient infrastructure for P2P media streaming. You'll master Kubernetes, implement MLOps, and establish comprehensive observability to ensure our network's performance, reliability, and scalability.
About Rilla Network
Rilla Network is at the forefront of revolutionising P2P media streaming. We're tackling complex combinatorial optimisation problems at scale, leveraging advanced techniques like Reinforcement Learning and graph algorithms to optimise network topology for live P2P streams. Our mission is to ensure seamless, high-performance, and resilient real-time media delivery in a distributed and decentralised environment. We are an innovative startup building core infrastructure that redefines streaming efficiency and reliability.
The Opportunity
We're looking for an experienced Distributed Systems Engineer to design, build, and operate the foundational infrastructure that powers Rilla Network. This critical role involves orchestrating services with Kubernetes, implementing robust MLOps practices for our machine learning components, and establishing deep observability to ensure system health and performance. You'll solve complex challenges related to scalability, resilience, and data pipelines in a cutting-edge decentralised environment.
What You'll Do
- Design, implement, and maintain scalable, resilient distributed systems for Rilla's network and media streaming infrastructure.
- Orchestrate and manage containerised services using Kubernetes, ensuring high availability and fault tolerance.
- Implement and optimise MLOps practices for deploying, managing, and monitoring machine learning components within our services.
- Establish and evolve comprehensive observability solutions using tools like OpenTelemetry to ensure deep system health, performance, and reliability.
- Ensure the reliability, security, and performance of our distributed architecture and data pipelines.
- Collaborate with cross-functional teams to integrate solutions into our products and optimise end-to-end system flow.
What We're Looking For
- Extensive experience designing, building, and operating large-scale distributed systems.
- Proficiency with Kubernetes and cloud-native orchestration technologies.
- Practical experience with MLOps principles and tools for production AI/ML systems.
- Demonstrated expertise in observability (monitoring, logging, tracing) with tools like OpenTelemetry.
Ideal Qualities
- Experience working with cloud platforms such as AWS (ECS, Lambda).
- Experience with Rust, TypeScript, and Python (strongly typed).
- Experience working in a startup environment in a tech lead or team lead capacity.
- Experience working with distributed and decentralised systems at scale.
- Passion for building highly reliable and scalable infrastructure from the ground up.
- Experience building distributed systems, ideally using actors, clustering, consensus algorithms (e.g. Raft), and CRDTs.
Success Looks Like
You will build and maintain a distributed system that is highly available, scalable, and observable, consistently supporting our high-performance P2P media streams. Your MLOps implementations will enable efficient deployment and monitoring of our AI components, ensuring our network optimiser functions flawlessly and reliably at scale.
Career Growth
This role offers significant growth opportunities into architectural leadership, SRE specialisation, or lead positions as the distributed systems team expands. You'll be instrumental in shaping our infrastructure roadmap and adopting cutting-edge cloud-native technologies in a rapidly evolving, high-impact domain.
Why Join Us?
- Build Foundational Infrastructure: Design and implement the core distributed systems for a cutting-edge P2P network.
- Master Modern Tech: Deep dive into Kubernetes, MLOps, and OpenTelemetry in a real-world, high-scale application.
- High Impact: Directly ensure the reliability, performance, and scalability of live media streams for a growing user base.
- Startup Innovation: Contribute to a dynamic environment solving complex, never-before-solved problems.
- Equity Opportunity: Share directly in the success of a truly disruptive technology.
How to Apply
If you're a distributed systems expert passionate about scale, reliability, and innovative infrastructure, we want to hear from you. Please apply with your resume and a brief explanation of why you're a great fit for Rilla Network.
Note: You must be based in NZ and have the right to work in NZ to be considered for this role.