Role Overview
We are seeking a skilled Machine Learning Engineer with a focus on systems and infrastructure engineering to support our embodied AI research and production systems. You will be responsible for building and maintaining the robust, high-performance infrastructure that underpins our robotics software stack, data infrastructure, and machine learning training platform.
Responsibilities
- Robotics Systems Software: Design and implement low-level robotics services, real-time control protocols, and sensor integration layers. Work directly with hardware to ensure deterministic, high-throughput performance on Linux-based robotics platforms.
- ML Platform & DevOps: Architect and operate our training infrastructure, including Kubernetes-based HPC clusters, multi-tenant GPU orchestration, distributed training job scheduling, and model deployment pipelines. Transition research code to production readiness and maintain efficient compute operations at scale.
- AI Training Automation: Convert research prototypes into automated, monitored, and reproducible training pipelines. Manage experiments, checkpoints, and artifacts, and build tooling that speeds up research while preserving production reliability.
- Data Storage: Develop and maintain large-scale data ingestion systems to capture and track multimodal robotics data, ensuring reliability, versioning, and reproducibility across terabytes of data.
- Data Pre-processing: Design pipelines for preprocessing multimodal robotics data to feed into training systems, handling data transformation, normalization, and quality assurance to provide clean inputs for model training.
Required Skills
- Proficiency in at least one systems language such as C, C++, or Rust, and fluency in Python.
- Extensive experience with Linux systems programming and POSIX APIs, including working under real-time constraints.
- Proven experience in building data pipelines or infrastructure at scale.
- Familiarity with Kubernetes, distributed systems, and HPC environments.
Nice to Have
- Knowledge of JavaScript or Go.
- Ability to bridge the gap between algorithms and production systems.
- Strong debugging skills across various layers including firmware, OS, networking, and applications.
- A keen interest in working within a fast-paced research environment that supports breakthroughs in robotic intelligence.