Job Summary:
Serve Robotics is reimagining urban deliveries with its sidewalk delivery robot, which aims to make deliveries more efficient while reducing street congestion. The company is seeking a highly skilled ML Performance Engineer to bridge the gap between machine learning research and real-time deployment, ensuring advanced ML models operate efficiently on edge hardware.
Responsibilities:
• Own the full lifecycle of ML model deployment on robots—from handoff by the ML team to full system integration.
• Convert, optimize, and integrate trained models (e.g., PyTorch/ONNX/TensorRT) for Jetson platforms using NVIDIA tools.
• Develop and optimize CUDA kernels and pipelines for low-latency, high-throughput model inference.
• Profile and benchmark existing ML workloads using tools like Nsight, nvprof, and TensorRT profiler.
• Identify and remove compute and memory bottlenecks for real-time inference.
• Design and implement strategies for quantization, pruning, and other model compression techniques suited for edge inference.
• Ensure models are robust to the resource constraints of real-time, low-power robotic systems.
• Manage memory layout, concurrency, and scheduling for optimized GPU and CPU usage on Jetson devices.
• Build benchmarking pipelines for continuous performance evaluation on hardware-in-the-loop systems.
• Collaborate with QA and systems teams to validate model behavior in field scenarios.
• Work closely with ML researchers to influence model architectures for edge deployability and provide technical guidance on the feasibility of real-time ML models in the robotics stack.
Qualifications:
Required:
• Bachelor’s degree in Computer Science, Robotics, Electrical Engineering, or equivalent field.
• 3+ years of experience deploying ML models on embedded or edge platforms (preferably robotics).
• 2+ years of experience with CUDA, TensorRT, and other NVIDIA acceleration tools.
• Proficient in Python and C++, especially for performance-sensitive systems.
• Experience with NVIDIA Jetson (e.g., Xavier, Orin) and edge inference tools.
• Familiarity with model conversion workflows (e.g., PyTorch → ONNX → TensorRT).
Preferred:
• Master’s degree in Computer Science, Robotics, Electrical Engineering, or equivalent field.
• Experience with real-time robotics systems (e.g., ROS2, middleware, safety-critical constraints, and embedded Linux systems).
• Knowledge of performance tuning under thermal, power, and memory constraints on embedded devices.
• Experience with model quantization (e.g., INT8), sparsity, and latency-aware model design.
• Contributions to open-source ML or CUDA projects.
Company:
Serve Robotics is an autonomous robotic delivery company that develops AI-powered sidewalk delivery robots. Founded in 2021, the company is headquartered in Los Angeles, California, USA, with a team of 51-200 employees. The company is currently publicly traded. Serve Robotics has a track record of offering H1B sponsorships.