We are seeking a software engineer to join a team building high-performance AI infrastructure. In this role, you'll work on tools that extract, execute, and analyze AI/ML operators across various hardware platforms. You'll develop systems for profiling, collecting, storing, and querying performance data from large-scale AI workloads. Your work will directly impact how next-generation AI models are deployed and optimized across platforms.
Responsibilities:
- Extract and run AI/ML operators (e.g., PyTorch, Triton) on various hardware platforms
- Profile operator performance and collect metrics such as latency and memory usage (see the sketch after this list)
- Design and maintain scalable databases to store and query performance data
- Build programmatic and web interfaces for analysis
- Integrate tools with CI pipelines and testing frameworks
- Collaborate with cross-functional teams on AI model deployment and evaluation
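To give a sense of the profiling and metrics-collection work described above, here is a minimal sketch using torch.profiler (which is backed by Kineto). The run_op function and the matmul workload are illustrative placeholders, not part of any actual toolchain; real operators would be extracted from production models and the results written to a database rather than printed.

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Placeholder operator; in practice this would be an extracted PyTorch or Triton kernel.
def run_op(x, y):
    return torch.matmul(x, y)

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(1024, 1024, device=device)
y = torch.randn(1024, 1024, device=device)

activities = [ProfilerActivity.CPU]
if device == "cuda":
    activities.append(ProfilerActivity.CUDA)

# Capture per-operator latency and memory usage.
with profile(activities=activities, profile_memory=True) as prof:
    run_op(x, y)

# Summarize results; in the real pipeline these rows would be stored
# in a database for later querying and analysis.
print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10))
```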
Required Skills:
- Strong Python programming experience
- Proficiency with PyTorch, CUDA, and Triton kernels
- Experience with performance profiling tools (e.g., Kineto, the PyTorch dispatcher)
- Database and SQL proficiency
- Familiarity with Linux and Bash scripting
- Exposure to large language models (e.g., LLaMA or similar)
- Experience with CI/CD and test automation workflows