Dice is the leading career destination for tech experts at every stage of their careers. Our client, Eclaro, is seeking the following. Apply via Dice today!
Our client is searching for a Senior Software Engineer II - AI Infrastructure to join their team.
This is a 3-month contract-to-hire opportunity.
Responsibilities include:
- Work with fellow teams to design, develop, and optimize the next generation of GPU infrastructure
- Work with customers and stakeholders to define and refine the infrastructure requirements needed to support their AI/ML workloads
- Work with infrastructure technical leaders to define infrastructure requirements to store, move, and manipulate large datasets
- Guide performance teams on industry-standard testing methodologies and help optimize for GPU fabric throughput
- Identify security improvements and drive review discussions with internal teams
- Work directly with individual engineering teams to deliver new infrastructure functions and technologies in support of the customer's AI/ML products
Qualifications/Skills:
- Experience delivering virtualized and/or bare metal GPU infrastructure
- Understanding of AI/ML workloads and overall industry trends
- Strong collaborator and consensus builder who can author and review design documentation
- Experience troubleshooting, analyzing, and debugging relevant virtualization stacks (kernel, KVM, QEMU)
- Experience as a software engineer / developer in a large-scale, distributed environment
- Experience writing secure, testable, and robust low-level code
- Deep understanding of operating systems, virtualization, and Linux internals
- Familiarity with related virtualization fundamentals, including networking datapath, containers, and data persistence layers
- A critical thinker dedicated to solving problems and delivering solutions
Technology Stack: Linux, LXC, Python, libvirt, KVM, QEMU, Ceph, VyOS, GPU network fabric
Software Tools: MAAS, Terraform, Chef, Elasticsearch, Git, GitHub Actions, GSuite, Jira, Slack, VictoriaMetrics