Senior Software Engineer (AI Infra/Networking)

Location

Remote restrictions apply

Salary Estimate

N/A

Seniority

Senior

Tech stacks

Network

Software Development

Cloud

Visa

U.S. visa required

Permanent role

6 days ago

Senior SWE – GPU/Networking/AI Infra – Up to 250K Base – Remote

This position is open to candidates working remotely in the United States or Canada.

Our client is a cloud technology company driving the next generation of AI infrastructure. They empower organizations to build and scale AI and ML solutions without the need for large in-house teams or heavy upfront infrastructure costs. Their global team of engineers works at the forefront of GPU cloud computing, supporting businesses across industries to solve complex, real-world problems.

The company operates with a flat structure, minimal bureaucracy, and a strong focus on ownership, speed, and technical excellence. Engineers work closely with customers and internal teams to design scalable solutions and influence product direction, with direct impact on how modern AI platforms are built and operated.

The Role

They are looking for someone to build the network automation and observability systems that power a global GPU fleet. This is a hands-on engineering role at the intersection of software and network infrastructure.

You will work with cutting-edge NVIDIA hardware that most engineers never get close to, and you will help design systems that are often redesigned within weeks, because that is the pace. If you thrive in environments where speed, autonomy, and real engineering ownership matter, this role is for you.

Responsibilities

Build and maintain the services and tools that keep their global network of thousands of GPU nodes running smoothly
Build tooling that sits between the network core and the cloud platform running on top
Create monitoring and alerting that gives the team clear visibility and helps resolve issues faster
Make network changes less risky through solid review processes and safeguards
Work closely with network engineers and SREs to turn day-to-day pain points into reliable internal tools

Tech & Skills Requirements

10+ years of professional software engineering experience, or equivalent practical background
Proficiency in Go, or a genuine willingness to switch to it; Python experience is also welcome
You don't need to be a network expert, but a genuine interest in infrastructure and networking is expected
Strong communication skills and the ability to work autonomously in a fast-paced, high-trust environment

Bonus Points For

Background in network engineering or SRE, with an understanding of operational realities and not just code
Experience at companies operating at hyperscale, such as Cloudflare, major cloud providers, or similar
Familiarity with Prometheus-compatible monitoring stacks (e.g. VictoriaMetrics) or large-scale telemetry systems
Exposure to Juniper or other vendor networking equipment
Comfort debugging OSS projects and contributing fixes across languages

Interview Process