SOLVACE

DevOps Engineer

Location: Remote (restrictions apply)
Salary Estimate: N/A
Seniority: N/A
Tech stacks: Kubernetes, Amazon, Microservices, +41
Permanent role | Posted 3 days ago

DevOps / Cloud Native Engineer - UK Based Remote Working (EU considered)

The Opportunity

This is a cloud-native platform engineering role at a pivotal moment. Solvace is in the middle of a major platform modernisation — migrating from Windows Server on Elastic Beanstalk to containerised .NET 8 on Kubernetes, consolidating onto EKS, and evolving a growing microservices fleet (40+ services) that needs service mesh, production-grade observability, and deployment automation.

The AI platform (KAI) has an EKS mesh project in active development, and the broader platform migration path moves from Elastic Beanstalk to ECS and ultimately to EKS as the unified compute layer. As the agent library and microservices fleet expand, the platform needs robust Kubernetes operations. Solvace serves global manufacturing clients and runs across a broad AWS surface area — this is a deep, technically challenging infrastructure role with high visibility.

This role’s focus is building new infrastructure capabilities — Kubernetes architecture, service mesh, CI/CD automation, observability — not maintaining legacy deployment scripts. The existing application code is .NET 8 (containerised with Linux base images for the Revamp platform), and the AI platform is Python on FastAPI. The infrastructure engineer collaborates with both the platform engineering and AI teams but owns the cloud-native layer independently.

Core Technical Requirements

  • Kubernetes (AWS EKS) — deep, production-grade experience. Must understand cluster management, pod autoscaling (HPA/VPA), resource quotas, namespace isolation for multi-tenant workloads, health checks, rolling deployments, and troubleshooting. The target architecture consolidates the AI platform and application services on EKS as the single compute platform
  • Service mesh (Istio or equivalent) — experience with traffic management, mutual TLS, observability, canary deployments, and circuit breaking. The platform is a growing microservices fleet that will benefit from mesh capabilities as it scales
  • AWS (broad and deep) — this is an AWS-native platform. Required experience across: EKS, ECS Fargate, RDS (SQL Server + Aurora PostgreSQL), Lambda, SQS/SNS, CloudFront, ALB, S3, ECR, Cognito, Secrets Manager, OpenSearch, ElastiCache, CodePipeline, CodeBuild, Inspector, CloudWatch
  • Terraform — the infrastructure is fully codified in Terraform (per-API modules with ECS task + ALB + CloudFront + Security Groups + IAM + Secrets Manager, remote state with S3 + DynamoDB locking). Must be proficient at writing, reviewing, and evolving Terraform at scale
  • CI/CD pipelines — GitHub Actions, AWS CodeBuild, AWS CodePipeline. Experience optimising build times, implementing artefact caching/reuse, parallelising builds, and integrating quality gates (SonarCloud, load testing, E2E testing)
  • Containerisation — Docker, multi-stage builds, image optimisation. The platform uses Linux base images for .NET 8 containers
  • Observability — Datadog (current APM + RUM + dashboards), with experience in structured logging, distributed tracing, alerting, and SLO definition
  • Linux / CLI proficiency — this is a Linux-native role. Must be deeply comfortable with Linux systems, shell scripting, and command-line debugging
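The namespace isolation, resource quotas, and HPA autoscaling named in the Kubernetes requirement could be sketched roughly as follows. This is illustrative only: the namespace, deployment name, and limits are assumptions, not Solvace's actual configuration.

```yaml
# Illustrative sketch: names and limits are assumptions, not real Solvace config.
apiVersion: v1
kind: Namespace
metadata:
  name: ai-agents            # one namespace per service tier (hypothetical)
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ai-agents-quota
  namespace: ai-agents
spec:
  hard:                      # caps aggregate usage for the tier
    requests.cpu: "20"
    requests.memory: 64Gi
    limits.cpu: "40"
    limits.memory: 128Gi
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kai-agent-hpa
  namespace: ai-agents
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kai-agent          # hypothetical deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Separate quotas per tier (AI agents vs. application APIs vs. background workers) keep a noisy workload in one namespace from starving the others.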

AI-Assisted Development Methodology

Solvace is transitioning towards AI-assisted development as a core engineering practice. For the DevOps/Cloud Native role:

  • Hands-on experience with AI coding tools — Claude Code, OpenAI Codex, GitHub Copilot, Cursor, or similar for writing Terraform modules, CI/CD configurations, and Kubernetes manifests
  • Spec-driven infrastructure — ability to write clear infrastructure specifications and use AI-assisted tools to generate, review, and iterate on Terraform, Helm charts, and pipeline configurations
  • Portfolio evidence — professional projects or side projects demonstrating AI-assisted infrastructure development. Contributions to or experimentation with emerging projects like OpenClaw are a strong signal
  • Automated testing and evaluation — experience building robust infrastructure testing (Terratest, conftest, OPA) and automated quality gates
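A minimal quality-gate workflow of the kind described above might look like this GitHub Actions sketch. The repository layout, job names, and policy directory are assumptions; only the tool invocations (terraform fmt/validate, conftest with OPA policies) reflect the stack named in the posting.

```yaml
# Hypothetical workflow: repo layout and step names are assumptions.
name: infra-quality-gates
on:
  pull_request:
    paths: ["terraform/**"]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - name: Terraform fmt and validate
        run: |
          terraform -chdir=terraform fmt -check -recursive
          terraform -chdir=terraform init -backend=false
          terraform -chdir=terraform validate
      - name: Policy check with conftest (OPA)
        # assumes Rego policies live in the default policy/ directory
        run: conftest test terraform/ --parser hcl2
```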

Nice-to-Have

  • Go or Rust — valued for building custom Kubernetes operators, CLI tools, or high-performance infrastructure components
  • Helm / Kustomize — Kubernetes package management for the growing microservices fleet
  • ArgoCD or Flux — GitOps deployment patterns for Kubernetes
  • Python deployment — the AI engine is a FastAPI application deployed to EKS; experience with Python container deployments is a plus
  • .NET container pipelines — understanding .NET SDK and runtime container images, multi-stage builds, and publish profiles
  • Database operations — RDS management, Aurora PostgreSQL, backup/restore strategies, migration tooling
  • Load testing integration — experience integrating Locust, k6, or similar into CI/CD pipelines with automated thresholds
  • Cost optimisation — AWS cost management, right-sizing, Savings Plans, and resource budgeting
  • Ansible — experience with configuration management and infrastructure automation
  • Security / DevSecOps — secrets management best practices, container image scanning, vulnerability management
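The ArgoCD/Flux GitOps pattern mentioned above can be sketched as a single Argo CD Application. Repo URL, paths, and target namespace are placeholders, not Solvace's setup.

```yaml
# Sketch only: repoURL, path, and namespace are placeholder values.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kai-platform
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/platform-manifests   # placeholder
    targetRevision: main
    path: apps/kai
  destination:
    server: https://kubernetes.default.svc   # in-cluster deployment
    namespace: ai-agents
  syncPolicy:
    automated:
      prune: true      # delete resources removed from Git
      selfHeal: true   # revert out-of-band changes
```

With automated sync, the Git repository becomes the single source of truth for cluster state, which pairs naturally with the Terraform-codified infrastructure described earlier.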

What You’ll Be Doing (First 6 Months)

  1. Design the EKS cluster architecture — consolidate application services and AI services onto a unified, well-managed EKS platform. Design namespace isolation for multi-tenant workloads, define resource quotas per service tier (AI agents vs. application APIs vs. background workers), and implement pod autoscaling policies
  2. Implement service mesh — as the microservices fleet grows, introduce Istio or equivalent for traffic management, mutual TLS between services, canary deployments, and distributed tracing across synchronous and asynchronous flows
  3. Integrate testing into CI/CD — load testing scenarios and E2E test suites exist but are not yet integrated into the deployment pipeline. Build them into the CI/CD workflow with automated quality gates: performance regression detection, E2E smoke tests on deployment, and code quality enforcement
  4. Optimise the build pipeline — implement build artefact caching/reuse, parallelise independent build stages, and reduce deployment cycle times. Audit and standardise all container definitions across the platform
  5. Strengthen deployment safety — implement robust rollback strategies for both application deployments and database migrations. Build confidence in the deployment process for the upcoming module launches
  6. Consolidate observability — establish a unified APM strategy, define SLOs for critical user journeys, and build dashboards that give the engineering team real-time visibility into deployment health and production performance
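The canary deployment pattern in step 2 could be expressed with an Istio VirtualService and DestinationRule along these lines. The service name, subsets, and 90/10 split are hypothetical.

```yaml
# Hypothetical canary split; host, subsets, and weights are assumptions.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: orders-api
spec:
  hosts:
    - orders-api
  http:
    - route:
        - destination:
            host: orders-api
            subset: stable
          weight: 90           # majority of traffic stays on stable
        - destination:
            host: orders-api
            subset: canary
          weight: 10           # small slice exercises the new version
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: orders-api
spec:
  host: orders-api
  subsets:
    - name: stable
      labels:
        version: v1
    - name: canary
      labels:
        version: v2
```

Shifting the weights gradually (and rolling back by setting canary to 0) gives the deployment-safety properties step 5 asks for, without redeploying pods.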

Research vs. Applied

Entirely applied / operational. This is a hands-on infrastructure engineering role. The person needs to be comfortable in production, pragmatic about solving real operational problems, and capable of working across the stack from Kubernetes cluster management to CI/CD pipeline design.

Why Join Us?

  • Modernise a production platform end-to-end — from traditional deployment to Kubernetes with service mesh. This is a complete cloud-native transformation
  • AI platform infrastructure — the KAI agent platform is on a planned migration path towards EKS; this person owns the infrastructure that serves production AI agents to manufacturing clients globally
  • Greenfield Kubernetes architecture — the EKS consolidation is early-stage; this person defines the cluster architecture, namespace strategy, deployment patterns, and service mesh configuration
  • Broad AWS surface area — EKS, ECS, RDS, Lambda, SQS/SNS, CloudFront, ALB, S3, Cognito, Secrets Manager, OpenSearch, ElastiCache, CodePipeline, CodeBuild, Inspector. This is a genuinely deep AWS role
  • Direct impact — infrastructure improvements directly improve platform performance and developer productivity. High visibility across the organisation
  • High visibility — the CEO and CPO recognise infrastructure as a critical investment area. This is a well-supported role, not an afterthought

