Arc Exclusive

AWS DevOps Engineer – AI Benchmarking (FT-WW

Location

Remote anywhere

Hourly rate

Min. experience

5+ years

Hours per week

40 hours

Duration

5 weeks

Required skills

AWS, Infrastructure as Code, Terraform, Python, DevOps

Language requirement

English・Professional

Freelance job
Posted 10 days ago
Actively recruiting / 33 applicants

We’re here to help you

Juliana Torrisi is in direct contact with the company and can answer any questions you may have.

Juliana Torrisi, Recruiter

Role Overview

We are seeking talented AWS DevOps Engineers to join our team in building a large-scale benchmark that tests the limits of leading AI models. This role is distinct from traditional DevOps positions as it focuses on designing complex, adversarial cloud infrastructure tasks. These tasks will challenge AI agents to solve difficult AWS problems with precision, security, and reliability.

Responsibilities

  • Design intricate and adversarial AWS infrastructure tasks to evaluate AI capabilities.
  • Develop realistic DevOps scenarios that include security constraints, dependencies, edge cases, and failure conditions.
  • Write precise task specifications that define desired infrastructure outcomes.
  • Create idempotent reference solutions using Terraform or other AWS Infrastructure-as-Code tools.
  • Develop automated graders and validation scripts in Python to assess AI agent performance.
  • Validate infrastructure through AWS APIs, CLI outputs, Terraform state, and system behavior.
  • Craft tasks that detect incomplete, unsafe, or superficially correct solutions.
  • Ensure tasks are challenging enough to thoroughly test advanced AI models.
  • Review and quality-check tasks from other engineers for accuracy and difficulty.
  • Document task intent, assumptions, expected outcomes, edge cases, and scoring rationale.
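To make the grading responsibilities above concrete, here is a minimal sketch of what an automated grader might look like. It is an illustration only, not the team's actual tooling. The task ("allow SSH only from 10.0.0.0/8"), the function name `grade_ssh_rule`, and the `GradeResult` type are hypothetical. The input dict is assumed to follow the shape of a security group entry as returned by the EC2 `DescribeSecurityGroups` API, and is passed in directly so the sketch runs without AWS credentials.

```python
from dataclasses import dataclass, field


@dataclass
class GradeResult:
    """Outcome of grading one submission (hypothetical type for this sketch)."""
    passed: bool
    findings: list = field(default_factory=list)


def _covers_ssh(permission: dict) -> bool:
    """Return True if an ingress permission includes TCP port 22."""
    protocol = permission.get("IpProtocol")
    if protocol == "-1":  # "-1" means all protocols and all ports
        return True
    if protocol != "tcp":
        return False
    return permission.get("FromPort", -1) <= 22 <= permission.get("ToPort", -1)


def grade_ssh_rule(security_group: dict, allowed_cidr: str = "10.0.0.0/8") -> GradeResult:
    """Grade a security group against the assumed task: SSH only from allowed_cidr."""
    findings = []
    ssh_rules = [p for p in security_group.get("IpPermissions", []) if _covers_ssh(p)]

    # Incomplete solution: nothing allows SSH at all.
    if not ssh_rules:
        findings.append("incomplete: no ingress rule covers TCP port 22")

    for rule in ssh_rules:
        # Superficially correct solution: port 22 is open, but to the wrong sources.
        for ip_range in rule.get("IpRanges", []):
            cidr = ip_range.get("CidrIp")
            if cidr != allowed_cidr:
                findings.append(f"unsafe: port 22 reachable from {cidr}")
        for ip_range in rule.get("Ipv6Ranges", []):
            findings.append(f"unsafe: port 22 reachable from {ip_range.get('CidrIpv6')}")

    return GradeResult(passed=not findings, findings=findings)


# A submission that opens SSH to the whole internet should fail.
superficial = {
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
    ]
}
result = grade_ssh_rule(superficial)
print(result.passed, result.findings)
```

Returning a list of findings rather than a single pass/fail flag lets the grader distinguish incomplete solutions from unsafe ones, which supports the scoring rationale a task author is expected to document.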

Required Skills

  • 4+ years of experience in DevOps, cloud infrastructure, platform engineering, or site reliability engineering.
  • Extensive hands-on AWS experience across networking, IAM, compute, storage, databases, security, and cloud operations.
  • Advanced expertise in Infrastructure-as-Code, particularly with Terraform.
  • Proficient in Python for building automated graders and validation tools.
  • Experience in testing and validating infrastructure through APIs, CLIs, or automated frameworks.
  • Strong understanding of secure, reliable, and reproducible infrastructure.
  • Ability to design complex technical problems with ambiguity and edge cases.
  • Attention to detail in identifying unsafe or incomplete solutions.
  • Excellent written communication and technical documentation skills.
  • Ability to work independently in a structured, task-based environment.

Nice to Have

  • Experience with Pulumi, AWS CDK, or AWS CloudFormation.
  • Familiarity with boto3, pytest, Terratest, LocalStack, or similar tools.
  • Background in security engineering, chaos engineering, incident response, or SRE.
  • Experience with AI evaluation, benchmarking, or red teaming.
  • Knowledge of AI coding agents and common AI model failure modes.
  • Experience designing technical assessments or infrastructure exercises.
  • Experience reviewing engineers' work and maintaining quality standards.

© Copyright 2026 Arc