ZeroRisk.io

Software Engineer – Support & Operations

Location

Remote restrictions apply

Salary Estimate

N/A

Seniority

N/A

Tech stacks

Java
Angular
Amazon
+28

Permanent role
a day ago

Role Overview

The Software Engineer – Support & Operations is a full-stack engineer whose primary focus is the stability, reliability, and recoverability of a production SaaS environment. This role requires genuine engineering capability: reading and reasoning about Java and Angular codebases, navigating AWS infrastructure, and tracing a production problem from a customer symptom through logs, code, and infrastructure to its root cause.

Alongside day-to-day investigation and fix work, this engineer actively builds and maintains the internal tooling and documentation that make the team faster and more effective over time. The right person is technically solid, methodical under pressure, and motivated by making complex problems understandable and preventable.

This role is well suited to an engineer with approximately three years of professional experience who is looking to deepen their full-stack skills in a high-feedback, production-facing environment.

Key Responsibilities

Production Investigation & Bug Fixing

  • Triage and investigate production issues — querying logs, correlating events across services, and identifying root causes rather than surface symptoms.
  • Navigate both the Angular front-end and Java back-end codebases to trace issues end to end, from a reported UI behaviour through to service logic and the data layer.
  • Implement targeted code fixes for confirmed bugs, ensuring all changes are covered by tests, submitted via pull request, and reviewed before merging.
  • Escalate fixes that require architectural change or touch high-risk areas of the codebase to the Senior Engineer before proceeding.
  • Produce clear post-incident summaries covering what happened, root cause, resolution, and steps being taken to prevent recurrence.

Internal Tooling

  • Build and maintain internal tools that help the team investigate and manage recurring production issues more efficiently — log query utilities, diagnostic dashboards, automation scripts, and similar.
  • Identify manual or repetitive investigation steps that are candidates for tooling and prioritise building solutions that save meaningful time across incidents.
  • Maintain existing tooling to ensure it remains accurate and useful as the platform evolves.

Technology & Tooling Evaluation

  • Proactively research and evaluate new tools and technologies that could improve operational efficiency — observability platforms, incident management tooling, log analysis tools, and similar.
  • Produce concise assessments of candidate tools covering capability, integration effort, cost, and recommendation, sharing findings with the Senior Engineer and Staff Engineer.
  • Stay current with relevant developments in the SaaS operations and observability space.

AWS Environment

  • Navigate the AWS environment to support production investigations — reviewing logs, metrics, and infrastructure state to identify environment-level contributors to issues.
  • Work with the DevOps function where infrastructure changes are required as part of issue resolution.

Playbooks & Documentation

  • Build and maintain a library of debugging playbooks and how-to guides covering common production issues — step-by-step enough that any engineer can follow them.
  • Update playbooks after every significant incident to incorporate new learnings.
  • Identify gaps in the playbook library and prioritise filling them based on incident frequency and impact.

Skills & Experience

Essential

  • Approximately three years of professional software engineering experience with practical exposure to both Java and Angular.
  • Experience debugging issues in a cloud-hosted or SaaS environment, and comfort working with incomplete information under time pressure.
  • Working knowledge of AWS and confidence navigating cloud infrastructure for investigative purposes.
  • Strong log analysis skills — able to construct queries, correlate events across services, and draw diagnostic conclusions.
  • Clear written communication — able to explain a production issue and its resolution to both engineers and non-technical stakeholders.
  • Familiarity with Git-based workflows and standard code review practices.

Desirable

  • Experience building internal tooling or automation to support operational workflows.
  • Exposure to observability or incident management platforms such as Datadog, PagerDuty, or Grafana.
  • Understanding of relational database query analysis — able to identify slow or problematic queries as part of an investigation.
  • Familiarity with containerised deployment environments.

Ways of Working

  • Diagnose before fixing. Understanding the root cause before touching code, and documenting that understanding, is what reduces incidents over time instead of merely managing them indefinitely.
  • Every incident is a learning opportunity. A playbook update or post-incident note is part of the job, not an optional extra.
  • Build things that scale. Recurring manual investigation steps are a signal to build a tool, not a reason to repeat the same steps indefinitely.
  • Escalate early. If an investigation is not progressing, raise it to the Senior Engineer promptly. A delayed escalation is a worse outcome than an early one.

Reporting Structure

Reports to the Scrum Master / Team Manager. Works closely with Senior Engineers for code review and escalation of complex fixes. Escalates systemic issues to the Staff Engineer.

This role offers a clear development path toward a Senior Engineer position for engineers who broaden their full-stack and infrastructure skills, or toward a specialist Site Reliability Engineer (SRE) track for those with a stronger infrastructure and observability focus.

