ClearScale

Site Reliability Engineer

Location

Remote restrictions apply

Salary Estimate

N/A

Seniority

N/A

Tech stacks

Cloud
Amazon
DevOps

Permanent role
18 hours ago

Site Reliability Engineer

Drive Innovation and Transformation with ClearScale's Cloud Expertise:

ClearScale, a leading AWS Premier Consulting Partner, empowers businesses to unlock the full potential of the cloud through a wide range of services, including cloud consulting, architecture design, migration, automation, application development, and managed services. We help Fortune 500 enterprises, mid-sized businesses, and startups across diverse industries such as Healthcare, Finance, and Technology succeed with ambitious and transformative cloud projects. Our expertise lies in architecting, developing, and launching innovative and sophisticated solutions using the latest cloud technologies. Due to our continued growth and the increasing demand for our modernization and cloud-native development capabilities, we are seeking a talented and experienced AWS Hosted/Modernization Software Engineer to join our dynamic team. If you are passionate about building and modernizing applications on the AWS platform, tackling complex engineering challenges, and working with a team of top-tier cloud experts, this is your opportunity to make a significant impact.

What You'll Do:

  • Execute on Observability Strategy
  • Define and document standards for logging, tracing and SLO definitions for engineering teams to follow
  • Propose effective ways to manage dashboards, traces, monitors, metrics and logs in Datadog
  • Integrate Datadog with incident management tools and Slack
  • Establish comprehensive monitoring using Datadog
  • Centralize logging and develop mechanisms for efficient debugging
  • Implement systems for distributed tracing visualization
  • Adopt OpenTelemetry standards across microservices
  • Roll out observability to development and production environments in close collaboration with engineering and operations teams
  • Define training practices for engineering teams to adopt observability standards and operational practices for healthy and sustainable incident management processes
  • Implement POCs and demonstrate such constructs to engineering teams
  • Introduce engineering practices for healthy alerting mechanisms, dashboard definitions, and blind-spot elimination, with a focus on eliminating alert fatigue
  • Establish near-real-time reporting to minimize MTTA and MTTR and improve developer experience

What You'll Bring:

  • Extensive experience with AWS infrastructure at scale
  • Experience working in SRE, DevOps or Developer Experience teams in engineering organizations is a must
  • Deep knowledge of observability tooling (Datadog, Grafana, Splunk, OTEL) and hands-on experience developing, extending, and operating it across different environments, including high-load production systems
  • Expert knowledge of Terraform
  • Ability to propose solutions that scale across engineering teams and balance speed of response with cognitive load
  • Experience leading incident responses utilizing operational tools including logging, tracing, SLO patterns and synthetics
  • Experience establishing technical roadmaps from operational strategies for SRE, DevOps or Developer Experience teams in mid- to large-sized organizations, and the ability to drive their adoption in the engineering teams
  • Experience applying analytical practices to define SLAs in close coordination with engineering teams and stakeholders
  • Deep understanding and experience advocating for and rolling out SRE best practices and standards for engineering teams
  • Mindset of "minimal tooling for maximum impact"
  • Experience with on-call rotations, creating and executing scalable practices in engineering teams
  • Experience with integrating observability tooling with Teams and Slack
  • Leadership skills to drive alignment between different departments and get buy-in from different stakeholders
  • Exemplary oral and written communication skills for technical and non-technical stakeholders
  • AWS certifications are a plus

Our Commitment to Your Growth and Well-being:

  • Competitive salary
  • Exceptional opportunities for career growth and leadership development within a leading AWS Premier Consulting Partner.
  • A collaborative, high-energy, and fully remote work culture that fosters connection and innovation.
  • Continuous learning and development opportunities, including access to training and certifications.
  • The flexibility and convenience of a 100% distributed workforce – work from the location that suits you best!

About ClearScale

👥201-500
📍San Francisco, CA

How does ClearScale work?

The company designs, builds, integrates, and manages complex infrastructures and applications on AWS exclusively. ClearScale has successfully delivered more than 1,000 cloud projects for clients ranging from startups to large enterprises and public sector organizations. Our core competency is delivering custom cloud projects and services for clients who have limited cloud experience on staff or who need additional resources. We leverage the best cloud technology available to provide a solution that is unique to your project requirements. Whether this is your first project in the cloud or one of many, ClearScale has the expertise to handle your most complex requirements.

Company culture

Operational Efficiency

Leverage automation, standardize deployments, and reduce your overall cloud costs.

Business Agility

Develop, deploy, and scale new offerings faster to stay ahead of market trends.

© Copyright 2025 Arc