Job Title: Technical Program Manager – Observability Tools
Location: Remote
Position Type: Contract
Job Description:
We are seeking a strategic and technically proficient Technical Program Manager – Observability Tools to lead the evaluation, rationalization, and unification of observability and monitoring platforms (e.g., Datadog, Dynatrace, AppDynamics, Splunk, Prometheus).
This role requires extensive experience in conducting tools assessments, managing RFP/RFI processes, and working cross-functionally to align technical capabilities with business objectives. The ideal candidate will drive the creation of a centralized and cost-effective observability ecosystem, enhancing visibility, performance, and operational efficiency across infrastructure, applications, and business services.
Key Responsibilities:
• Lead the end-to-end evaluation of monitoring and observability tools, comparing features, scalability, ease of integration, licensing models, and vendor support.
• Work with Procurement, Legal, Security, and Architecture teams to prepare, issue, and evaluate RFPs/RFIs for observability solutions.
• Develop and execute proof-of-concept (POC) testing plans with shortlisted vendors to validate technical capabilities against business use cases.
• Conduct TCO/ROI analysis to support decision-making for long-term platform adoption.
• Create and own the tools rationalization roadmap, phasing out redundant platforms and consolidating onto strategic solutions.
• Define and govern standard observability practices (dashboards, alerting, logging, tracing, health checks, etc.).
• Lead the migration of monitoring configurations from legacy tools to new platforms, ensuring minimal disruption and maximum value.
• Work with DevOps, SRE, Infrastructure, and Application teams to support platform integration and adoption of best practices.
• Facilitate workshops and discovery sessions with key technical and business stakeholders to gather requirements.
• Present vendor findings and platform strategy to executive leadership and governance boards for approval.
• Collaborate with Finance and Procurement on contract negotiation and licensing optimization.
• Define KPIs and reporting frameworks to track platform adoption, incident resolution efficiency, and observability coverage.
Required Qualifications:
• 8+ years of experience in IT operations, DevOps, SRE, or observability engineering.
• Proven experience leading tool selection and consolidation efforts, especially for observability, APM, and logging platforms.
• Demonstrated involvement in RFP/RFI creation, vendor evaluation, and contract negotiation processes.
• Hands-on experience with platforms such as Datadog, Dynatrace, Splunk, Prometheus, AppDynamics, the ELK stack, and Grafana.
• Strong understanding of monitoring architectures, service-level indicators (SLIs), and alerting frameworks.
• Experience with cloud-native environments (AWS, Azure, GCP) and container platforms (Kubernetes).
• Proficiency in scripting or automation (e.g., Python, Terraform, Ansible) for tool configuration and integration.
• Excellent communication and presentation skills, including experience working with C-level stakeholders.
Preferred Qualifications:
• Technical certifications in observability tools (e.g., Datadog Certified, Dynatrace Professional).
• Familiarity with ITIL and with incident, problem, and change management processes.
• Experience consolidating tools in large-scale enterprise environments.
• Prior involvement in cost optimization or FinOps initiatives related to tooling.