Job Description
REQUIREMENTS:
- Must-have skills: Cloud, Terraform, K8s, Helm, and one tool among NewRelic, Datadog, and Dynatrace. Experience configuring and creating integrations in any of NewRelic, Datadog, or Dynatrace.
- Hands-on expertise in at least one observability tool (Datadog, Dynatrace, NewRelic).
- Strong knowledge of infrastructure monitoring, APM, logs, metrics, traces, and dashboards.
- Experience with other observability tools, Observability-as-Code (Terraform), AIOps (anomaly detection, predictive insights), and SRE.
- Integrate observability practices to improve system reliability and performance.
- Collaborate with observability engineers to ensure the development and implementation of robust monitoring solutions.
- Enhance system reliability by designing and implementing automated monitoring and alerting processes.
- Bring experience from system administration, DevOps, or Site Reliability Engineering (SRE) roles.
- Expertise in observability tools such as NewRelic, Datadog, or Dynatrace.
- Good knowledge of system architecture, infrastructure as code (Terraform, Ansible), and cloud environments.
- Hands-on experience with CI/CD pipelines and tools such as Jenkins and GitLab.
- Demonstrate expertise in incident management, root cause analysis, and the use of observability tools.
- Strong skills in containerization and orchestration technologies such as Docker and Kubernetes.
RESPONSIBILITIES:
- Understanding the client’s business use cases and technical requirements and converting them into a technical design that elegantly meets the requirements.
- Mapping decisions to requirements and communicating them to developers.
- Identifying different solutions and narrowing them down to the option that best meets the client’s requirements.
- Defining guidelines and benchmarks for non-functional requirement (NFR) considerations during project implementation.
- Writing and reviewing design documents that explain the overall architecture, framework, and high-level design of the application for the developers.
- Reviewing architecture and design on various aspects such as extensibility, scalability, security, design patterns, user experience, and NFRs, and ensuring that all relevant best practices are followed.
- Developing and designing the overall solution for defined functional and non-functional requirements, and defining the technologies, patterns, and frameworks to realize it.
- Understanding and relating technology integration scenarios and applying these learnings in projects.
- Resolving issues raised during code review through exhaustive, systematic root cause analysis, and being able to justify the decisions taken.
- Carrying out proofs of concept (POCs) to ensure that the suggested designs and technologies meet the requirements.
Qualifications
Bachelor’s or master’s degree in Computer Science, Information Technology, or a related field.