What You'll Build
Multi-agent LLM orchestration using LangGraph / LangChain for autonomous incident response
RAG pipelines over runbooks, topology graphs, and observability telemetry (Pinecone / pgvector)
Knowledge graph-based topology reasoning using Neo4j
Integration with OpenTelemetry, Prometheus, Grafana, and Datadog
Auto-PR generation for Terraform, Helm, and Kubernetes manifests
Bidirectional ServiceNow / Jira integration via an MCP (Model Context Protocol) server
Required Skills
LangGraph or LangChain (multi-agent orchestration, not just chatbots)
Kubernetes, Helm, Argo CD, or Terraform
Python (FastAPI preferred), including async programming and production-grade code
OpenTelemetry or a similar observability stack
RAG pipeline design (chunking, retrieval, reranking)
3+ years of relevant experience; IIT/IIIT/NIT background preferred
Bonus Skills
Neo4j or graph databases
TimescaleDB or InfluxDB (time-series anomaly detection)