Key Responsibilities:
Architect observability, on-call setup, runbooks & chaos drills
Define and enforce SLOs & error budgets
Improve incident response and automate recovery
Implement tooling and a clear Dev-SRE handoff model
Tech Stack:
Prometheus, Grafana, ELK, OpenTelemetry, Kubernetes, Terraform, Python, AWS/Azure/GCP
Requirements:
5+ years in SRE/DevOps/Production Engineering
Strong in observability, incident response, and infrastructure automation
Proven track record of running systems at scale
Job Types: Full-time, Contractual / Temporary, Freelance
Contract length: 12 months
Pay: ₹90,000.00 - ₹110,000.00 per month
Benefits:
- Work from home
Schedule:
- Monday to Friday
Work Location: Remote