Job Title: DevOps/SRE Lead
Location: Pune / Hyderabad
Job Type: Full-time
Experience: 15+ years
Job Overview:
We are seeking a highly experienced DevOps/SRE Lead with over 15 years of professional experience. The ideal candidate will possess a deep understanding of DevOps principles, extensive experience in Site Reliability Engineering (SRE), and a strong background in cloud infrastructure, automation, and continuous integration/continuous delivery (CI/CD). This role requires a strategic thinker and a hands-on leader who can guide our team to deliver world-class infrastructure solutions.
Key Responsibilities:
Leadership and Strategy:
- Lead and mentor a team of DevOps and SRE engineers, fostering a culture of collaboration and continuous improvement.
- Develop and implement a strategic roadmap for infrastructure and reliability engineering.
- Drive the adoption of best practices and innovative solutions in DevOps and SRE.
Infrastructure Design and Implementation:
- Architect, deploy, and maintain scalable, resilient, and secure cloud infrastructure.
- Utilize Infrastructure as Code (IaC) tools such as Terraform, Ansible, or CloudFormation for automation and management.
Continuous Integration/Continuous Deployment (CI/CD):
- Design and manage robust CI/CD pipelines using tools like Jenkins, GitLab CI, or Azure DevOps.
- Ensure seamless integration and deployment of applications across various environments.
Site Reliability Engineering (SRE):
- Implement and enhance monitoring, logging, and alerting systems to ensure high availability and performance of services.
- Develop and maintain operational dashboards and metrics for proactive incident management and troubleshooting.
- Perform root cause analysis and postmortem assessments for system outages and incidents.
Automation and Tooling:
- Develop and maintain automation scripts and tools to streamline operations and improve efficiency.
- Implement automated testing, configuration management, and deployment strategies.
Collaboration and Stakeholder Management:
- Work closely with development, QA, and operations teams to ensure smooth project delivery.
- Communicate effectively with stakeholders to align infrastructure strategies with business objectives.
Security and Compliance:
- Implement security best practices and ensure compliance with industry standards and regulations.
- Conduct regular security assessments and audits of the infrastructure.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
Key Skills:
- 15+ years of experience in IT with a strong focus on DevOps, SRE, and cloud architecture.
- Proven leadership experience with the ability to mentor and develop a high-performing team.
- Deep expertise in cloud platforms (AWS, Azure, GCP) and hybrid environments.
- Extensive experience with Infrastructure as Code (IaC) tools like Terraform, CloudFormation, and Ansible.
- Strong proficiency in CI/CD tools and practices.
- Comprehensive understanding of containerization and orchestration tools (Docker, Kubernetes, etc.).
- Solid understanding of networking, security, and database management in cloud environments.
- Excellent problem-solving skills and ability to work in a fast-paced, dynamic environment.
- Strong communication and collaboration skills.
- Relevant certifications (AWS Certified Solutions Architect, Azure Solutions Architect Expert, Google Cloud Professional Cloud Architect, etc.) are highly desirable.
Preferred Skills:
- Experience with serverless computing and microservices architecture.
- Knowledge of agile methodologies and frameworks.
- Experience with cloud migration strategies, including migrations to and from hybrid cloud environments.
- Familiarity with AI/ML and big data technologies.