Job Description:
We are seeking a seasoned DevOps Engineer with a strong background in Site Reliability Engineering (SRE) to join our dynamic team in Bangalore. The ideal candidate will have a proven track record in implementing scalable infrastructure, enhancing system reliability, and driving automation for critical applications.
Key Responsibilities:
Design, implement, and maintain CI/CD pipelines for enterprise-scale applications
Ensure high availability and performance of cloud-based platforms through SRE principles
Monitor system reliability, proactively identify issues, and drive resolution
Automate operational tasks and infrastructure provisioning using tools such as Terraform or Ansible
Collaborate with development teams to integrate reliability into software delivery
Lead root cause analysis and post-mortem activities for production incidents
Required Skills & Experience:
8+ years of hands-on DevOps experience, with at least 3 years in SRE-related functions
Deep understanding of Linux/Unix systems and networking concepts
Proficiency with cloud platforms (AWS, Azure, or GCP), containerization (Docker), and container orchestration (Kubernetes)
Strong scripting skills (Shell, Python, or Go)
Familiarity with observability tools such as Prometheus, Grafana, or the ELK Stack
Excellent communication and incident management skills
Preferred Qualifications:
Certifications in AWS/GCP/Azure or Kubernetes
Exposure to infrastructure as code (IaC) and security practices
Bachelor's degree in any discipline