We are looking for a Site Reliability Engineer (SRE) to join our vibrant team. If you are passionate about cloud platforms, automation, and building resilient systems, this role is for you!
Define, implement, and evaluate Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
Drive cloud operations on Azure and leverage your software development expertise in high-level programming languages.
Build and optimize automated pipelines and incident alerts, and work with MTTR/MTTM metrics.
Work on Blue/Green deployments and Canary configurations.
Perform Root Cause Analysis (RCA) and lead Problem Management initiatives.
Collaborate in Agile Scrum teams and lead improvements for a better, faster, and happier work environment.
Cloud Platforms: Azure
Programming: Python/Java/Go
Automation: Azure DevOps (YAML, ARM), Terraform, Jenkins, Chef, Octopus Deploy
Containerization: Azure Kubernetes Service, Docker, Kubernetes
CI/CD Tools: SonarQube, Checkmarx, Git
Database Management: Oracle, SQL Server, NoSQL (CosmosDB)
Testing Tools: Selenium, Postman, JMeter, TestNG, Specflow
Monitoring Tools: Splunk (required), ELK, DataDog, Grafana/Prometheus
Scripting: PowerShell, Bash
B.E./B.Tech/M.E./M.Tech in Computer Science, Information Technology, Electrical, or Electronics with a solid academic background.
Proven experience in Cloud Infrastructure and SRE methodologies.
Strong understanding of Software Development Life Cycle (SDLC) and Agile/Scrum methodologies.
Excellent communication skills and ability to interact with US-based clients.
Flexibility to work US hours (CST/EST).
Certifications in cloud technologies are a plus!