Key Skills: AWS, AWS Cloud, DevOps, Linux, Kubernetes.
Roles and Responsibilities:
- Resolve product-related issues identified via monitoring tools or reported by customers, ensuring high availability and optimal performance of hosted products.
- Collaborate with cross-functional teams, including Product Development and DevOps, and with clients to address infrastructure, networking, and security concerns.
- Participate in incident, change, and problem management activities, aligning with ITIL best practices.
- Set up and maintain infrastructure components on multi-vendor cloud platforms, mainly AWS.
- Automate deployment and configuration processes using tools such as Git, Jenkins, Terraform, and scripting.
- Monitor and optimize hosting environments using observability tools like Datadog, New Relic, Splunk, Prometheus, and Grafana.
- Automate routine tasks like backups, scaling, and monitoring to improve efficiency and reduce manual intervention.
- Maintain detailed documentation of infrastructure configurations, automation processes, and standard operating procedures.
- Report on infrastructure health, issue resolutions, and optimization efforts to internal stakeholders.
Experience Requirement:
- 5-10 years of industry experience in cloud operations and infrastructure management.
- Strong experience supporting SaaS solutions on AWS with a focus on system performance and reliability.
- Proven background in cloud support operations, customer interaction, ticket triaging, and root cause analysis.
- Practical experience managing upgrades, releases, and automation in cloud environments.
Education: B.Tech/M.Tech (Dual), MCA, B.E., B.Tech