Description

Key Skills: Azure, Site Reliability Engineering.

Roles & Responsibilities:

  • Implement and manage cloud infrastructure using Azure services.
  • Develop and maintain scripts in Bash, PowerShell, Python, and other scripting or programming languages.
  • Collaborate with multiple support groups to ensure seamless service delivery.
  • Troubleshoot complex issues across cloud services and application stacks.
  • Ensure security and compliance of cloud infrastructure.
  • Document processes and changes, adhering to Agile development practices.
  • Continuously seek improvements in processes and infrastructure management.
  • Monitor system performance, reliability, and availability using tools like Azure Monitor, Application Insights, and Log Analytics.
  • Automate operational tasks and deployments using Azure DevOps, ARM templates, or Terraform.
  • Conduct regular performance tuning and capacity planning for cloud resources.
  • Participate in incident response, root cause analysis, and post-incident reviews to enhance system resilience.

Experience Required:

  • 5 to 8 years of experience managing and maintaining production workloads on Azure.
  • Proven expertise in writing infrastructure-as-code using Azure Resource Manager (ARM), Bicep, or Terraform.
  • Experience with CI/CD pipelines, release management, and automation in Azure DevOps.
  • In-depth understanding of cloud security practices, including identity management and network security in Azure.
  • Familiarity with monitoring, logging, and alerting best practices using Azure-native and third-party tools.
  • Demonstrated ability to work in Agile/Scrum teams and contribute to continuous integration and delivery cycles.
  • Experience with incident management, root cause analysis, and reducing mean time to recovery (MTTR) in production environments.

Education: Any Graduate
