Description

  • Design, deploy, and manage highly available and scalable infrastructure on AWS.
  • Automate infrastructure provisioning and configuration using tools like Terraform and Ansible.
  • Develop and implement monitoring and alerting systems to proactively identify and troubleshoot incidents.
  • Optimize infrastructure costs on AWS through resource management and utilization analysis.
  • Collaborate with development teams to implement DevOps practices and ensure smooth deployments.
  • Participate in on-call rotations and respond diligently to incidents to minimize downtime.
  • Continuously improve infrastructure reliability and performance through automation and best practices.
  • Stay up to date with the latest trends and technologies in cloud computing and SRE principles. 


     

Qualifications:

 

  • 3+ years of experience in Site Reliability Engineering or a related field (DevOps)
  • Proven expertise in deploying and managing infrastructure on AWS (EC2, S3, VPC, etc.)
  • Experience with Linux is a must; prior experience as a Linux administrator is a plus
  • Strong understanding of networking fundamentals is a must
  • Strong knowledge of infrastructure automation tools like Terraform and Ansible
  • Experience with DevOps methodologies and CI/CD pipelines
  • A keen understanding of cost optimization principles in AWS
  • Excellent problem-solving and analytical skills
  • Ability to work independently and as part of a cross-functional team
  • Diligent and proactive approach to incident response
  • Willingness to participate in on-call rotations 


     

Good-to-have skills:

 

  • Experience with compliance frameworks (SOC 2, HIPAA, etc.)
  • Experience with container orchestration tools (Kubernetes) 
     

Education

Any Graduate