Description

Required experience and skills:
•              8+ years of experience in DevOps and Site Reliability Engineering.
•              Hands-on with containerization and orchestration: Docker, Kubernetes/EKS.
•              Proficiency in infrastructure as code tools: Terraform, Ansible, or CloudFormation.
•              Experience setting up and managing services running on Kubernetes.
•              In-depth understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and automation.
•              In-depth knowledge of monitoring and observability tools such as Splunk.
•              Knowledge of Linux operating system principles, networking fundamentals, and systems management.
•              Demonstrable fluency in at least one of the following languages: Java or Python.
•              Ability to identify and communicate technical and architectural problems, while working with partners and their team to iteratively find solutions.
•              Experience building and managing CI/CD pipelines: gatekeeping production deployments, developing and implementing Git branching strategies and branch protection rules, defining network policies, and scaling workloads up and down on AWS.
•              Strong problem-solving and analytical skills.
•              Ability to resolve performance and scalability issues in the system.

Education

Bachelor's degree in any discipline