Description

Role Summary 

We are looking for a DevOps Engineer to build, automate, and manage cloud infrastructure while ensuring high availability, security, and scalability. The ideal candidate will have hands-on production experience with AWS, infrastructure automation using Terraform, CI/CD pipelines, and container orchestration (Kubernetes, ECS, or EKS).

 

This role requires a strong understanding of cloud networking, monitoring, and security best practices. The candidate should be proficient in scripting languages such as Python, Bash, or Go to automate deployments and operational tasks.

 

The DevOps Engineer will be responsible for troubleshooting and optimizing cloud environments, enhancing observability using Prometheus, Grafana, ELK, or Datadog, and ensuring smooth incident management and system reliability. Strong problem-solving skills, a proactive mindset, and the ability to collaborate with developers and SRE teams are key to success in this role.

 

Preferred experience includes exposure to AWS, configuration management tools (Ansible, Helm), and security hardening techniques.

 

Tech Stack Expertise:

  • Kubernetes: Deep understanding of Kubernetes architecture, cluster operations, and container orchestration.
  • Terraform: Extensive hands-on experience with Infrastructure as Code (IaC) using Terraform for managing cloud resources.
  • ArgoCD: Experience in continuous deployment and using ArgoCD to maintain GitOps workflows.
  • Helm: Expertise in Helm for managing Kubernetes applications.
  • Cloud Platforms: Expertise in AWS; experience with GCP or Azure is an added advantage.
  • Debugging and Troubleshooting: The DevOps Engineer must be proficient in identifying and resolving complex issues in a distributed environment, ranging from networking issues to misconfigurations in infrastructure or application components.

 

Key Responsibilities:

  • CI/CD and configuration management
  • Performing root cause analysis (RCA) of production issues and resolving them
  • Setting up failover, DR, backups, logging, monitoring, and alerting
  • Containerizing applications and deploying them on the Kubernetes platform
  • Capacity planning for the infrastructure of each environment
  • Ensuring zero outages of critical services
  • Database administration of SQL and NoSQL databases
  • Infrastructure as Code (IaC)
  • Keeping infrastructure costs to a minimum
  • Implementing appropriate security measures

 

Ideal Candidate Profile:

  • A graduate or postgraduate degree in Computer Science or a related field
  • 2-4 years of strong DevOps experience in a Linux environment
  • Strong interest in working in our tech stack
  • Excellent communication skills
  • Self-starter who works well with minimal supervision
  • Hands-on experience with at least one scripting language (Bash, Python, Go, etc.)
  • Experience with version control systems like Git
  • Strong experience with Amazon Web Services (EC2, RDS, VPC, S3, Route 53, IAM, etc.)
  • Strong experience managing production systems on a day-to-day basis
  • Experience diagnosing and fixing issues across different layers of the architecture in a production environment
  • Knowledge of SQL and NoSQL databases, Elasticsearch, Solr, etc.
  • Knowledge of networking, firewalls, load balancers, Nginx, Apache, etc.
  • Experience with automation tools such as Ansible, SaltStack, and Jenkins
  • Experience with Docker and Kubernetes, and with managing OpenStack (desirable)
  • Experience with HashiCorp tools such as Vault, Vagrant, Terraform, and Consul, as well as VirtualBox (desirable)
  • Experience with managing/mentoring small teams of 2-3 people (desirable)
  • Experience with monitoring tools such as Prometheus, Grafana, and Elastic APM
  • Experience with logging tools such as ELK and Loki

Education

Any Graduate