Description

Job Responsibilities:

Kubernetes Operator Expertise – Deploy, manage, and maintain Kubernetes operator-based applications in cloud and on-prem environments. 
CRD-Based Deployments – Implement and troubleshoot Custom Resource Definition (CRD)-based deployments to enhance automation and operational efficiency. 
Region Awareness & Pod Topology Spread Constraints – Configure Kubernetes workloads with pod topology spread constraints to achieve high availability and fault tolerance across multiple regions. 
Node Affinity & Scheduling Policies – Apply node selector and affinity rules to optimize pod scheduling and resource allocation across nodes. 
Cluster Deployment & Upgrades – Troubleshoot and optimize cluster deployments, operator installations, and rolling updates to ensure smooth and reliable system upgrades. 
Incident Management & Troubleshooting – Diagnose and resolve infrastructure and application issues by analyzing logs, metrics, and alerts. 
Customer Support & Ticket Handling – Work on customer tickets, provide effective solutions, and collaborate with development teams to resolve issues efficiently. 
Application Monitoring & Optimization – Utilize monitoring tools to analyse application performance and implement improvements. 
Documentation & Knowledge Sharing – Create and maintain technical documentation, troubleshooting guides, and best practices for internal teams and customers. 
Automation & CI/CD Integration – Improve deployment efficiency by implementing automation, Infrastructure as Code (IaC), and CI/CD pipelines using tools such as Terraform or Helm.

Requirements

Must Have Skills:


Education: B.Tech in Computer Engineering, Information Technology, or a related field.
Experience: 5+ years of experience with Kubernetes operators, including in-depth knowledge of deploying, managing, and maintaining operator-based applications and of pod topology spread constraints.
CRD-Based Deployments: 3+ years of in-depth experience implementing and troubleshooting CRD-based deployments.
Application Monitoring & Optimization: 3+ years of experience using monitoring tools such as Grafana and Prometheus.
Terraform or Helm: 2+ years of experience using Terraform or Helm for infrastructure automation and CI/CD integration.
Bash, Python, or Golang: 2+ years of experience with, and an in-depth understanding of, scripting in at least one of these languages.

Nice-to-Have Skills:

CKA (Certified Kubernetes Administrator) certification.
Familiarity with incident response and disaster recovery planning. 
Strong understanding of container security and best practices for securing Kubernetes workloads. 
Experience working with log aggregation tools such as the ELK Stack.

Education

Any Graduate