Description

Responsibilities:

• Design, develop, and maintain a high-performance, scalable platform.

• Collaborate with development and operations teams to drive standardization for the cloud platform across

• Provide technical leadership and mentorship to other engineers.

• Create high-quality technical documentation, including requirements specifications, use cases, test strategies, performance benchmarks, deployment plans, and feasibility studies.

• Troubleshoot and resolve production issues, ensuring system stability and reliability.

• Continuously seek opportunities to improve system performance, security, and user experience.

All About You:

• Experience with cloud platforms (AWS, Azure) and containerization technologies (Kubernetes, Docker).

• Experience with modern monitoring and observability tools (Dynatrace, Prometheus, Grafana, Datadog).

• Strong understanding of distributed systems, high availability, and failure recovery.

• Familiarity with chaos engineering practices and tools (e.g., Gremlin, Chaos Monkey).

• Strong leadership and team collaboration skills.

• Deep understanding of service-level management, incident response, and root cause analysis.

• Excellent problem-solving and troubleshooting skills.

• Strong programming and scripting skills (e.g., Python, Go, Bash, Java, C#).

• Familiarity with CI/CD pipelines and automation frameworks.

AWS, Azure, Kubernetes, Docker, Dynatrace, Prometheus, Datadog, Python, Go, Bash, Java, C#

Education

Any Graduate