AWS DevOps Engineer

DATAECONOMY
Hyderabad, Telangana, India

Description

We are seeking an experienced Observability Engineer with a strong DevOps background to design, implement, and manage observability solutions across cloud and on-prem environments. The ideal candidate will have expertise in monitoring, logging, tracing, and alerting to ensure high system availability, performance, and reliability.

Key Responsibilities:

Design & Implement Observability Solutions: Develop and maintain monitoring, logging, and tracing solutions using industry-leading tools (Prometheus, Grafana, Datadog, New Relic, Splunk, etc.).
Performance Monitoring & Optimization: Proactively identify and resolve performance bottlenecks in distributed systems.
Logging & Tracing: Set up and manage centralized logging and tracing solutions (ELK/EFK stack, Fluentd, OpenTelemetry).
Alerting & Incident Management: Configure alerting mechanisms using tools like PagerDuty, Opsgenie, or VictorOps for proactive issue detection.
SRE Practices: Implement Site Reliability Engineering (SRE) principles to enhance system reliability and reduce MTTR (Mean Time to Resolution).
Automation & Infrastructure as Code (IaC): Automate observability setup and configurations using Terraform, Ansible, or similar tools.
Cloud & Kubernetes Monitoring: Implement observability best practices for cloud platforms (AWS, Azure, GCP) and containerized environments (Kubernetes, Docker).
Collaboration: Work closely with development, SRE, and operations teams to ensure end-to-end observability of applications and services.
Compliance & Security: Ensure logging and monitoring solutions adhere to security and compliance requirements.

Requirements

Required Skills & Qualifications:

6-10 years of experience in DevOps, SRE, or Observability engineering.
Strong hands-on experience with observability tools like Prometheus, Grafana, New Relic, Datadog, Splunk, ELK/EFK, OpenTelemetry, AppDynamics, etc.
Experience in setting up distributed tracing solutions (Jaeger, Zipkin, OpenTelemetry).
Expertise in Kubernetes monitoring using Prometheus, Thanos, Loki, or similar tools.
Strong proficiency in scripting (Python, Bash, or other shell languages) for automation.
Hands-on experience with Terraform, Ansible, Helm, or CloudFormation for infrastructure automation.
Proficiency in CI/CD pipelines and GitOps methodologies using Jenkins, GitLab CI, ArgoCD, or Flux.
Experience in public cloud environments (AWS, Azure, GCP) and monitoring cloud-native services.
Strong troubleshooting and root cause analysis (RCA) skills.
Understanding of SLIs, SLOs, and error budgets as part of SRE best practices.
Familiarity with log management, anomaly detection, and AI-based observability solutions is a plus.

Key Skills

AWS, Azure, GCP, Terraform, Ansible, Helm, Jenkins, GitLab CI, ArgoCD, Python

Education

Any Graduate

Posted On: 30+ Days Ago
Experience: 10+ years of experience
Openings: 1
Category: AWS DevOps Engineer
Tenure: Full-Time Position