Description

  • Manage and ensure the reliability and operation of large-scale, high-performance applications in hybrid (on-prem and cloud) environments (3-5 years’ experience required).
  • Develop automation scripts and build dashboards for Application Performance Management, focusing on transaction journey tracking.
  • Program using languages such as Go, Python, Java, or Rust (2-4 years’ experience required).
  • Work with databases like Oracle, PL/SQL, SQL Server, Redis, ClickHouse, Postgres, MongoDB, or time-series databases.
  • Transition and manage platforms on cloud services (GCP, AWS, Rancher/CloudFormation/Azure/OpenShift) and maintain containerized applications (GKE/RKE/AKS); 2+ years’ experience required.
  • Implement and maintain cloud observability using OpenTelemetry (OTel) for real-time monitoring, distributed tracing, and incident resolution.
  • Utilize GraphQL frameworks (Apollo, Prisma, Hasura) for application development and troubleshooting.
  • Troubleshoot networking issues (TCP/IP, HTTP, DNS, Load Balancing, Service Mesh) under high-pressure situations.
  • Ensure 24x7 application availability, develop solutions for repetitive tasks, and improve detection/gating for critical applications.
  • Use monitoring tools (Splunk, AppDynamics, Grafana/Prometheus, Dynatrace) to manage application health.
  • Participate in CI/CD processes, leveraging tools such as Rally, Confluence, and extenders.
  • Implement and manage in-memory caching solutions, especially Redis.
  • Debug across integrated technical platforms, including API gateways.
  • Work with cloud databases (GCS, Cloud SQL, PL/SQL, Spanner).
  • Monitor and troubleshoot HashiCorp Vault environments to minimize downtime and ensure rapid incident recovery.
  • Apply working knowledge of Vertex AI, Gen AI, and BigQuery.
  • Communicate clearly and effectively with technical and non-technical stakeholders.
  • Healthcare industry experience is required.

Education

Any Graduate