We are looking for a highly skilled Site Reliability Engineer (SRE) with strong experience in Node.js and Java to help us scale and maintain high-performance, resilient, and secure systems. You'll collaborate with software engineering, DevOps, and platform teams to improve observability, automate operations, and ensure system reliability in production environments.
Key Responsibilities:
Design, build, and maintain scalable and reliable infrastructure for microservices built in Node.js and Java.
Develop monitoring and alerting strategies, using tools such as Prometheus, Grafana, the ELK stack, and Datadog, to improve system observability.