SRE Lead

Rang Technologies Inc
Charlotte, NC, USA

Description

Responsibilities:

· Participate in the design and architecture of reliable, scalable, high-performance systems and services, with a focus on operational excellence, availability, and performance.

· Primary skill set: expertise in observability as a service and telemetry data collection using Dynatrace APM, SolarWinds, open-source tools (Prometheus and Grafana), log aggregation (Kibana or Splunk), and AIOps tools.

· Apply a deep understanding of login authentication mechanisms using Ping, ForgeRock, and SiteMinder technologies (session management and cookie management).

· Build correlation mechanisms and dashboards that provide end-to-end visibility of requests from external to internal applications.

· Evangelize SRE evolution within IT operations and promote a culture of engineering excellence and best practices.

· Define best practices and principles for SRE, including incident management, monitoring, alerting, and automation.

· Collaborate with development teams on resiliency to ensure that services and applications are designed with operational reliability in mind.

· Implement monitoring systems to assess the performance of applications and infrastructure, and proactively identify areas for optimization.

· Understand the incident and problem management process, including post-mortems, and drive improvements to prevent future incidents.

· Analyze resource utilization patterns and forecast future capacity needs to ensure optimal performance and cost-efficiency.

· Ensure that SRE practices align with security and compliance requirements, and implement measures to protect systems and data.

· Drive operational excellence with a focus on automation, developing tools to streamline operational tasks and increase efficiency.

· Provide guidance and mentorship to SRE teams, fostering skill development and building a strong, capable SRE practice.

· Develop close relationships with other operational teams to integrate SRE practices and drive overall operational improvements across the enterprise.

· Stay up to date on industry trends, new technologies, and best practices in SRE, and apply relevant advancements to the organization.

· Build strong working relationships across different levels, with a client-focused mindset.

Qualifications:

· Around 10-12 years of hands-on SRE experience with cloud technologies, development, SRE toolsets, and automation.

· Deep understanding of login authentication mechanisms using Ping, ForgeRock, and SiteMinder technologies (session management and cookie management).

· Experience with correlation mechanisms and dashboards for end-to-end visibility of requests from external to internal applications.

· Strong hands-on experience with a cloud platform (AWS): Control Tower, project setup, account creation, RDS, SSO.

· Solid understanding of and hands-on experience with Docker/Kubernetes.

· Good experience with Linux commands, GitLab CI/CD setup, and Terraform (state management, etc.).

· Monitoring and alerting setup experience with Splunk, Prometheus, Grafana, Kibana, ELK, etc.

· Hands-on experience with APM tools, preferably Datadog, AppDynamics, or Dynatrace.

· Good understanding of observability frameworks that use programmatic SLI/SLO blueprints to standardize the collection of golden signals.

· Automation experience (data refresh, releases, DB snapshots) using Ansible or a scripting language.

· Experience with the following languages: Groovy DSL, Java, Python, and YAML, as well as microservices architecture.

· Good understanding of and hands-on experience with MQ and Kafka.

· Experience with Databases (Oracle, MySQL)

Good to have:

· Any of the relevant professional certifications: Certified Site Reliability Engineer (CSRE), Certified Kubernetes Administrator (CKA), AWS Certified DevOps Engineer Professional, Google Cloud Professional DevOps Engineer.

Key Skills

Dynatrace APM, SolarWinds, Prometheus, Grafana, Splunk, GitLab, Java, Python, YAML

Education

Any Graduate

Posted On: 15+ Days Ago
Experience: 10+ years of experience
Openings: 1
Category: SRE Lead
Tenure: Flexible Position