5+ years of hands-on administration experience at the platform and application tiers supporting critical customer-facing applications, preferably in the financial services industry
5+ years of experience troubleshooting environments across the entire architecture (i.e., from applications to infrastructure)
3+ years of hands-on Linux administration experience
3+ years of experience with Oracle SQL, MongoDB, Redis, Kafka, Flink, Postgres, or similar data technologies
1+ years of experience supporting and monitoring service load-balancing architectures, including F5 and VMware AVI
Hard Skills:
Site Reliability Engineer (SRE) Skills – Ability to apply your expert troubleshooting and optimization skills up and down the full stack, ensuring that critical applications/services are engineered for scalability, availability and resiliency, including graceful degradation of service, fault isolation and quick recovery to minimize customer impact
Provide highly advanced technical expertise to maximize efficiency, reliability and value from current solutions, infrastructure, platforms and emerging technologies, showing technical leadership, and driving continuous improvement efforts
Ability to identify root-cause issues, articulate corrective actions and improvement opportunities, and design approaches/programs/products to improve overall quality assurance
Strong knowledge of observability/monitoring tools and their application (e.g., Glassbox, Dynatrace, AppDynamics, Splunk, BigPanda AIOps)
Intermediate- to expert-level ability to use automation and configuration management tools for provisioning, such as Puppet, Ansible, Terraform, Chef, Jenkins, GitLab and Liquibase
Functional knowledge of programming and scripting languages such as JavaScript, PowerShell, Python, Bash, SQL, .NET, Java, PHP, Ruby, Perl, C++, R, etc.