Description

Responsibilities | SRE

Monitor systems and infrastructure to maintain operational and performance levels

Rotational on-call responsibilities

Work closely with other SRC professionals/engineers when issues arise, collaborate on troubleshooting, and provide consultation and resolution for events and incidents

Anticipate potential problems before they affect operations and collaborate to determine solutions

Gather and analyze metrics from tools and system/application logs to assist in performance tuning, fault finding, and resolution

Create sustainable systems and services through automation, process enhancement, tooling, and noise reduction

Build automation to manage SRC operations and eliminate or minimize manual functions and toil

Collaborate with Application/Infrastructure support engineers and operations teams

Engage in post-incident reviews to identify improvements and determine the cause to prevent recurrence


Required Knowledge, Skills, and Abilities

Possess a breadth and depth of technical and management knowledge

Continuous improvement mindset, always looking for opportunities to streamline, routinize, or automate

Working knowledge across the following technology support areas:

Server: Administration, troubleshooting, and patching in Linux and Windows, plus basic scripting skills (PowerShell, Bash)

Converged Solutions: Experience with VCE/UCP (including VMware version 6 and above), platform and network connectivity, and patching; understanding of current threat analysis and remediation trends; PowerShell and Linux scripting skills

Storage: CIFS/NFS, Linux and Windows scripting, DPA reporting, Avamar and Data Domain administration, and a solid understanding of Windows and Linux environments

Middleware: Linux, Windows, WebSphere, Apache, IIS, WebLogic, and Tomcat

Mainframes: JCL, CICS, Sysplex

Networking: Strong understanding of network protocols and the OSI model; Network+ certification

Workflow and Knowledge Management: ServiceNow

Collaboration Tools: TrueSight, Jira, and Confluence

Process: Skilled and knowledgeable in ITSM; proficient in operations analytics methodologies that drive performance improvement (e.g., Lean)

Strong troubleshooting and problem-solving skills, with the ability to analyze and resolve complex technical issues

ITIL fundamentals

Familiarity with Problem Management, Change Management, Release Management, Event Management, and Incident Management

Education

Bachelor's Degree