Responsibilities | SRE
Monitor systems and infrastructure to maintain operational and performance levels
Participate in a rotational on-call schedule
Work closely with other SRC professionals/engineers when issues arise, collaborate on troubleshooting, and provide consultation and resolution for events and incidents
Anticipate potential problems before they impact service and collaborate to determine solutions
Gather and analyze metrics from tools and system/application logs to assist in performance tuning, fault finding, and resolution
Create sustainable systems and services through automation, process enhancement, tooling, and noise reduction
Build automation to manage the SRC operations and eliminate/minimize manual functions and toil
Collaborate with Application/Infrastructure support engineers and operations teams
Engage in post-incident reviews to identify improvements and determine the cause in order to prevent recurrence
Required Knowledge, Skills, and Abilities
Possess a breadth and depth of technical and management knowledge
Continuous improvement mindset, always looking for opportunities to streamline, routinize, or automate
Working knowledge across the following technology support areas:
Server: Administration and troubleshooting in Linux and Windows as well as patching and basic scripting skills (PowerShell, Bash)
Converged Solutions: Experience in VCE/UCP (including VMware versions 6 and above), platform and network connectivity, and patching; understanding of current threat analysis and remediation trends, alongside PowerShell and Linux scripting skills
Storage: CIFS/NFS, Linux and Windows scripting, DPA reporting, Avamar and Data Domain administration, and solid understanding of Windows and Linux environments
Middleware: Linux, Windows, WebSphere, Apache, IIS, WebLogic and Tomcat
Mainframes: JCL, CICS SYSPLEX
Networking: Strong understanding of network protocols and the OSI model, as well as Network+ certification
Workflow and Knowledge Management: ServiceNow
Collaboration Tools: TrueSight, Jira, and Confluence
Process: Skilled and knowledgeable in ITSM; proficiency in operations analytics methodologies to drive performance improvement (e.g., Lean)
Strong troubleshooting and problem-solving skills, with the ability to analyze and resolve complex technical issues
ITIL fundamentals
Familiarity with Problem Management, Change Management, Release Management, Event Management, and Incident Management
Bachelor's Degree