Description

Key Skills: ITIL, ITSM, Production Support, Linux, UNIX.

Roles & Responsibilities:

  • Manage production incidents to resolution in a 24/7/365 environment, using incident management processes and keeping management informed of status, impact, and resolution actions.
  • Lead and guide incident triage calls from a technical perspective, analyzing infrastructure and application components using event monitoring solutions like APM.
  • Influence technical teams during calls and articulate troubleshooting steps effectively.
  • Conduct technical follow-up calls for high-profile incidents.
  • Ensure proper functional and management escalation as per standards and procedures.
  • Follow up on items that may negatively impact production operations, assist with post-mortem activities, and support operational improvements.
  • Implement new and improved processes based on management recommendations, create reports, and address ad-hoc requests.
  • Analyze infrastructure and application components during incident triage calls.
  • Communicate effectively with all management levels, translating technical issues into non-technical terms, and manage large conference calls during incidents.
  • Hands-on experience with ServiceNow or other ticketing tools is required.

Experience Requirement:

  • 6 to 9 years of experience in incident management and ITSM practices.
  • Proven track record of managing critical production incidents in a high-pressure 24/7 environment.
  • Strong technical acumen in infrastructure and applications, with the ability to lead troubleshooting calls effectively.
  • Experience working with cross-functional technical teams and coordinating escalations efficiently.
  • Background in production support and working knowledge of Linux/UNIX systems is preferred.

Education: Any Graduate
