Description

Monitoring & Central Dashboard

 

  • Lead the design and implementation of enterprise-wide monitoring solutions for infrastructure and applications.
  • Architect centralized dashboards for real-time visibility into system health, performance, and alerts using AIOps platforms.
  • Ensure proactive incident detection and resolution through event monitoring systems and technical staff interventions.

     

Disaster Recovery

 

  • Own the disaster recovery strategy and execution across business-critical systems.
  • Conduct risk assessments and DR drills, and ensure alignment with business continuity objectives and compliance standards.
  • Collaborate with IT and business units to ensure recovery plans are aligned and tested regularly.

     

ServiceNow & ITSM

 

  • Oversee ServiceNow modules including Incident, Problem, Change, and Request Management.
  • Ensure all incidents are logged, categorized, and prioritized in ServiceNow with complete lifecycle documentation.
  • Integrate email-based incident creation and automate workflows for faster resolution.

     

KPI/SLA Governance

 

  • Define and track KPIs and SLAs across IT operations and service areas.
  • Generate regular reports for stakeholders and ensure SLA adherence, especially for P1/P2 incidents.
  • Lead governance meetings (CCB-CMT, DSR) and bridge calls for critical incidents.

     

AIOps & Automation

 

  • Drive AIOps initiatives to automate root cause analysis, anomaly detection, and predictive maintenance.
  • Collaborate with cross-functional teams to implement AI-driven insights into operational workflows.
  • Support continuous improvement through digitization and transformation programs.

     

Process Improvement

 

  • Optimize ITIL-aligned processes for Incident, Problem, and Change Management.
  • Maintain comprehensive documentation for IT processes and workflows.
  • Implement quality control measures to reduce error rates and backlog tickets.

     

Required Skills

 

  • 10+ years of experience in IT operations and infrastructure support.
  • Hands-on experience with ServiceNow, monitoring tools (e.g., Moogsoft, SolarWinds, Dynatrace), and DR technologies.
  • Strong understanding of ITIL processes, SLA governance, and AIOps platforms.
  • Proficiency in automation scripting (PowerShell, Python) and dashboarding tools.
  • Excellent communication and stakeholder management skills.

Education

Any Graduate