Description

Key Responsibilities:

Day-to-Day Operations:

• Monitor and manage the health and stability of production systems on an ongoing basis.

• Lead SWAT Calls (specialized emergency response calls) to address critical incidents and minimize downtime, ensuring the stability of systems and applications.

 

Incident Management

Infrastructure Support

Operational Readiness

Stakeholder Coordination

Disaster Recovery

Issue Management

 

Qualifications:

• Bachelor’s Degree in Computer Science, Information Technology, Engineering, or a related field.

• 6+ years of experience in production support, including on-prem and cloud-based systems management.

• Hands-on experience with key production support tools and technologies such as Wily, Tivoli, HP BSM, Splunk, and Datadog for application and system monitoring.

• Expertise in managing MQ for message queuing, DataPower and Apigee for API management, and microservices architectures.

• Proficiency in security and web protocols, including OAuth, SSL/TLS, and HTTP, to ensure secure and efficient communication within production environments.

• Experience with containerization technologies, specifically Docker, and with Redis for caching and performance optimization.

 

Mandatory Areas
Must Have Skills:
• 6+ years of experience in production support, including on-prem and cloud-based systems management.
• Hands-on experience with key production support tools and technologies such as Wily, Tivoli, HP BSM, Splunk, and Datadog for application and system monitoring.
• Expertise in managing MQ for message queuing, DataPower and Apigee for API management, and microservices architectures.
• Proficiency in security and web protocols, including OAuth, SSL/TLS, and HTTP, to ensure secure and efficient communication within production environments.
 

Education

Any Graduate