Key Responsibilities:
Day-to-Day Operations:
• Monitor and manage the health and stability of production systems.
• Lead SWAT Calls (specialized emergency response calls) to address critical incidents and minimize downtime, ensuring the stability of systems and applications.
Incident Management
Infrastructure Support
Operational Readiness
Stakeholder Coordination
Disaster Recovery
Issue Management
Qualifications:
• Bachelor’s Degree in Computer Science, Information Technology, Engineering, or a related field.
• 6+ years of experience in production support, including management of on-premises and cloud-based systems.
• Hands-on experience with key production support tools and technologies such as Wily, Tivoli, HP BSM, Splunk, and Datadog for application and system monitoring.
• Expertise in managing MQ for message queuing, DataPower and Apigee for API management, and microservices-based architectures.
• Proficiency in security and communication protocols, including OAuth, SSL, and HTTP, to ensure secure and efficient communication within production environments.
• Experience with Docker for containerization and Redis for caching and performance optimization.
Mandatory Areas
Must Have Skills:
• 6+ years of experience in production support, including management of on-premises and cloud-based systems.
• Hands-on experience with key production support tools and technologies such as Wily, Tivoli, HP BSM, Splunk, and Datadog for application and system monitoring.
• Expertise in managing MQ for message queuing, DataPower and Apigee for API management, and microservices-based architectures.
• Proficiency in security and communication protocols, including OAuth, SSL, and HTTP, to ensure secure and efficient communication within production environments.
Any Graduate