Key Skills: Java, Shell, DevOps
Roles and Responsibilities:
- Run and monitor production environments with a comprehensive view of system health and availability.
- Operate core banking ledger and switch systems, applying domain expertise to support mission-critical workloads.
- Build and manage scalable platform infrastructure and applications in AWS and Azure.
- Develop and maintain functional and non-functional testing frameworks for communication protocols such as HTTPS, Kafka (AVRO/JSON), AS2805 (SNMP), ODBC, and MQ.
- Automate system reliability, cybersecurity controls, logging, monitoring, and alerting strategies.
- Perform chaos and performance testing, fault isolation, and resilience improvements.
- Provide operational support and incident response for distributed software systems.
- Lead and conduct post-mortem investigations, identify root causes using data, and define preventative remediation actions.
- Drive infrastructure as code and object-oriented automation with a focus on maintainability and reusability.
Key Role Activities:
- Gather and analyze system/application metrics for performance tuning and fault diagnostics.
- Collaborate with development teams to enhance platform capabilities and ensure robust release processes.
- Engage in platform architecture reviews, system design consultations, and capacity planning.
- Implement and maintain automated deployments, cloud-native services, and system provisioning.
- Participate in Agile ceremonies, including backlog grooming, sprint planning, and retrospectives.
- Maintain and evolve platform service-level objectives (SLOs), balancing speed and reliability.
- Contribute to the full Software Development Life Cycle (SDLC), including build, test, deploy, and monitor activities.
- Mentor junior team members and promote a culture of learning and continuous improvement.
Skills Required:
- Strong hands-on experience with Java and Shell scripting.
- Deep understanding of DevOps tools, CI/CD pipelines, and cloud infrastructure.
- Proven experience with cloud platforms (AWS, Azure), with a focus on automation, observability, and security.
- Familiarity with Kafka, MQ, AS2805, ODBC, and message protocols.
- Proficiency in system monitoring, logging frameworks, and site reliability principles.
- Knowledge of performance tuning, failure analysis, and incident management best practices.
- Solid grasp of object-oriented programming, testing, and platform security concepts.
Education:
- Bachelor's degree in Computer Science, Information Systems, or a related technical field.
- Certifications in AWS, Azure, or DevOps tools are an advantage.