Job Description:
• Preferably 3+ years' experience working as a Site Reliability Engineer (SRE).
• Practical experience with monitoring tools, such as Grafana, Azure Monitor, Log
Analytics, and network monitoring and alerting tools (e.g. BigPanda).
• Experience with automation tooling, such as Azure OpenAI, Amelia
Automation, ServiceNow Orchestration, Power Apps / Power Platform, Python
and PowerShell.
• Experience in monitoring infrastructure, analysing dashboards and investigating
issues / incidents affecting the health, stability and performance of the products and
services that we support.
• Knowledge of how to identify and resolve issues with systems, services and
applications.
• Able to proactively drive continuous improvement opportunities, including
performance, cost, process and stability optimisations, through working closely
with development and infrastructure teams.
• Good foundational understanding of Agile Methodologies, AI/ML for automating
operational initiatives and ITIL / Change Management processes.
• Knowledge of core Azure Cloud computing concepts (AZ-900 Certification as a
minimum requirement, with AZ-104 certification preferred).
• Knowledge of Azure Chaos Studio for Chaos Engineering.
• Fluent in English, both written and spoken.
• Good report writing and documentation skills, and a strong verbal
communicator.
• Able to present concepts, ideas, and recommendations in a clearly structured
and logical fashion.
• Self-motivated and goal-driven, able to prioritize, with excellent time
management.
• Quick learner with the ability to grasp modern technologies.
• Collaborative nature; willing to share ideas and debate solutions to reach
consensus on the best way to drive the service forward for the betterment of the
organization.
• Able to work in a high-pressure, time-critical environment.
• Attention to detail, with a keen eye for due diligence.
• Experience working in a financial or regulated sector, with awareness of best practices.
Any Graduate