The successful candidate will fulfil three main missions: monitor, support and deploy applications.
Application monitoring: implement and maintain dashboards and tools to monitor application health and performance, participate in recurring reviews, and detect and analyse anomalies.
Application support: as a support engineer, help platform users (internal teams), troubleshoot issues with the platform, dispatch issues to and follow up with more specialized teams (development, infrastructure) or third-party vendors, and investigate incorrect application behaviour.
Application deployment: prepare and implement new application release rollouts and migrations.
Prepare & execute procedures for new application releases/updates
Troubleshoot application issues so that other teams can fix or work around them; implement restoration procedures as needed and follow up on raised issues.
Monitor application health & performance
Improve and maintain tools to monitor application health (mainly Grafana & Splunk dashboards & alerts)
Participate in 24/7 on-call duty on a rotating basis
Help improve team processes (planned work, ticket management, communication flows...)
Schedule planned work as requested
Review procedures so that they minimize customer impact
Make sure impacted users are informed appropriately and in a timely manner
Write RCAs (root cause analyses) for production incidents