Key Responsibilities:
- Application Monitoring and Support:
- Monitor application performance, availability, and health using GCP tools such as Cloud Logging.
- Investigate and resolve incidents related to applications and integrations.
- Incident and Problem Management:
- Act as the first point of contact for application-related issues.
- Perform root cause analysis and implement measures to prevent recurrence.
- Ensure timely resolution of tickets within agreed SLAs.
- Automation and Optimization:
- Automate routine operational tasks to improve support efficiency.
- Identify and implement optimizations for better performance and cost-efficiency.
- Documentation and Reporting:
- Maintain up-to-date documentation for application configurations, workflows, and known issues.
- Generate reports on application performance and incident trends.
- Collaboration and Communication:
- Work closely with stakeholders, including development, operations, and business teams, to ensure alignment on requirements and issue resolution.
- Communicate status updates effectively to all relevant parties.
Required Skills and Qualifications:
- Cloud Expertise:
- Strong knowledge of GCP services such as Compute Engine, Cloud Run, App Engine, Kubernetes Engine (GKE), Cloud Storage, and Pub/Sub.
- Experience with GCP monitoring and logging tools such as Cloud Monitoring and Cloud Logging.
- Troubleshooting Skills:
- Proficiency in diagnosing application, infrastructure, and network-related issues.
- Familiarity with debugging tools, log analysis, and error tracking systems.