Key Skills: Site Reliability Engineer, Datadog, WebLogic Integration, Oracle
Roles and Responsibilities:
- Utilize the Roles API to create and manage Datadog roles, ensuring appropriate global permissions are granted.
- Collaborate with product managers, engineers, and business teams to identify pain points and design Datadog dashboards that address issues related to APM, alerting, and automated ticket creation.
- Engage directly with business stakeholders and customer IT teams to understand their needs and provide effective solutions.
- Apply knowledge of Oracle Integration Cloud (OIC), Service-Oriented Architecture (SOA), and WebLogic Integration to enhance system integrations.
- Maintain awareness of the criticality of financial month-end, quarter-end, and year-end closures, ensuring that monitoring solutions support these processes.
Skills Required:
- Strong hands-on experience with Datadog, including dashboards, alerts, APM, integrations, and the Roles API
- Proven track record in Site Reliability Engineering (SRE), ensuring performance, scalability, and high availability of systems
- Good understanding of WebLogic Integration and middleware platforms
- Familiarity with Oracle technologies, particularly in application and database environments
- Ability to work with OIC (Oracle Integration Cloud) and SOA for complex enterprise system integrations
- Experience with performance monitoring, root cause analysis, and automated ticket creation
- Strong communication skills for effective collaboration with cross-functional and client-facing teams
- Capability to support high-availability systems during critical financial periods (month-end, quarter-end, year-end)
Education: B.E. or B.Tech. in Computer Science or a related field