Monitor real-time z/OS system health and performance across CPU, memory, DASD, and WLM-managed workloads, using RMF, SmartIS, IzPCA, MICS, and other internal tools. Analyze performance data to identify trends, bottlenecks, and potential issues.
Detect, troubleshoot, and resolve resource anomalies, workload misbehaviors, and degradation risks in production systems. Partner with incident response teams to resolve performance issues quickly and accurately.
Develop and implement performance tuning strategies by recommending changes to service definitions, dispatching priorities, and workload placement.
Contribute to capacity planning by forecasting and modeling workload resource demand and capacity requirements.
Support cost modeling, vendor reporting (SCRT), infrastructure sizing, and resource optimization efforts.
Collect and analyze system performance data to generate reports and dashboards.
Identify key performance indicators (KPIs) and develop metrics to track system performance.
Visualize, summarize, and present data findings, recommendations, and methodology to senior leadership, department leadership, and enterprise stakeholders, both technical and non-technical.
Work closely with cross-functional teams, including operations, development, and infrastructure teams.
Provide technical support and guidance to team members and stakeholders.
Participate in on-call rotations and provide timely responses to performance and observability issues.
Participate in the migration of performance and capacity tooling to Git change management and DevOps deployment pipelines.
Bachelor's degree in Information Systems, Mathematics, Finance, or another quantitative or related subject
10+ years of mainframe systems experience with proficiency in performance management for large, multi-processor, multi-LPAR, Parallel Sysplex environments utilizing z/OS
Proven experience in mainframe performance monitoring, observability, capacity management, and data analysis.
Proven experience resolving systems performance problems in real time via adjustments to WLM and batch initiators.
Strong understanding of PR/SM.
Proficiency in REXX, Python, Job Control Language (JCL), and DB2
Strong understanding of Batch Processing and Job Scheduling
Advanced user of MS Excel (charts, pivot tables, VLOOKUPs, PowerPivot) and PowerPoint for data visualization.
Experience with mainframe monitoring tools and performance tuning techniques.
Experience working with large, highly transactional datasets to draw insights and create organizational value.
Experience with DevOps is a plus.
Experience working with ADABAS is a plus.
Strong analytical, problem-solving, and strategic-thinking skills, including the ability to prioritize.
Proficiency in data analysis and in creating dashboards and visualizations that provide actionable insights for operations, engineering, and management decision-making.
Working knowledge of the MS Office suite for management reporting, using both PowerPoint and Excel analytical functions
Comfortable taking information from disparate systems to bring data elements together for meaningful insights
Analyze and solve business problems at their root, stepping back to understand the broader context.
Detail-oriented with strong organizational skills.
Deep expertise with SMF/RMF data, WLM service definitions, and z/OS workload behavior.
IBM SCRT reporting experience is a plus
MICS product and reporting experience is a huge plus
IzPCA product experience is a plus
Bachelor's degree