Responsibilities:
· Manage and troubleshoot Azure Data Factory (ADF) and Databricks workflows, ensuring triggers, linked services, parameters, and pipelines function correctly end-to-end.
· Investigate and resolve complex job failures; debug Spark jobs and analyze notebook execution graphs and logs.
· Lead performance optimization for ADF pipelines, partitioning strategies, and Azure Data Lake Storage (ADLS) data formats (e.g., Parquet tuning).
· Execute and automate data pipeline deployment using Azure DevOps, ARM templates, PowerShell scripts, and Git repositories.
· Govern data lifecycle rules and partition retention, and enforce consistency across raw/curated zones in ADLS.
· Monitor resource consumption (clusters, storage, pipelines) and advise on cost-saving measures (auto-scaling, tiering, concurrency).
· Prepare root cause analysis (RCA) for P1/P2 incidents and support change deployment validation, rollback strategy, and UAT coordination.
· Review Power BI refresh bottlenecks and support L1 Power BI developers with dataset tuning and refresh scheduling improvements.
· Validate SOPs and support documentation prepared by L1s, and drive process improvement via automation or standardization.
Required Skills:
· Expert in Azure Data Factory, Databricks (PySpark), Azure Data Lake Storage, and Azure Synapse Analytics.
· Proficient in Python, PySpark, SQL/Spark SQL, and JSON configurations.
· Familiar with Azure DevOps, Git for version control, and CI/CD automation.
· Hands-on experience with monitoring (Azure Monitor), diagnostics, and cost governance.
· Strong understanding of data security practices, IAM, RBAC, and audit trail enforcement.
Any Graduate