OVERVIEW
- Develop and manage ETL workflows using Azure Data Factory (ADF).
- Design and implement data pipelines using PySpark on Azure Databricks.
- Work with Azure Synapse Analytics, Azure Data Lake, and Azure Blob Storage for data ingestion and transformation.
- Optimize Spark jobs for performance and scalability in Databricks.
- Automate data workflows and implement error handling and monitoring in ADF.
- Collaborate with data engineers, analysts, and business teams to understand data requirements.
- Implement data governance, security, and compliance best practices in Azure.
- Debug and troubleshoot PySpark scripts and ADF pipeline failures.
Requirements:
- 4+ years of experience in ETL development with Azure Data Factory (ADF).
- Hands-on experience with Azure Databricks and PySpark for big data processing.
- Strong knowledge of Azure services, including Azure Synapse Analytics, Azure Data Lake, and Azure Blob Storage.
- Proficiency in Python and PySpark for data transformation and processing.
- Experience with CI/CD pipelines for deploying ADF pipelines and Databricks notebooks.
- Strong expertise in SQL for data extraction and transformation.
- Knowledge of performance tuning in Spark and cost optimization on Azure.
Skills:
Azure Data Factory, PySpark, Azure