Key Responsibilities:
- Develop and optimize PySpark-based ETL pipelines for processing large-scale data (a minimal sketch follows this list).
- Develop, test, and maintain robust Python applications.
- Collaborate with data engineers, analysts, and business teams to design data solutions and ensure smooth data flow across systems.
- Work with Azure Data Lake, Azure Databricks, and Azure Data Factory to build scalable data solutions.
- Implement data classification, policy enforcement, and metadata extraction within Microsoft Purview.
- Troubleshoot performance bottlenecks in Spark jobs and improve data pipeline efficiency.
- Ensure data security, compliance, and quality following best practices.
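
To illustrate the day-to-day work, here is a minimal sketch of the kind of PySpark ETL pipeline this role involves: extract raw data from Azure Data Lake Storage Gen2, transform it, and load a curated output. The storage account, containers, and column names (examplestorage, raw, curated, order_ts, amount) are hypothetical placeholders, and on Databricks the SparkSession is supplied by the runtime, so the builder call below is only a fallback for local runs.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Hypothetical names throughout: the storage account (examplestorage),
    # containers (raw, curated), and columns (order_ts, amount) are placeholders.
    # On Databricks the session already exists; getOrCreate() simply reuses it.
    spark = SparkSession.builder.appName("orders-daily-etl").getOrCreate()

    # Extract: read raw JSON landed in Azure Data Lake Storage Gen2.
    raw = spark.read.json("abfss://raw@examplestorage.dfs.core.windows.net/orders/")

    # Transform: drop malformed rows, derive a date column, aggregate daily totals.
    daily = (
        raw.filter(F.col("amount").isNotNull())
           .withColumn("order_date", F.to_date("order_ts"))
           .groupBy("order_date")
           .agg(F.sum("amount").alias("daily_total"))
    )

    # Load: write the curated result as date-partitioned Parquet.
    (daily.write
          .mode("overwrite")
          .partitionBy("order_date")
          .parquet("abfss://curated@examplestorage.dfs.core.windows.net/orders_daily/"))

In practice, the troubleshooting work described above usually starts from the Spark UI: skewed partitions, oversized shuffles, and missing partition pruning are common causes of slow jobs, and fixes such as repartitioning before a wide join or writing partitioned output (as in the sketch) typically follow from that diagnosis.
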
Required Skills & Qualifications:
- 3+ years of experience in Python and PySpark for big data processing.
- Proven experience as a Python Developer or in a similar role.
- Strong experience with Azure Data Services (Data Lake, Data Factory).
- Excellent problem-solving skills and ability to work in an agile environment.
- Excellent communication and teamwork abilities.