Description

  • Design, develop, and maintain ETL Notebook orchestration pipelines using PySpark and Microsoft Fabric. 
  • Work with Apache Delta Lake tables, Change Data Feed (CDF), Lakehouses, and custom libraries. 
  • Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and deliver efficient data solutions. 
  • Migrate and integrate data from legacy SQL Server environments into modern data platforms. 
  • Optimize data pipelines and workflows for scalability, efficiency, and reliability. 
  • Provide technical leadership and mentorship to junior developers and other team members. 
  • Troubleshoot and resolve complex data engineering issues related to performance, data quality, and system scalability. 
  • Debug code by breaking it down into testable components, identifying issues, and resolving them. 
  • Develop, maintain, and enforce data engineering best practices, coding standards, and documentation. 
  • Conduct code reviews and provide constructive feedback to improve team productivity and code quality. 
  • Support data-driven decision-making processes by ensuring data integrity, availability, and consistency across different platforms. 

Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related field. 
  • 10+ years of experience in data engineering, with a strong focus on ETL development using PySpark or other Spark-based tools. 
  • Proficiency in SQL with extensive experience in complex queries, performance tuning, and data modeling. 
  • Experience with Microsoft Fabric or similar cloud-based data integration platforms is a plus. 
  • Strong knowledge of data warehousing concepts, ETL frameworks, and big data processing. 
  • Familiarity with other data processing technologies (e.g., Hadoop, Hive, Kafka) is an advantage. 
  • Experience working with both structured and unstructured data sources. 
  • Excellent problem-solving skills and the ability to troubleshoot complex data engineering issues. 
  • Experience with Azure Data Services, including Azure Data Factory, Azure Synapse, or similar tools. 
  • Experience creating DAGs, implementing activities, and running Apache Airflow. 
  • Familiarity with DevOps practices, CI/CD pipelines, and Azure DevOps.

Education

Bachelor's or Master's degree