Description

Job Responsibilities

  • Design, develop, and optimize large-scale data pipelines using PySpark and Python (a minimal sketch follows this list).
  • Apply object-oriented programming best practices to develop reusable, maintainable code.
  • Write advanced SQL queries for data extraction, transformation, and loading (ETL).
  • Work closely with data scientists, analysts, and other stakeholders to gather requirements and translate them into technical solutions.
  • Troubleshoot data-related issues and resolve them in a timely and accurate manner.
  • Leverage AWS cloud services (e.g., S3, EMR, Lambda, Glue) to build and manage cloud-native data workflows (preferred).
  • Participate in code reviews, data quality checks, and performance tuning of data jobs.
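
For illustration only, a minimal sketch of the kind of PySpark pipeline described above. The S3 paths, column names, and aggregation logic are hypothetical placeholders chosen for the example, not a specification for this role.

    # Hypothetical daily-revenue pipeline: paths, columns, and business rules are placeholders.
    from pyspark.sql import SparkSession, functions as F

    def build_daily_revenue_pipeline(input_path: str, output_path: str) -> None:
        """Read raw order events, aggregate revenue per day, and write the results."""
        spark = SparkSession.builder.appName("daily-revenue").getOrCreate()

        orders = spark.read.parquet(input_path)  # e.g. "s3://my-bucket/raw/orders/"

        daily_revenue = (
            orders
            .filter(F.col("status") == "COMPLETED")                  # keep only completed orders
            .withColumn("order_date", F.to_date("order_timestamp"))  # derive a date column
            .groupBy("order_date")
            .agg(F.sum("amount").alias("total_revenue"))
        )

        # Overwrite the curated output location, partitioned by date.
        daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(output_path)

    if __name__ == "__main__":
        build_daily_revenue_pipeline(
            "s3://my-bucket/raw/orders/",
            "s3://my-bucket/curated/daily_revenue/",
        )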


Required Skills & Qualifications

  • Strong hands-on experience with PySpark and Python, especially in designing and implementing scalable data transformations.
  • Solid understanding of Object-Oriented Programming (OOP) principles and design patterns (see the sketch after this list).
  • Proficient in SQL, with the ability to write complex queries and optimize performance.
  • Strong problem-solving skills and the ability to troubleshoot complex data issues independently.
  • Excellent communication and collaboration skills.
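
As a rough illustration of OOP-style reuse in PySpark, the sketch below defines composable transformation classes. The class names, columns, and conversion rule are assumptions chosen only for the example.

    # Illustrative composable-transformation pattern; names and rules are placeholders.
    from abc import ABC, abstractmethod
    from pyspark.sql import DataFrame, functions as F

    class Transformation(ABC):
        """A single, reusable step in a data pipeline."""

        @abstractmethod
        def apply(self, df: DataFrame) -> DataFrame:
            ...

    class DropNulls(Transformation):
        def __init__(self, columns: list[str]):
            self.columns = columns

        def apply(self, df: DataFrame) -> DataFrame:
            # Remove rows that are missing required fields.
            return df.dropna(subset=self.columns)

    class NormalizeCurrency(Transformation):
        def __init__(self, amount_col: str, rate: float):
            self.amount_col = amount_col
            self.rate = rate

        def apply(self, df: DataFrame) -> DataFrame:
            # Convert amounts using a fixed exchange rate (placeholder logic).
            return df.withColumn(self.amount_col, F.col(self.amount_col) * self.rate)

    class Pipeline:
        """Chains transformations so each step can be reused and unit-tested in isolation."""

        def __init__(self, steps: list[Transformation]):
            self.steps = steps

        def run(self, df: DataFrame) -> DataFrame:
            for step in self.steps:
                df = step.apply(df)
            return df

    # Usage: Pipeline([DropNulls(["order_id"]), NormalizeCurrency("amount", 0.92)]).run(orders_df)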


Preferred Qualifications (Nice to Have)

  • Exposure to data warehousing concepts, distributed computing, and performance tuning.
  • Familiarity with version control systems (e.g., Git), CI/CD pipelines, and Agile methodologies.

Education

Any Graduate