Job Description

About the Role:

We are looking for a skilled Python Data Engineer with expertise in Test-Driven Development (TDD) to build, optimize, and maintain scalable data pipelines. The ideal candidate will have strong experience in Python, SQL, data engineering workflows, and automated testing to ensure high-quality, reliable data solutions.
Key Responsibilities

  • Design, develop, and maintain ETL/ELT pipelines using Python and SQL.
  • Implement TDD practices, writing unit tests before implementing pipeline code.
  • Develop scalable, efficient, and reusable data processing solutions.
  • Work with data warehouses (e.g., Azure Databricks) and databases (PostgreSQL, MySQL).
  • Ensure data quality, validation, and integrity through rigorous testing.
  • Collaborate with data scientists, analysts, and engineers to define data requirements.
  • Automate data workflows using orchestration tools (Airflow, Prefect, Dagster).
  • Optimize data pipelines for performance, reliability, and scalability.
  • Document data models, pipelines, and test cases for maintainability.
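To illustrate the TDD workflow referenced above, here is a minimal sketch in Python: the pytest-style test is written first and drives the implementation of a cleaning step. The function and field names (`clean_records`, `id`, `email`) are hypothetical, not part of any specific pipeline.

```python
# Minimal TDD sketch: the test below is written first; clean_records is
# then implemented until the test passes.

def clean_records(records):
    """Drop rows with a missing 'id' and normalize 'email' to lowercase."""
    return [
        {**r, "email": r["email"].strip().lower()}
        for r in records
        if r.get("id") is not None
    ]

def test_clean_records():
    raw = [
        {"id": 1, "email": "  Alice@Example.COM "},
        {"id": None, "email": "bob@example.com"},  # missing id: dropped
    ]
    assert clean_records(raw) == [{"id": 1, "email": "alice@example.com"}]

if __name__ == "__main__":
    test_clean_records()
```

In practice such tests would live in a `tests/` directory and run via `pytest` in CI before any pipeline change is merged.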

Required Skills & Qualifications

  • 8+ years of experience in data engineering.
  • Strong proficiency in Python and SQL.
  • Expertise in Test-Driven Development (TDD) using pytest, unittest, or other frameworks.
  • Hands-on experience with ETL/ELT processes and data pipeline automation.
  • Experience working with data warehouses (e.g., Azure Databricks).
  • Familiarity with data modeling principles (e.g., star schema, normalization).
  • Experience with version control (Git) and CI/CD practices.
  • Strong problem-solving skills and ability to optimize database performance.

Preferred Qualifications

  • Experience with orchestration tools (Airflow, Prefect, Dagster).
  • Knowledge of cloud platforms (Azure).
  • Exposure to containerization tools (Docker, Kubernetes).
  • Understanding of data governance, security, and compliance.

Education

Bachelor's degree in any discipline.