Description

We are seeking a highly skilled Senior Data Engineer to join our team and help build scalable data pipelines, integrate machine learning workflows, and optimize data platforms for actionable insights. This role plays a critical part in enabling data-driven solutions for sustainability initiatives and innovation.

Responsibilities:

  • Design, implement, and optimize scalable data pipelines using SQL, Python, and PySpark for efficient data processing.
  • Collaborate with data scientists to integrate machine learning models into production pipelines, optimizing for performance.
  • Manage and enhance ETL workflows to ensure timely and accurate transformation of raw data into structured formats.
  • Work with cloud and data platforms such as AWS, Azure, Databricks, and Snowflake to manage and scale data infrastructure.
  • Implement and maintain data orchestration tools like Apache Airflow to automate ETL processes.
  • Utilize Terraform to manage infrastructure as code for scalable cloud solutions.
  • Work on data warehousing solutions to optimize storage and retrieval of data.
  • Ensure strong data governance practices, including data quality, compliance, and cataloging using tools like Unity Catalog or Hive Metastore.

Primary Skills:

  • Strong expertise in SQL and Python for data manipulation and pipeline creation.
  • Proficiency in PySpark and hands-on experience with ETL processes.
  • Hands-on experience with cloud platforms (AWS, Azure) and big data tools like Databricks and Snowflake.
  • Experience integrating machine learning and AI models into data pipelines.
  • Proficiency with orchestration tools like Apache Airflow.
  • Experience using Terraform for infrastructure as code.
  • Expertise in data warehousing concepts and solutions.

Education

Any Graduate