Description

Key Responsibilities:

  • Design, develop, and manage scalable, efficient ETL/ELT pipelines for ingesting, processing, and transforming large volumes of data.
  • Build and maintain data lakes, data warehouses, and data marts on cloud platforms like AWS, Azure, or GCP.
  • Collaborate with data scientists, analysts, and business stakeholders to gather requirements and deliver data solutions.
  • Optimize and monitor data pipelines for performance, reliability, and scalability.
  • Implement data quality, governance, and security best practices.
  • Write complex SQL queries and perform data modeling to support business intelligence and analytics.
  • Integrate data from various sources (APIs, databases, flat files, etc.).
  • Participate in code reviews and architecture discussions, and contribute to team development standards.

Required Skills:

  • 6+ years of experience in data engineering or related roles.
  • Proficiency in Python, Scala, or Java for data processing.
  • Strong experience with SQL and relational databases (e.g., PostgreSQL, MySQL).
  • Hands-on experience with big data tools such as Spark, Hadoop, Hive, and Kafka.
  • Deep experience with cloud platforms (AWS: Redshift, S3, Glue, EMR; Azure: Data Factory, Synapse; GCP: BigQuery, Dataflow).
  • Experience with data orchestration tools (e.g., Apache Airflow, Prefect); an illustrative pipeline sketch follows this list.
  • Solid understanding of data warehousing concepts, data lakes, and data modeling.
  • Familiarity with DevOps practices, version control (Git), and CI/CD for data pipelines.
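
For illustration only, not an additional requirement: a minimal sketch of the kind of orchestrated ETL pipeline described above, written as an Apache Airflow DAG in Python. The DAG id, task names, and sample data are hypothetical placeholders, and the sketch assumes a recent Airflow 2.x release (2.4+ for the `schedule` argument).

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract():
        # Placeholder: pull raw records from a source system (API, database, flat file).
        return [{"id": 1, "amount": "42.50"}, {"id": 2, "amount": "17.00"}]


    def transform(ti):
        # Placeholder: cast types and apply business rules before loading.
        raw = ti.xcom_pull(task_ids="extract")
        return [{**row, "amount": float(row["amount"])} for row in raw]


    def load(ti):
        # Placeholder: write transformed rows to a warehouse table (e.g., Redshift or BigQuery).
        rows = ti.xcom_pull(task_ids="transform")
        print(f"would load {len(rows)} rows")


    with DAG(
        dag_id="example_etl_pipeline",   # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        load_task = PythonOperator(task_id="load", python_callable=load)

        # extract -> transform -> load, run once per day
        extract_task >> transform_task >> load_task

In a production pipeline the extract and load steps would typically use provider hooks or operators rather than plain Python functions, but the task structure and dependency chain are the same.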

Education

Any graduate (bachelor's degree in any discipline)