Description

Key Responsibilities:

  • Design, develop, and maintain robust, scalable ETL/ELT pipelines using Apache Spark and Snowflake (see the illustrative sketch after this list).
  • Leverage Databricks for data processing, transformation, and analytics in distributed environments.
  • Develop efficient SQL and Spark applications to process and analyze large volumes of data.
  • Implement and maintain data warehousing solutions using Snowflake with best practices for performance, cost, and security.
  • Collaborate with data scientists, analysts, and business stakeholders to understand requirements and deliver the data they need.
  • Ensure data quality and integrity through unit testing, data validation, and monitoring.
  • Optimize and troubleshoot Spark jobs, SQL queries, and Snowflake data workflows.
  • Integrate with various data sources (cloud storage, APIs, RDBMS) and tools (e.g., Airflow, dbt).
  • Apply data governance and compliance policies in data pipeline design and execution.
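
For illustration, the following is a minimal sketch of the kind of pipeline work described above: a PySpark job that reads raw files from cloud storage, applies basic data-quality checks, and writes the result to Snowflake through the Spark-Snowflake connector. All paths, credentials, column names, and table names are placeholders, and the sketch assumes the connector library is installed on the cluster.

  # Illustrative sketch only: placeholder paths, credentials, and table names.
  from pyspark.sql import SparkSession, functions as F

  spark = SparkSession.builder.appName("orders_daily_load").getOrCreate()

  # Read raw CSV files landed in cloud storage (path and columns are placeholders).
  orders = spark.read.option("header", "true").csv("s3://example-bucket/raw/orders/")

  # Basic cleanup and data-quality checks before loading.
  cleaned = (
      orders.withColumn("order_ts", F.to_timestamp("order_ts"))
      .dropDuplicates(["order_id"])
      .filter(F.col("order_id").isNotNull())
  )

  # Standard connection options for the Spark-Snowflake connector;
  # values shown are placeholders (in practice, pull secrets from a vault).
  sf_options = {
      "sfURL": "example_account.snowflakecomputing.com",
      "sfUser": "ETL_USER",
      "sfPassword": "********",
      "sfDatabase": "ANALYTICS",
      "sfSchema": "STAGING",
      "sfWarehouse": "LOAD_WH",
  }

  # Overwrite the staging table in Snowflake with the cleaned data.
  (
      cleaned.write.format("net.snowflake.spark.snowflake")
      .options(**sf_options)
      .option("dbtable", "ORDERS_STG")
      .mode("overwrite")
      .save()
  )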


Required Qualifications:

  • Certifications:
    • SnowPro Core or SnowPro Advanced certification (e.g., SnowPro Advanced: Architect or SnowPro Advanced: Data Engineer)
    • Databricks Certified Associate Developer for Apache Spark (latest version preferred)
  • Experience:
    • 3+ years of experience working with Snowflake, including schema design, query optimization, and Snowpipe/Streams/Tasks.
    • 2+ years of hands-on development with Apache Spark (PySpark, Scala, or Java) in Databricks or open-source environments.
    • Strong understanding of distributed computing, data lakes, and modern data architectures.
  • Technical Skills:
    • Proficiency in SQL, Spark (RDD/DataFrame APIs), and Python or Scala.
    • Experience with cloud platforms (AWS, Azure, or GCP), especially integrating Snowflake and Databricks.
    • Familiarity with data modeling, data quality practices, and orchestration tools (e.g., Airflow, Prefect); a minimal orchestration sketch follows this list.
    • Knowledge of CI/CD pipelines and version control (e.g., Git, GitHub Actions).
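
To illustrate the orchestration skills above, here is a minimal Airflow DAG sketch that sequences an extract step ahead of a Snowflake load. The DAG id, schedule, and task callables are placeholders; the schedule argument assumes Airflow 2.4 or newer (older versions use schedule_interval).

  # Illustrative sketch only: DAG id, schedule, and task bodies are placeholders.
  from datetime import datetime

  from airflow import DAG
  from airflow.operators.python import PythonOperator

  def extract_to_stage():
      """Placeholder: pull data from a source API/RDBMS and land files in cloud storage."""

  def load_to_snowflake():
      """Placeholder: trigger the Spark or Snowpipe load into Snowflake."""

  with DAG(
      dag_id="orders_daily_pipeline",
      start_date=datetime(2024, 1, 1),
      schedule="@daily",  # Airflow 2.4+; use schedule_interval on older versions
      catchup=False,
  ) as dag:
      extract = PythonOperator(task_id="extract_to_stage", python_callable=extract_to_stage)
      load = PythonOperator(task_id="load_to_snowflake", python_callable=load_to_snowflake)

      # Run the Snowflake load only after extraction succeeds.
      extract >> load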

Education

Any graduate (bachelor's degree in any discipline)