Description

Key Responsibilities

Design, develop, and maintain robust, scalable ETL/ELT pipelines using Apache Spark and Snowflake.
Leverage Databricks for data processing, transformation, and analytics in distributed environments.
Develop efficient SQL and Spark applications to process and analyze large volumes of data.
Implement and maintain data warehousing solutions using Snowflake with best practices for performance, cost, and security.
Collaborate with data scientists, analysts, and business stakeholders to understand and meet their data needs.
Ensure data quality and integrity through unit testing, data validation, and monitoring.
Optimize and troubleshoot Spark jobs, SQL queries, and Snowflake data workflows.
Integrate with various data sources (cloud storage, APIs, RDBMS) and tools (Airflow, dbt, etc.).
Apply data governance and compliance policies in data pipeline design and execution.

Required Qualifications

Certifications:
SnowPro Core or SnowPro Advanced certification (e.g., SnowPro Advanced: Architect or SnowPro Advanced: Data Engineer)
Databricks Certified Associate Developer for Apache Spark (latest version preferred)
Experience:
3+ years of experience working with Snowflake, including schema design, query optimization, and Snowpipe/Streams/Tasks.
2+ years of hands-on development with Apache Spark (PySpark, Scala, or Java) in Databricks or open-source environments.
Strong understanding of distributed computing, data lakes, and modern data architectures.
Technical Skills:
Proficiency in SQL, Spark (RDD/DataFrame APIs), and Python or Scala
Experience with cloud platforms (AWS, Azure, or GCP), especially integrating Snowflake and Databricks
Familiarity with data modeling, data quality, and orchestration tools (e.g., Airflow, Prefect)
Knowledge of CI/CD pipelines and version control (e.g., Git, GitHub Actions)

Education

Any Graduate (bachelor's degree in any discipline)