- Design and develop ETL/data pipelines using Databricks and Apache Spark.
- Optimize and manage Spark-based workloads for scalability and performance.
- Work with structured and unstructured data from multiple sources.
- Implement Delta Lake for data reliability and ACID transactions, and Unity Catalog for centralized governance and cataloging in Databricks (a minimal sketch follows this list).
- Develop and maintain SQL-based transformations, queries, and performance tuning.
- Collaborate with data engineers, analysts, and business teams to meet data requirements.
- Implement job orchestration using Airflow, Databricks Workflows, or other scheduling tools.
- Ensure data security, governance, and compliance best practices.
- Monitor, debug, and resolve performance bottlenecks in Databricks jobs.
- Work with cloud storage solutions (AWS S3).
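For illustration only: a minimal PySpark sketch of the kind of Delta Lake pipeline work described above, assuming a Databricks cluster where `spark` is preconfigured; the S3 path and table name are hypothetical placeholders.

```python
from pyspark.sql import functions as F

# Read raw JSON events from cloud storage (bronze layer).
# The bucket path is a hypothetical placeholder.
raw = spark.read.json("s3://example-bucket/raw/events/")

# Light transformation: typed columns plus a load timestamp (silver layer).
silver = (
    raw.select(
        F.col("event_id").cast("string"),
        F.col("event_ts").cast("timestamp"),
        F.col("payload"),
    )
    .withColumn("ingested_at", F.current_timestamp())
)

# Write as a Delta table to get ACID guarantees and schema enforcement;
# the catalog/schema/table name is a hypothetical placeholder.
(
    silver.write.format("delta")
    .mode("append")
    .saveAsTable("main.analytics.events_silver")
)
```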
Qualifications:
- 4 to 6 years of experience as a Data Engineer.
Required Skills & Experience:
- Strong experience in Databricks and AWS.
- Hands-on experience with Python, PySpark, Scala, and SQL.
- Experience with Spark performance tuning and optimization.
- Knowledge of Delta Lake, Lakehouse Architecture, and Medallion Architecture.
- Familiarity with orchestration tools (Airflow, Databricks Workflows).
- Hands-on experience with data modeling and transformation techniques.
- Experience with the AWS cloud platform.
- Proficiency in CI/CD for Databricks using Git and DevOps tools.
- Strong understanding of data security, governance, and access control in Databricks.
- Good knowledge of APIs, REST services, and integrating Databricks with external systems.
Preferred Qualifications:
- Databricks Certification (Databricks Certified Associate/Professional).
- Knowledge of BI tools such as Power BI, Looker, or ThoughtSpot.
- Bachelor's or Master's degree in Computer Science, Information Systems, Engineering, or an equivalent field.