- Databricks Platform: Act as a subject matter expert for the Databricks platform within the Digital Capital team, providing technical guidance, best practices, and innovative solutions.
- Databricks Workflows and Orchestration: Design and implement complex data pipelines orchestrated with Databricks Workflows together with Azure Data Factory or Qlik Replicate.
- End-to-End Data Pipeline Development: Design, develop, and implement highly scalable and efficient ETL/ELT processes using Databricks notebooks (Python/Spark or SQL) and other Databricks-native tools.
- Delta Lake Expertise: Use Delta Lake to build reliable data lake architectures, applying ACID transactions, schema enforcement, and time travel, and optimizing data storage for performance (see the illustrative sketch after this list).
- Spark Optimization: Optimize Spark jobs and queries for performance and cost efficiency within the Databricks environment. Demonstrate a deep understanding of Spark architecture, partitioning, caching, and shuffle operations.
- Data Governance and Security: Implement and enforce data governance policies, access controls, and security measures within the Databricks environment using Unity Catalog and other Databricks security features.
- Collaborative Development: Work closely with data scientists, data analysts, and business stakeholders to understand data requirements and translate them into Databricks-based data solutions.
- Monitoring and Troubleshooting: Establish and maintain monitoring, alerting, and logging for Databricks jobs and clusters, proactively identifying and resolving data pipeline issues.
- Code Quality and Best Practices: Champion best practices for Databricks development, including version control (Git), code reviews, testing frameworks, and documentation.
- Performance Tuning: Continuously identify and implement performance improvements for existing Databricks data pipelines and data models.
- Cloud Integration: Integrate Databricks with other cloud services (e.g., Azure Data Lake Storage Gen2, Azure Synapse Analytics, Azure Key Vault) to deliver a seamless data ecosystem.
- Traditional Data Warehousing & SQL: Design, develop, and maintain schemas and ETL processes for traditional enterprise data warehouses. Demonstrate expert-level proficiency in SQL for complex data manipulation, querying, and optimization within relational database systems.
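
The following minimal sketch illustrates the kind of Delta Lake pipeline work described above. It assumes a Databricks notebook where `spark` is predefined; the storage path, the `sales.orders` table, and all column names are hypothetical placeholders.

```python
# Minimal Delta Lake ETL sketch. Assumes a Databricks notebook (`spark` predefined)
# with Delta Lake available; the path, table, and column names are hypothetical.
from delta.tables import DeltaTable
from pyspark.sql import functions as F

# Incremental extract from a raw landing zone in ADLS Gen2.
updates = (
    spark.read.format("json")
    .load("abfss://landing@examplestorage.dfs.core.windows.net/orders/")
    .withColumn("ingested_at", F.current_timestamp())
)

# ACID upsert (MERGE) into a managed Delta table; Delta's schema enforcement
# rejects writes whose columns do not match the target table's schema.
target = DeltaTable.forName(spark, "sales.orders")
(
    target.alias("t")
    .merge(updates.alias("s"), "t.order_id = s.order_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)

# Time travel: read an earlier version of the table, e.g. for reconciliation.
previous = spark.read.option("versionAsOf", 0).table("sales.orders")

# Storage optimization: compact small files and co-locate a commonly filtered column.
spark.sql("OPTIMIZE sales.orders ZORDER BY (order_date)")
```

Using MERGE keeps the load idempotent, which simplifies reruns when upstream files arrive late.
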
Qualifications:
- Bachelor's degree in Computer Science, Engineering, Information Technology, or a related quantitative field.
- Minimum of 6 years of relevant experience in data engineering, with a significant portion dedicated to building and managing data solutions.
- Demonstrable expert-level proficiency with Databricks, including:
  - Extensive experience with Spark (PySpark, Spark SQL) for large-scale data processing.
  - Deep understanding and practical application of Delta Lake.
  - Hands-on experience with Databricks Notebooks, Jobs, and Workflows.
  - Experience with Unity Catalog for data governance and security (see the governance sketch after this list).
  - Proficiency in optimizing Databricks cluster configurations and Spark job performance.
- Strong programming skills in Python.
- Expert-level SQL proficiency with a strong understanding of relational databases, data warehousing concepts, and data modeling techniques (e.g., Kimball, Inmon).
- Solid understanding of relational and NoSQL databases.
- Experience with cloud platforms (preferably Azure, but AWS or GCP with Databricks experience is also valuable).
- Excellent problem-solving, analytical, and communication skills.
- Ability to work independently and collaboratively in a fast-paced environment.
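
For the Unity Catalog item above, a minimal governance sketch is shown below. It assumes a Unity Catalog-enabled workspace and a Databricks notebook where `spark` and `display` are predefined; the catalog, schema, table, and group names are hypothetical.

```python
# Minimal Unity Catalog governance sketch; all object and group names are hypothetical.
# Assumes sufficient privileges to create catalogs and schemas.

# Unity Catalog uses a three-level namespace: catalog.schema.table.
spark.sql("CREATE CATALOG IF NOT EXISTS finance")
spark.sql("CREATE SCHEMA IF NOT EXISTS finance.reporting")

# Least-privilege grants to an account-level group.
spark.sql("GRANT USE CATALOG ON CATALOG finance TO `data_analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA finance.reporting TO `data_analysts`")
spark.sql("GRANT SELECT ON TABLE finance.reporting.orders TO `data_analysts`")

# Review effective grants as part of routine access audits.
display(spark.sql("SHOW GRANTS ON SCHEMA finance.reporting"))
```
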
Mandatory Skills:
- Azure Databricks
- Azure Data Factory (ADF)
- GitHub
- SQL
- PySpark
- CI/CD
- Data Modelling - 3NF and Dimensional
- Lakehouse/Medallion Architecture (see the medallion sketch after this list)
- ADLS Gen2
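
A minimal medallion (bronze/silver/gold) sketch tying together several of the mandatory skills above, assuming a Databricks notebook with `spark` predefined, ADLS Gen2 access, and a hypothetical `lakehouse` catalog; all paths, tables, and columns are placeholders.

```python
# Minimal bronze -> silver -> gold sketch; paths, tables, and columns are hypothetical.
from pyspark.sql import functions as F

# Bronze: land raw CSV files from ADLS Gen2 as-is, with load metadata.
bronze = (
    spark.read.format("csv").option("header", "true")
    .load("abfss://raw@examplestorage.dfs.core.windows.net/customers/")
    .withColumn("_loaded_at", F.current_timestamp())
)
bronze.write.format("delta").mode("append").saveAsTable("lakehouse.bronze.customers")

# Silver: cleanse and conform (deduplicate, type, apply basic quality rules).
silver = (
    spark.table("lakehouse.bronze.customers")
    .dropDuplicates(["customer_id"])
    .withColumn("signup_date", F.to_date("signup_date"))
    .filter(F.col("customer_id").isNotNull())
)
silver.write.format("delta").mode("overwrite").saveAsTable("lakehouse.silver.customers")

# Gold: business-level aggregate ready for reporting or a dimensional model.
gold = (
    spark.table("lakehouse.silver.customers")
    .groupBy("country")
    .agg(F.countDistinct("customer_id").alias("active_customers"))
)
gold.write.format("delta").mode("overwrite").saveAsTable("lakehouse.gold.customers_by_country")
```
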
Nice to Have Skills:
- Scala
- Qlik Replicate
- Talend Data Integration
- AWS Aurora Postgres/Redshift/S3
- DevOps
- Agile
- ServiceNow
Preferred Qualifications:
- Databricks Certifications (e.g., Databricks Certified Data Engineer Associate/Professional).
- Experience with CI/CD pipelines for data engineering projects.