Description

  • Databricks Platform: Act as a subject matter expert for the Databricks platform within the Digital Capital team, providing technical guidance, best practices, and innovative solutions.
  • Databricks Workflows and Orchestration: Design and orchestrate complex data pipelines using Databricks Workflows, Azure Data Factory, or Qlik Replicate.
  • End-to-End Data Pipeline Development: Design, develop, and implement highly scalable and efficient ETL/ELT processes using Databricks notebooks (Python/Spark or SQL) and other Databricks-native tools.
  • Delta Lake Expertise: Utilize Delta Lake to build a reliable data lake architecture, implementing ACID transactions, schema enforcement, and time travel, and optimizing data storage for performance (a brief illustrative sketch follows this list).
  • Spark Optimization: Optimize Spark jobs and queries for performance and cost efficiency within the Databricks environment. Demonstrate a deep understanding of Spark architecture, partitioning, caching, and shuffle operations.
  • Data Governance and Security: Implement and enforce data governance policies, access controls, and security measures within the Databricks environment using Unity Catalog and other Databricks security features.
  • Collaborative Development: Work closely with data scientists, data analysts, and business stakeholders to understand data requirements and translate them into Databricks-based data solutions.
  • Monitoring and Troubleshooting: Establish and maintain monitoring, alerting, and logging for Databricks jobs and clusters, proactively identifying and resolving data pipeline issues.
  • Code Quality and Best Practices: Champion best practices for Databricks development, including version control (Git), code reviews, testing frameworks, and documentation.
  • Performance Tuning: Continuously identify and implement performance improvements for existing Databricks data pipelines and data models.
  • Cloud Integration: Integrate Databricks with other cloud services (e.g., Azure Data Lake Storage Gen2, Azure Synapse Analytics, Azure Key Vault) to create a seamless data ecosystem.
  • Traditional Data Warehousing & SQL: Design, develop, and maintain schemas and ETL processes for traditional enterprise data warehouses. Demonstrate expert-level proficiency in SQL for complex data manipulation, querying, and optimization within relational database systems.
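
  For illustration only, a minimal sketch of the kind of Delta Lake pipeline work described above, assuming a Databricks notebook where spark is predefined; the storage path, schema, and table names are hypothetical:

      # Minimal sketch (hypothetical path, schema, and table names);
      # "spark" is the session Databricks provides in a notebook.
      from pyspark.sql import functions as F

      raw = (spark.read.format("json")
             .load("abfss://landing@examplestorage.dfs.core.windows.net/orders/"))

      spark.sql("CREATE SCHEMA IF NOT EXISTS bronze")

      (raw.withColumn("ingested_at", F.current_timestamp())
          .write.format("delta")
          .mode("append")              # ACID append; Delta enforces the table schema by default
          .saveAsTable("bronze.orders"))

      # Time travel: query an earlier version of the table for auditing or debugging.
      previous = spark.sql("SELECT * FROM bronze.orders VERSION AS OF 0")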

Qualifications:

  • Bachelor's degree in Computer Science, Engineering, Information Technology, or a related quantitative field.
  • A minimum of 6 years of relevant experience in data engineering, with a significant portion dedicated to building and managing data solutions.
  • Demonstrable expert-level proficiency with Databricks, including:
      o Extensive experience with Spark (PySpark, Spark SQL) for large-scale data processing.
      o Deep understanding and practical application of Delta Lake.
      o Hands-on experience with Databricks Notebooks, Jobs, and Workflows.
      o Experience with Unity Catalog for data governance and security.
      o Proficiency in optimizing Databricks cluster configurations and Spark job performance.
  • Strong programming skills in Python.
  • Expert-level SQL proficiency with a strong understanding of relational databases, data warehousing concepts, and data modeling techniques (e.g., Kimball, Inmon).
  • Solid understanding of relational and NoSQL databases.
  • Experience with cloud platforms (preferably Azure, but AWS or GCP with Databricks experience is also valuable).
  • Excellent problem-solving, analytical, and communication skills.
  • Ability to work independently and collaboratively in a fast-paced environment.

Mandatory Skills

  • Azure Databricks
  • Azure Data Factory (ADF)
  • GitHub
  • SQL
  • PySpark
  • CI/CD
  • Data Modelling - 3NF and Dimensional
  • Lakehouse/Medallion Architecture
  • ADLS Gen2

Nice to have skills

  • Scala
  • Qlik Replicate
  • Talend Data Integration
  • AWS Aurora Postgres/Redshift/S3
  • DevOps
  • Agile
  • ServiceNow

Preferred Qualifications:

  • Databricks Certifications (e.g., Databricks Certified Data Engineer Associate/Professional).
  • Experience with CI/CD pipelines for data engineering projects.

Education

Bachelor's degree