Technical Skills:
• Strong expertise in Databricks (Delta Lake, Unity Catalog, Lakehouse Architecture, Table Triggers, Delta Live Tables pipelines, Databricks Runtime, etc.).
• Proficiency in Azure Cloud Services.
• Solid understanding of Spark and PySpark for big data processing.
• Experience with relational databases.
• Knowledge of Databricks Asset Bundles and GitLab.
Key Responsibilities:
1. Data Pipeline Development:
• Build and maintain scalable ETL/ELT pipelines using Databricks.
• Leverage PySpark/Spark and SQL to transform and process large datasets.
• Integrate data from multiple sources, including Azure Blob Storage, ADLS, and relational and non-relational systems.
2. Collaboration & Analysis:
• Work closely with multiple teams to prepare data for dashboards and BI tools.
• Collaborate with cross-functional teams to understand business requirements and deliver tailored data solutions.
3. Performance & Optimization:
• Optimize Databricks workloads for cost efficiency and performance.
• Monitor and troubleshoot data pipelines to ensure reliability and accuracy.
4. Governance & Security:
• Implement and manage data security, access controls and governance standards using Unity Catalog.
• Ensure compliance with organizational and regulatory data policies.
5. Deployment:
• Leverage Databricks Asset Bundles for seamless deployment of Databricks jobs, notebooks and configurations across environments.
• Manage version control for Databricks artifacts and collaborate with the team to maintain development best practices.
Preferred Experience:
• Familiarity with Databricks Runtimes and advanced configurations.
• Knowledge of streaming frameworks such as Spark Structured Streaming.
• Experience in developing real-time data solutions.
Certifications:
• Azure Data Engineer Associate or Databricks Certified Data Engineer Associate certification (optional).
Education:
• Any graduate.