Responsibilities
Work with data architects to understand current data models and build pipelines for data ingestion and transformation.
Design, build, and maintain a framework for pipeline observability and monitoring, focusing on job reliability and performance.
Ensure adherence to data governance, security, and compliance best practices.
Monitor data pipeline performance and reliability.
Work in an Agile/Scrum environment, participating in sprint planning, reviews, and continuous improvement initiatives.
Qualifications
5+ years of experience building and maintaining Databricks solutions: Delta Live Tables, Unity Catalog, Data Lakehouse, and the Medallion Architecture.
Ability to surface data integration errors to the appropriate teams, focusing on:
Timely processing of new data
Data pipeline performance
Integrity and quality of source data
Hands-on experience building data-lake-style infrastructures using streaming data technologies, particularly Apache Kafka.
Development experience utilizing:
Python (Pandas/NumPy, Boto3, SimpleSalesforce)
Databricks (PySpark, pySQL, DLT)
Apache Spark
Terraform
Familiarity with data security, compliance, and governance frameworks (e.g., GDPR, HIPAA).
Strong problem-solving and analytical skills, with the ability to troubleshoot data pipeline issues effectively.
Self-starter who thrives in a fast-paced, collaborative environment, working closely with multiple teams.
Excellent communication and stakeholder management skills, with the ability to translate business requirements into technical solutions.
Bachelor's degree in any discipline.