Job Description
- Databricks: PySpark, Python jobs, SQL
- Unity Catalog
- Collibra DQ
- AWS services, e.g. Glue, S3, Redshift, Lambda
- Design data solutions on Databricks, including Delta Lake, data warehouses, data marts, and other data solutions, to support the analytics needs of the organization.
- Apply best practices when designing data models (logical and physical) and ETL pipelines (streaming and batch) using cloud-based services, especially Python and PySpark.
- Design, develop, and manage data pipelines (collection, storage, access), data engineering (data quality, ETL, data modeling), and data understanding (documentation, exploration).
- Work with stakeholders to understand the data landscape, conduct discovery exercises, develop proofs of concept, and demonstrate them to stakeholders.
- Experience with Collibra for data quality (DQ) and data governance is a plus.
- Knowledge of dbt to model and build out layers in Databricks is a plus.
- AWS experience with S3, Redshift, Glue, etc. is also required.