Job Description:
Job Responsibilities:
- Collaborate with architects, engineers, and business users on product design and features.
- Develop ETL pipelines that bring data from source systems into staging AWS S3 buckets (see the ingestion sketch after this list).
- Develop a data ingestion framework that loads the staged data into raw tables using PySpark programs.
- Understand the data model and organize the raw data into Hub structures (see the Hub-load sketch after this list).
- Design ETL pipelines using PySpark, Scala, and SnowSQL to transform data into the Data Vault 2.0 structure of Hubs and Satellites.
- Implement data quality and SOX controls, and schedule batch jobs using Airflow and Autosys (see the Airflow sketch after this list).
- Design test cases and apply a range of testing techniques to validate the development work.
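
A minimal PySpark sketch of the staging ingestion step described above, assuming a hypothetical JDBC source and a placeholder S3 bucket (example-staging-bucket); the connection details, table, and paths are illustrative, not the actual project configuration.

    # Hypothetical ingestion sketch: pull a source table over JDBC and land it
    # in a staging S3 bucket as Parquet. URLs, credentials, and paths are placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("staging-ingestion").getOrCreate()

    # Read from a relational source system (placeholder connection details).
    source_df = (
        spark.read.format("jdbc")
        .option("url", "jdbc:postgresql://source-host:5432/sales")
        .option("dbtable", "public.orders")
        .option("user", "etl_user")
        .option("password", "etl_password")
        .load()
    )

    # Stamp each record with the load date and write it to the staging bucket.
    (
        source_df.withColumn("load_date", F.current_date())
        .write.mode("append")
        .partitionBy("load_date")
        .parquet("s3a://example-staging-bucket/raw/orders/")
    )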
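
A hedged sketch of organizing raw records into a Data Vault 2.0 Hub with PySpark, assuming a hypothetical customer entity whose business key is customer_id; the hash-key and metadata columns follow a common Data Vault convention rather than the project's actual model.

    # Hypothetical Hub-load sketch: deduplicate business keys from a raw table
    # and add typical Data Vault 2.0 metadata columns. Names are illustrative.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("hub-load").getOrCreate()

    raw_customers = spark.read.parquet("s3a://example-staging-bucket/raw/customers/")

    hub_customer = (
        raw_customers.select("customer_id")
        .dropDuplicates(["customer_id"])
        # Hash of the business key, used as the Hub's surrogate key.
        .withColumn("hub_customer_hk", F.sha2(F.col("customer_id").cast("string"), 256))
        .withColumn("load_date", F.current_timestamp())
        .withColumn("record_source", F.lit("source_system.customers"))
    )

    hub_customer.write.mode("append").parquet("s3a://example-vault-bucket/hub_customer/")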
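
A minimal Airflow sketch of how such batch jobs might be chained and scheduled; the DAG id, schedule, and spark-submit commands are placeholders, and any Autosys jobs would be defined separately in that tool.

    # Hypothetical Airflow DAG sketch chaining the ingestion, Data Vault load,
    # and data quality steps. DAG id, schedule, and commands are placeholders.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="example_etl_pipeline",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        ingest_to_staging = BashOperator(
            task_id="ingest_to_staging",
            bash_command="spark-submit ingest_to_staging.py",
        )
        load_data_vault = BashOperator(
            task_id="load_data_vault",
            bash_command="spark-submit load_hubs_and_satellites.py",
        )
        run_quality_checks = BashOperator(
            task_id="run_quality_checks",
            bash_command="spark-submit data_quality_checks.py",
        )

        # Staging load runs first, then the Data Vault load, then quality checks.
        ingest_to_staging >> load_data_vault >> run_quality_checks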
Education:
Bachelor’s degree in Computer Science, Computer Information Systems, or a closely related engineering field, plus some related work experience.