Description

Key Skills: The role requires specific skills in developing CDM (Conceptual Data Model), LDM (Logical Data Model), and PDM (Physical Data Model) designs, along with data modeling, integration, data migration, and ETL development on the AWS platform.

Key Responsibilities

Architect and implement a scalable DataHub solution on AWS using best practices for data ingestion, transformation, storage, and access control

Define data models, data lineage, and data quality standards for the DataHub

Select appropriate AWS services (S3, Glue, Redshift, Athena, Lambda) based on data volume, access patterns, and performance requirements

Design and build data pipelines to extract, transform, and load data from various sources (databases, APIs, flat files) into the DataHub using AWS Glue, AWS Batch, or custom ETL processes

Implement data cleansing and normalization techniques to ensure data quality

Manage data ingestion schedules and error-handling mechanisms

Required Skills and Experience

AWS Expertise: Deep understanding of AWS data services, including S3, Glue, Redshift, Athena, Lake Formation, Step Functions, CloudWatch, and EventBridge

Data Modeling: Proficiency in designing dimensional and snowflake data models for data warehousing and data lakes

Data Engineering Skills: Experience with ETL/ELT processes, data cleansing, data transformation, and data quality checks. Experience with Informatica IICS and ICDQ is a plus

Programming Languages: Proficiency in Python, SQL, and potentially PySpark for data processing and manipulation

Data Governance: Knowledge of data governance best practices, including data classification, access control, and data lineage tracking

Skills

Mandatory Skills : GCP Data Architecture, MDM Conceptual, Data Architecture, Data lakehouse Architecture, Dimensional Data Modeling

Education

Any Graduate