Key Skills
The role requires specific skills in developing the CDM (Conceptual Data Model), LDM (Logical Data Model), and PDM (Physical Data Model), along with data modeling, integration, data migration, and ETL development on the AWS platform.
Key Responsibilities
Architect and implement a scalable data hub solution on AWS using best practices for data ingestion, transformation, storage, and access control
Define data models, data lineage, and data quality standards for the DataHub
Select appropriate AWS services (S3, Glue, Redshift, Athena, Lambda) based on data volume, access patterns, and performance requirements
Design and build data pipelines to extract, transform, and load data from various sources (databases, APIs, flat files) into the DataHub using AWS Glue, AWS Batch, or custom ETL processes
Implement data cleansing and normalization techniques to ensure data quality
Manage data ingestion schedules and error-handling mechanisms
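The cleansing and pipeline responsibilities above can be illustrated with a minimal Python sketch. The record fields, formats, and rules here are hypothetical examples of this kind of transform step, not the team's actual pipeline:

```python
from datetime import datetime

def _to_iso(value):
    """Normalize mixed source date formats to ISO 8601, or None if unparseable."""
    if not value:
        return None
    for fmt in ("%Y-%m-%d", "%m/%d/%Y", "%d-%b-%Y"):
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None  # unparseable dates become nulls rather than loading bad values

def cleanse_record(raw):
    """Apply simple cleansing rules to one source record.

    Returns a normalized dict, or None if the record fails a quality check.
    """
    # Quality check: a record without a key cannot be loaded into the hub.
    if not raw.get("customer_id", "").strip():
        return None
    return {
        "customer_id": raw["customer_id"].strip(),
        # Lowercase emails so joins and deduplication behave consistently.
        "email": raw.get("email", "").strip().lower() or None,
        "signup_date": _to_iso(raw.get("signup_date")),
    }

records = [
    {"customer_id": " C001 ", "email": "Ann@Example.COM", "signup_date": "03/15/2024"},
    {"customer_id": "", "email": "x@y.com", "signup_date": "2024-01-01"},
]
cleansed = [r for r in (cleanse_record(x) for x in records) if r]
```

In a production pipeline this logic would typically run inside an AWS Glue or PySpark job rather than plain Python, with rejected records routed to an error path instead of silently dropped.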
Required Skills and Experience
AWS Expertise: Deep understanding of AWS data services, including S3, Glue, Redshift, Athena, Lake Formation, Step Functions, CloudWatch, and EventBridge
Data Modeling: Proficiency in designing dimensional and snowflake data models for data warehousing and data lakes
Data Engineering Skills: Experience with ETL/ELT processes, data cleansing, data transformation, and data quality checks. Experience with Informatica IICS and ICDQ is a plus
Programming Languages: Proficiency in Python, SQL, and potentially PySpark for data processing and manipulation
Data Governance: Knowledge of data governance best practices, including data classification, access control, and data lineage tracking
Skills
Mandatory Skills: GCP Data Architecture, MDM Conceptual, Data Architecture, Data Lakehouse Architecture, Dimensional Data Modeling
Any Graduate