Description

Roles & Responsibilities:-
The Position is for a skilled and detail-oriented Data Lakehouse Engineer with a strong background in Enterprise Data Office (EDO) principles, who will design, build, and maintain next-generation data lakehouse platforms. The role combines technical excellence with data governance and compliance, supporting enterprise-wide data strategies and analytics initiatives.

 

The Position will play a key role in the following assignments:-
• Building Snowflake external tables over data stored in Apache Iceberg format on AWS S3 (a minimal sketch follows this list).
• Using an existing pipeline to migrate data from S3 in one AWS account to another AWS account, where the data is stored in Apache Iceberg tables on S3.
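For illustration only, the sketch below shows one way the first assignment could look, using the Snowflake Python connector to register an Iceberg table over data already sitting in S3. The external volume, Glue catalog integration, and all object names are assumptions rather than project specifics, and the exact DDL options depend on how the Snowflake account is configured.

    import snowflake.connector

    # Connection details are placeholders, not real project values.
    conn = snowflake.connector.connect(
        account="my_account",
        user="my_user",
        password="my_password",
        warehouse="LAKEHOUSE_WH",
        database="LAKEHOUSE_DB",
        schema="RAW",
    )

    # Register a Snowflake Iceberg table over data already written to S3.
    # The external volume and Glue catalog integration are assumed to have
    # been created beforehand by an account administrator; all object names
    # here are hypothetical.
    ddl = """
    CREATE ICEBERG TABLE RAW.ORDERS
      EXTERNAL_VOLUME = 'S3_ICEBERG_VOLUME'
      CATALOG = 'GLUE_CATALOG_INTEGRATION'
      CATALOG_TABLE_NAME = 'orders'
    """

    cur = conn.cursor()
    try:
        cur.execute(ddl)
    finally:
        cur.close()
        conn.close()

The external volume and catalog integration are one-time account-level setup; the recurring engineering work is the table-level DDL plus the pipelines that keep the underlying Iceberg data current.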

 

Other Responsibilities:-
• Design and implement scalable data lakehouse architectures using modern open table formats such as Apache Iceberg.
• Build and manage Snowflake external tables referencing Apache Iceberg tables stored on AWS S3.
• Develop and maintain data ingestion, transformation, and curation pipelines supporting structured and semi-structured data (see the sketch after this list).
• Collaborate with EDO, data governance, and business teams to align on standards, lineage, classification, and data quality enforcement.
• Implement data catalogs, metadata management, lineage tracking, and enforce role-based access controls.
• Ensure data is compliant with privacy and security policies across ingestion, storage, and consumption layers.
• Optimize cloud storage, compute usage, and pipeline performance for cost-effective operations.
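As a rough illustration of the ingestion and curation work described above, the following PySpark sketch reads semi-structured JSON from S3 and appends it to an Apache Iceberg table registered in the AWS Glue catalog. The catalog name, bucket paths, column names, and table names are hypothetical.

    from pyspark.sql import SparkSession

    # Catalog name, bucket paths, column names, and table names below are
    # hypothetical; Iceberg and Glue settings would normally come from the
    # cluster or job configuration rather than hard-coded values.
    spark = (
        SparkSession.builder.appName("json_to_iceberg")
        .config("spark.sql.catalog.glue", "org.apache.iceberg.spark.SparkCatalog")
        .config("spark.sql.catalog.glue.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
        .config("spark.sql.catalog.glue.warehouse", "s3://example-lakehouse/warehouse/")
        .getOrCreate()
    )

    # Read raw semi-structured JSON events from the landing bucket.
    raw = spark.read.json("s3://example-landing/events/")

    # Light curation step: keep only the columns the downstream model needs.
    curated = raw.select("event_id", "event_type", "payload", "event_ts")

    # Append into an Iceberg table that is assumed to exist already; Iceberg
    # applies the table's partition spec and supports schema evolution.
    curated.writeTo("glue.curated.events").append()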

 

Required Skills:-
• Proven experience building Snowflake external tables over Apache Iceberg tables on AWS S3.
• Strong understanding of Iceberg table concepts like schema evolution, partitioning, and time travel.
• Proficiency in Python, SQL, and Spark or other distributed data processing frameworks.
• Solid understanding of AWS data services including S3, Glue, Lake Formation, and IAM.
• Hands-on experience with data governance tools (e.g., Collibra, Alation, Informatica) and EDO-aligned processes.
• Experience implementing metadata, lineage, and data classification standards.

 

Good to Have Skills:-
• Experience in the insurance domain.
• Experience migrating JSON files to Apache Iceberg tables on AWS S3 using configuration-driven rules (a minimal sketch follows this list).
• Familiarity with ETL orchestration tools such as Apache Airflow, dbt, or AWS Step Functions.
• Prior exposure to master data management (MDM), data quality frameworks, or stewardship platforms.
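As a hypothetical illustration of configuration-driven migration rules, the sketch below maps JSON prefixes in a source account's bucket to target Iceberg tables and loops over the rules with Spark. The rule format, catalog name, and bucket paths are assumptions, not a prescribed standard.

    from pyspark.sql import SparkSession

    # Hypothetical rule format: each entry maps a JSON prefix in the source
    # account's bucket to a target Iceberg table in the lakehouse account.
    MIGRATION_RULES = [
        {"source": "s3://source-account-bucket/claims/", "target": "glue.curated.claims"},
        {"source": "s3://source-account-bucket/policies/", "target": "glue.curated.policies"},
    ]

    # Iceberg/Glue catalog settings are assumed to be supplied by the job
    # configuration (as in the earlier sketch), and cross-account S3 access
    # by a bucket policy or assumed role set up outside Spark.
    spark = SparkSession.builder.appName("config_driven_migration").getOrCreate()

    for rule in MIGRATION_RULES:
        df = spark.read.json(rule["source"])
        # createOrReplace() materialises the data as an Iceberg table in the
        # target catalog, replacing any previous contents of that table.
        df.writeTo(rule["target"]).createOrReplace()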

 


Education

BACHELOR OF COMPUTER SCIENCE

Salary

INR 00 - 00