Description

Roles & Responsibilities:-
The Position is for a skilled and detail-oriented Data Lakehouse Engineer with a strong background in Enterprise Data Office (EDO) principles, who will design, build, and maintain next-generation data lakehouse platforms. The role combines technical excellence with data governance and compliance, supporting enterprise-wide data strategies and analytics initiatives.

 

The Position will play a key role in the following assignments:-
• Building Snowflake external tables over data stored in Apache Iceberg format on AWS S3 (a minimal sketch follows this list).
• Using an existing pipeline to migrate data from S3 in one AWS account to another AWS account, where the data is stored in Apache Iceberg tables on S3.
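For illustration only, the sketch below shows one way the first assignment could look, using the Snowflake Python connector to register an Iceberg table over data already sitting in S3. The external volume, Glue catalog integration, and all object names are assumptions rather than project specifics, and the exact DDL options depend on how the Snowflake account is configured.

    import snowflake.connector

    # Connection details are placeholders, not real project values.
    conn = snowflake.connector.connect(
        account="my_account",
        user="my_user",
        password="my_password",
        warehouse="LAKEHOUSE_WH",
        database="LAKEHOUSE_DB",
        schema="RAW",
    )

    # Register a Snowflake Iceberg table over data already written to S3.
    # The external volume and Glue catalog integration are assumed to have
    # been created beforehand by an account administrator; all object names
    # here are hypothetical.
    ddl = """
    CREATE ICEBERG TABLE RAW.ORDERS
      EXTERNAL_VOLUME = 'S3_ICEBERG_VOLUME'
      CATALOG = 'GLUE_CATALOG_INTEGRATION'
      CATALOG_TABLE_NAME = 'orders'
    """

    cur = conn.cursor()
    try:
        cur.execute(ddl)
    finally:
        cur.close()
        conn.close()

The external volume and catalog integration are one-time account-level setup; the recurring engineering work is the table-level DDL plus the pipelines that keep the underlying Iceberg data current.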

 

Other Responsibilities:-
• Design and implement scalable data lakehouse architectures using modern open table formats such as Apache Iceberg.
• Build and manage Snowflake external tables referencing Apache Iceberg tables stored on AWS S3.
• Develop and maintain data ingestion, transformation, and curation pipelines supporting structured and semi-structured data (see the sketch after this list).
• Collaborate with EDO, data governance, and business teams to align on standards, lineage, classification, and data quality enforcement.
• Implement data catalogs, metadata management, lineage tracking, and enforce role-based access controls.
• Ensure data is compliant with privacy and security policies across ingestion, storage, and consumption layers.
• Optimize cloud storage, compute usage, and pipeline performance for cost-effective operations.
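As a rough illustration of the ingestion and curation work described above, the following PySpark sketch reads semi-structured JSON from S3 and appends it to an Apache Iceberg table registered in the AWS Glue catalog. The catalog name, bucket paths, column names, and table names are hypothetical.

    from pyspark.sql import SparkSession

    # Catalog name, bucket paths, column names, and table names below are
    # hypothetical; Iceberg and Glue settings would normally come from the
    # cluster or job configuration rather than hard-coded values.
    spark = (
        SparkSession.builder.appName("json_to_iceberg")
        .config("spark.sql.catalog.glue", "org.apache.iceberg.spark.SparkCatalog")
        .config("spark.sql.catalog.glue.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
        .config("spark.sql.catalog.glue.warehouse", "s3://example-lakehouse/warehouse/")
        .getOrCreate()
    )

    # Read raw semi-structured JSON events from the landing bucket.
    raw = spark.read.json("s3://example-landing/events/")

    # Light curation step: keep only the columns the downstream model needs.
    curated = raw.select("event_id", "event_type", "payload", "event_ts")

    # Append into an Iceberg table that is assumed to exist already; Iceberg
    # applies the table's partition spec and supports schema evolution.
    curated.writeTo("glue.curated.events").append()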

 

Required Skills:-
• Proven experience building Snowflake external tables over Apache Iceberg tables on AWS S3.
• Strong understanding of Iceberg table concepts like schema evolution, partitioning, and time travel.
• Proficiency in Python, SQL, and Spark or other distributed data processing frameworks.
• Solid understanding of AWS data services including S3, Glue, Lake Formation, and IAM.
• Hands-on experience with data governance tools (e.g., Collibra, Alation, Informatica) and EDO-aligned processes.
• Experience implementing metadata, lineage, and data classification standards.

 

Good to Have Skills:-
• Experience in the insurance domain.
• Experience migrating JSON files to Apache Iceberg tables on AWS S3 using configuration-driven rules (a minimal sketch follows this list).
• Familiarity with ETL orchestration tools such as Apache Airflow, dbt, or AWS Step Functions.
• Prior exposure to master data management (MDM), data quality frameworks, or stewardship platforms.
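As a hypothetical illustration of configuration-driven migration rules, the sketch below maps JSON prefixes in a source account's bucket to target Iceberg tables and loops over the rules with Spark. The rule format, catalog name, and bucket paths are assumptions, not a prescribed standard.

    from pyspark.sql import SparkSession

    # Hypothetical rule format: each entry maps a JSON prefix in the source
    # account's bucket to a target Iceberg table in the lakehouse account.
    MIGRATION_RULES = [
        {"source": "s3://source-account-bucket/claims/", "target": "glue.curated.claims"},
        {"source": "s3://source-account-bucket/policies/", "target": "glue.curated.policies"},
    ]

    # Iceberg/Glue catalog settings are assumed to be supplied by the job
    # configuration (as in the earlier sketch), and cross-account S3 access
    # by a bucket policy or assumed role set up outside Spark.
    spark = SparkSession.builder.appName("config_driven_migration").getOrCreate()

    for rule in MIGRATION_RULES:
        df = spark.read.json(rule["source"])
        # createOrReplace() materialises the data as an Iceberg table in the
        # target catalog, replacing any previous contents of that table.
        df.writeTo(rule["target"]).createOrReplace()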

 


Education

BACHELOR OF COMPUTER SCIENCE

Salary

INR 00 - 00