Description

Roles & Responsibilities

• Design, build and implement a scalable data lake architecture using AWS S3 and Lake Formation

• Build and optimize ETL pipelines using AWS Glue, EMR and Spark

• Implement event-driven workflows using EventBridge, SNS and SQS

• Design and query datasets using Athena

• Manage metadata and data lineage using Amazon Neptune DB

• Expose APIs for data subscription and sharing using API Gateway and AWS Lambda

• Automate infrastructure provisioning using Terraform and CloudFormation

• Ensure data security and compliance by implementing robust IAM policies and access controls

• Develop and maintain a self-serve data portal using Angular and integrate it with backend services

Requirements

• Strong experience with AWS services: S3, Lake Formation, Glue, EMR, Athena, Lambda, EventBridge, SNS, SQS

• Proficiency in Apache Spark for distributed data processing

• Hands-on experience with Terraform or CloudFormation

• Familiarity with Amazon Neptune DB for graph-based metadata management

• Strong understanding of data lake architecture, data governance and security best practices

• Proficiency in SQL, Python and Spark for data engineering tasks

Education

Any Graduate