Roles & Responsibilities
• Design, build and implement a scalable data lake architecture using AWS S3 and Lake Formation
• Build and optimize ETL pipelines using AWS Glue, EMR and Spark
• Implement event-driven workflows using EventBridge, SNS and SQS
• Design and query datasets using Athena
• Manage metadata and data lineage using Amazon Neptune
• Expose APIs for data subscription and sharing using API Gateway and AWS Lambda
• Automate infrastructure provisioning using Terraform and CloudFormation
• Ensure data security and compliance by implementing robust IAM policies and access controls
• Develop and maintain a self-serve data portal using Angular and integrate it with backend services
Skills & Qualifications
• Strong experience with AWS services: S3, Lake Formation, Glue, EMR, Athena, Lambda, EventBridge, SNS, SQS
• Proficiency in Apache Spark for distributed data processing
• Hands-on experience with Terraform or CloudFormation
• Familiarity with Amazon Neptune for graph-based metadata management
• Strong understanding of data lake architecture, data governance and security best practices
• Proficiency in SQL, Python and Spark for data engineering tasks
Any Graduate