Job Description:
Technical Skills:
5+ years of experience as an AWS Data Engineer working with AWS S3, Glue Data Catalog, Glue Crawlers, Glue ETL, and Athena.
Write Glue ETL jobs to convert data from AWS RDS for SQL Server and Oracle DB to Parquet format in S3.
Run Glue crawlers to catalog S3 files.
Create a catalog of S3 files for easier querying.
Write SQL queries in Athena.
Define data lifecycle management for S3 files.
Strong experience in developing, debugging, and optimizing Glue ETL jobs using PySpark or Glue Studio.
Ability to connect Glue ETL jobs to AWS RDS (SQL Server and Oracle) for data extraction and to write the transformed data to S3 in Parquet format.
Proficiency in setting up and managing Glue Crawlers to catalog data in S3.
Deep understanding of S3 architecture and best practices for storing large datasets.
Experience in partitioning and organizing data for efficient querying in S3.
Knowledge of the advantages of the Parquet file format for optimized storage and querying.
Expertise in creating and managing the AWS Glue Data Catalog to enable structured and schema-aware querying of data in S3.
Experience with Amazon Athena for writing complex SQL queries and optimizing query performance.
Familiarity with creating views or transformations in Athena for business use cases.
Knowledge of securing data in S3 using IAM policies, S3 bucket policies, and KMS encryption.
Understanding of regulatory requirements (e.g., GDPR) and implementing secure data handling practices.
Non-Technical Skills:
Candidate needs to be a good team player.
Effective interpersonal, team-building, and communication skills.
Ability to communicate complex technology to a non-technical audience in a simple and precise manner.
Any Graduate