Job Description/Responsibilities:
Job Overview:
We are seeking a highly skilled Data Engineer with expertise in migrating data from on-premises Hadoop environments to AWS. The ideal candidate has extensive experience with Amazon Redshift and can lead the design, implementation, and optimization of data workflows on AWS. You will play a key role in our cloud migration efforts, ensuring data integrity, performance, and alignment with best practices.
Key Responsibilities:
Lead the end-to-end migration of data from on-prem Hadoop to AWS, with a primary focus on Amazon Redshift.
Design, build, and manage scalable ETL/ELT pipelines to move data from on-premises systems to cloud environments.
Develop and implement best practices for Amazon Redshift performance tuning, optimization, and maintenance.
Collaborate with data architects, database administrators, and other engineering teams to ensure seamless integration and high availability of data solutions.
Analyze the existing Hadoop data architecture and propose migration strategies to AWS, leveraging AWS-native services such as S3, Glue, EMR, and Lambda.
Create automated data workflows and pipelines that align with business requirements and SLAs.
Ensure data integrity, security, and compliance during the migration process.
Provide leadership and mentorship to junior engineers on Amazon Redshift and cloud data technologies.
Work closely with cross-functional teams to understand data needs, providing guidance on storage, security, and access patterns in AWS.
Skills & Qualifications:
Required:
5+ years of experience in data engineering roles with strong expertise in Amazon Redshift.
Proven experience migrating data from on-prem Hadoop to AWS.
Proficient in designing and optimizing data warehouses on Amazon Redshift, including performance tuning, workload management, and query optimization.
Hands-on experience with AWS services such as S3, Glue, Lambda, EMR, and EC2.
Strong knowledge of ETL/ELT processes and tools (preferably AWS Glue or Apache Spark).
Experience in data pipeline development, automation, and orchestration.
Proficiency in SQL and in Python or another scripting language.
Solid understanding of data security best practices on AWS (IAM, encryption, etc.).
Excellent problem-solving skills and the ability to troubleshoot complex data migration issues.
Preferred:
AWS certification (AWS Certified Solutions Architect, AWS Certified Big Data - Specialty, or AWS Certified Database - Specialty).
Knowledge of containerization and orchestration tools such as Docker and Kubernetes.
Experience working in Agile/Scrum environments.
Educational Background:
Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.