Key Skills: PySpark, AWS, Python, DynamoDB, AWS Lambda, Redshift, NoSQL
Roles and Responsibilities:
- Design, develop, and maintain scalable data pipelines using PySpark and AWS services (see the minimal PySpark sketch after this list).
- Implement and manage data storage solutions using AWS Redshift and DynamoDB.
- Utilize AWS Lambda for serverless data processing and automation tasks.
- Ensure data quality and integrity through rigorous testing and validation processes.
- Collaborate with cross-functional teams to understand data requirements and deliver solutions that meet business needs.
- Monitor and optimize data workflows for performance and cost efficiency.
- Stay updated with the latest industry trends and technologies related to data engineering and AWS services.
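As an illustration of the pipeline work described above, here is a minimal PySpark sketch. The S3 paths, bucket, column names, and app name are hypothetical placeholders, not details from this role; treat it as a sketch of the pattern, not a definitive implementation.

```python
# Minimal PySpark pipeline sketch: read raw data from S3, validate,
# and write curated Parquet for downstream loads (e.g. a Redshift COPY).
# All paths and column names below are assumptions for illustration.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-pipeline").getOrCreate()

# Read raw CSV data from S3 (bucket and prefix are placeholders).
orders = spark.read.csv("s3://my-bucket/raw/orders/", header=True, inferSchema=True)

# Basic data-quality steps: drop rows missing the key, deduplicate,
# and stamp each row with an ingestion timestamp.
clean = (
    orders
    .dropna(subset=["order_id"])
    .dropDuplicates(["order_id"])
    .withColumn("ingested_at", F.current_timestamp())
)

# Write partitioned Parquet back to S3 (assumes an order_date column exists).
clean.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://my-bucket/curated/orders/"
)
```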
Skills Required:
- Proficiency in PySpark for building data pipelines (Must-Have)
- Strong experience with AWS services including Lambda, Redshift, and DynamoDB (Must-Have; a Lambda-to-DynamoDB sketch follows this list)
- Solid coding skills in Python (Must-Have)
- Understanding of data modeling, data validation, and pipeline optimization
- Knowledge of NoSQL technologies (Nice-to-Have)
- Familiarity with cloud cost optimization and performance tuning
- Strong analytical thinking, problem-solving, and team collaboration skills
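To make the Lambda/DynamoDB requirement concrete, below is a minimal sketch of a serverless processing step. It assumes an SQS-triggered event shape, a DynamoDB table named "orders", and an order_id partition key; all of these are illustrative assumptions, not specifics of this role.

```python
# Sketch of an AWS Lambda handler that writes incoming records to DynamoDB.
# Table name, event shape (SQS-style "Records"), and item schema are assumptions.
import json
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("orders")  # hypothetical table name


def handler(event, context):
    # Each SQS record body is parsed and stored as one DynamoDB item.
    for record in event.get("Records", []):
        payload = json.loads(record["body"])
        table.put_item(
            Item={
                "order_id": payload["order_id"],          # assumed partition key
                "status": payload.get("status", "received"),
            }
        )
    return {"statusCode": 200}
```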
Education: B.Sc., B.Com., B.E., B.Tech, B.Tech-M.Tech (Dual), or equivalent Bachelor's degree