Job Description
· Manage and optimize data pipelines for a medallion architecture (Landing, Bronze, Silver, Gold) using AWS S3.
· Develop and execute data transformation workflows using Python scripts.
· Design, implement, and maintain scalable data processing solutions using PySpark for reading and writing data across the medallion architecture (a minimal sketch follows this list).
· Collaborate with stakeholders to perform data cuts and variance analysis to support
business insights and decision-making.
· Integrate pipeline outputs with the PostgreSQL application database and optimize its performance (see the JDBC loading sketch at the end of the qualifications).
· Monitor, maintain, and improve the quality and accuracy of data across all layers of the
architecture.
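For concreteness, here is a minimal PySpark sketch of the kind of Bronze-to-Silver promotion on S3 this role involves. It is illustrative only: the bucket (example-lake), dataset (orders), and column names (order_id, order_ts) are hypothetical, and the s3a:// paths assume the hadoop-aws connector is available to Spark.

```python
# Hypothetical sketch: promote raw Bronze records to a cleaned Silver table on S3.
# Bucket, prefixes, and columns are placeholders, not the employer's actual layout.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bronze_to_silver").getOrCreate()

# Bronze: data as landed, persisted but not yet cleaned.
bronze = spark.read.parquet("s3a://example-lake/bronze/orders/")

# Silver: deduplicated, typed, and filtered for downstream use.
silver = (
    bronze
    .dropDuplicates(["order_id"])                        # drop replayed events
    .withColumn("order_ts", F.to_timestamp("order_ts"))  # conform timestamp type
    .withColumn("order_date", F.to_date("order_ts"))     # derive partition column
    .filter(F.col("order_id").isNotNull())               # remove unusable rows
)

# Write back to the Silver layer, partitioned for efficient reads.
(silver.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3a://example-lake/silver/orders/"))
```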
Key Skills and Requirements:
· Proficiency in Python and PySpark with a strong understanding of object-oriented
programming principles.
· Deep knowledge of the AWS ecosystem, including S3, IAM, and related services.
· Experience with medallion architecture for data organization and transformation.
· Strong expertise in data variance analysis, creating data cuts, and deriving actionable insights (illustrated in the sketch after this list).
· Solid understanding of PostgreSQL and database management.
· Familiarity with handling large-scale data processing in distributed environments.
· Analytical mindset with a focus on troubleshooting and problem-solving.
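As a hedged illustration of the variance-analysis skill above, the sketch below computes a simple data cut (revenue by region and month) with month-over-month variance. The Silver path and the column names (region, order_ts, amount) are assumptions for the example.

```python
# Hypothetical sketch: a "data cut" with period-over-period variance in PySpark.
# The Silver path and all column names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("variance_cut").getOrCreate()

silver = spark.read.parquet("s3a://example-lake/silver/orders/")

# Cut: total revenue by region and calendar month.
cut = (silver
       .groupBy("region", F.date_trunc("month", "order_ts").alias("month"))
       .agg(F.sum("amount").alias("revenue")))

# Variance: percent change versus the prior month within each region.
w = Window.partitionBy("region").orderBy("month")
variance = (cut
            .withColumn("prev_revenue", F.lag("revenue").over(w))
            .withColumn("variance_pct",
                        F.round((F.col("revenue") - F.col("prev_revenue"))
                                / F.col("prev_revenue") * 100, 2)))

variance.orderBy("region", "month").show()
```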
Preferred Qualifications:
· Experience with other AWS services (e.g., Lambda, Glue, Redshift) is a plus.
· Strong communication skills to collaborate effectively with cross-functional teams.
· Prior experience designing ETL pipelines in cloud-based environments (a representative final loading step is sketched below).
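To make the PostgreSQL integration and cloud ETL bullets concrete, here is a sketch of a typical final loading step: publishing a Gold-layer aggregate to the application database through Spark's JDBC writer. The connection URL, table name, and credentials are placeholders, and the PostgreSQL JDBC driver is assumed to be on the Spark classpath.

```python
# Hypothetical sketch: load a Gold-layer aggregate into PostgreSQL via JDBC.
# URL, table, and credentials are placeholders; real credentials should come
# from a secret store (e.g., AWS Secrets Manager), never from source code.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("gold_to_postgres").getOrCreate()

gold = spark.read.parquet("s3a://example-lake/gold/revenue_by_region/")

(gold.write
    .format("jdbc")
    .option("url", "jdbc:postgresql://db.example.internal:5432/appdb")
    .option("dbtable", "analytics.revenue_by_region")
    .option("user", "etl_user")
    .option("password", "REDACTED")
    .option("driver", "org.postgresql.Driver")
    .mode("overwrite")
    .save())
```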
Education: Any Graduate.