Job Description

· Manage and optimize data pipelines for the medallion architecture (Landing, Bronze, Silver, Gold) using AWS S3.

· Develop and execute data transformation workflows using Python scripts.

· Design, implement, and maintain scalable data processing solutions using PySpark for reading and writing data across the medallion architecture.

· Collaborate with stakeholders to perform data cuts and variance analysis to support business insights and decision-making.

· Ensure seamless integration and performance optimization of the PostgreSQL application database.

· Monitor, maintain, and improve the quality and accuracy of data across all layers of the architecture.

Key Skills and Requirements:

· Proficiency in Python and PySpark with a strong understanding of object-oriented programming principles.

· Deep knowledge of the AWS ecosystem, including S3, IAM, and other related services.

· Experience with medallion architecture for data organization and transformation.

· Strong expertise in data variance analysis, creating data cuts, and deriving actionable insights.

· Solid understanding of PostgreSQL and database management.

· Familiarity with handling large-scale data processing in distributed environments.

· Analytical mindset with a focus on troubleshooting and problem-solving.

Preferred Qualifications:

· Experience with other AWS services (e.g., Lambda, Glue, Redshift) is a plus.

· Strong communication skills to collaborate effectively with cross-functional teams.

· Prior experience designing ETL pipelines in cloud-based environments.

Education:

Any Graduate