Key Skills: NumPy, SQL, Pandas, Python, Data Engineering, AWS, Airflow
Roles and Responsibilities:
- Design and implement end-to-end data pipelines using Python and Airflow, ensuring efficient scheduling, orchestration, and dependency management for complex workflows.
- Develop high-performance SQL queries for data extraction, transformation, and reporting, focusing on query optimization and scalability across large datasets.
- Automate and modularize ETL processes leveraging Python scripting and reusable Airflow DAG components, adhering to best practices in software engineering and data governance.
- Work with AWS cloud services (S3, Glue, Redshift, Aurora RDS, CloudWatch) to enable scalable, secure, and cloud-native data processing and storage solutions.
- Integrate and transform data from multiple sources and tools such as Alteryx, PostgreSQL, and Oracle (including PL/SQL code) into unified data models to support analytics and reporting needs.
- Oversee and mentor junior data engineers.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, and redesigning infrastructure for greater scalability.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
- Collaborate with stakeholders including Product, Business Analysts, Data Science, and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Ensure data security and regulatory compliance for data that crosses national boundaries, spanning multiple data centers and AWS regions.
Skills Required:
- NumPy
- Pandas
- Python
- SQL
- Data Engineering
- Good to have - AWS (S3, Glue, Redshift, Aurora RDS, CloudWatch)
- Good to have - Airflow (DAG creation, orchestration, workflow management)
- Databricks (AWS environment)
- ETL Development & Automation
- Query Optimization & Performance Tuning
- Data Pipeline Design & Monitoring
- Data Integration (Alteryx, PostgreSQL, Oracle, PL/SQL)
- Data Security & Compliance Management
- Cloud-Native Data Processing Solutions
- Process Improvement & Automation
- Analytics Tool Development
- Stakeholder Collaboration (Product, BA, Data Science, Design teams)
- Mentoring & Leadership of Junior Engineers
Education: Bachelor's degree in any STEM field