About the Role:
We are looking for a highly skilled and motivated Senior Data Engineer to join our dynamic team. In this role, you will design, develop, and maintain scalable, reliable data pipelines and solutions on the AWS cloud platform. Your expertise in big data technologies and cloud data services will be instrumental in enabling data-driven decision-making across the organization. The ideal candidate should demonstrate strong leadership, a deep passion for data, and a consistent track record of delivering robust data engineering solutions.
Key Responsibilities:
- Design, build, and maintain scalable ETL/ELT data pipelines using PySpark, SQL, and AWS Glue.
- Optimize data models, implement effective partitioning strategies, and tune performance on large datasets.
- Manage and deploy data solutions on AWS services including Glue, EMR, S3, Lambda, Redshift, and Athena.
- Define and manage infrastructure as code using Terraform, CloudFormation, or CDK.
- Apply distributed computing concepts using Apache Spark (PySpark) and Hadoop to process and analyze big data.
- Develop CI/CD pipelines to support automated and efficient data engineering workflows.
- Ensure data security, governance, and compliance by implementing access controls and encryption with IAM, KMS, and Lake Formation.
- Work with additional data platforms and orchestration tools such as Snowflake, Databricks, or Apache Airflow.
- Implement real-time data streaming solutions using Kinesis, Kafka, or Flink.
- Write clean, maintainable code in Python, SQL, and shell scripts.
- Collaborate closely with business stakeholders to gather requirements and deliver impactful data solutions.
- Provide technical leadership and mentorship to junior engineers, promoting best practices and continuous improvement.