Job Description

Key Responsibilities:

  • Design and implement robust, scalable, and efficient data pipelines using GCP services such as BigQuery, Dataflow, Pub/Sub, and Cloud Storage (an illustrative sketch follows this list).
  • Build and maintain data models and ETL processes to support data analytics and reporting needs.
  • Optimize data storage and retrieval for performance and cost-effectiveness.
  • Collaborate with data scientists, analysts, and other engineers to understand data requirements and deliver solutions.
  • Monitor and troubleshoot production pipelines to ensure data quality and system reliability.
  • Implement security best practices to ensure data integrity and compliance with regulations.
  • Create and maintain comprehensive documentation for data workflows, architecture, and systems.
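
As a hedged illustration of the kind of pipeline work described above (not this team's actual code), the sketch below shows a minimal Apache Beam job that could run on Dataflow, reading messages from Pub/Sub and appending them to BigQuery. The project, topic, table, and schema names are hypothetical placeholders.

    # Minimal, illustrative sketch; all resource names below are placeholders.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def run():
        options = PipelineOptions(streaming=True)
        with beam.Pipeline(options=options) as pipeline:
            (
                pipeline
                # Read raw messages from a Pub/Sub topic (placeholder name).
                | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
                    topic="projects/PROJECT/topics/raw-events")
                # Decode each message into a row matching the target schema.
                | "Decode" >> beam.Map(lambda msg: {"payload": msg.decode("utf-8")})
                # Append rows to a BigQuery table (placeholder table and schema).
                | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                    "PROJECT:analytics.events",
                    schema="payload:STRING",
                    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
            )

    if __name__ == "__main__":
        run()

The same code can run locally for testing or on Dataflow by supplying the usual runner, project, and region pipeline options.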

Qualifications:

Must-Have Skills:

  • 5+ years of experience in data engineering or related fields.
  • Proficiency with GCP services, including but not limited to BigQuery, Dataflow, Cloud Storage, Cloud Composer (Airflow), and Pub/Sub (see the orchestration sketch after this list).
  • Strong programming skills in Python and Java.
  • Hands-on experience with SQL for querying and transforming data.
  • Knowledge of data modeling, data warehousing, and building scalable ETL/ELT pipelines.
  • Familiarity with CI/CD pipelines for deploying and managing data workflows.
  • Solid understanding of distributed computing and cloud-based architecture.
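
As a further hedged illustration (again, not code referenced by this posting), the sketch below shows how a single ELT step might be orchestrated with Cloud Composer (Airflow): a one-task DAG that runs a BigQuery query to build a reporting table. The DAG id, dataset names, and SQL are invented placeholders.

    # Illustrative sketch only; dag_id, tables, and SQL are hypothetical.
    from datetime import datetime

    from airflow import DAG
    from airflow.providers.google.cloud.operators.bigquery import (
        BigQueryInsertJobOperator,
    )

    with DAG(
        dag_id="daily_orders_elt",   # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",           # Airflow 2.4+ scheduling argument
        catchup=False,
    ) as dag:
        # One ELT step: aggregate raw orders into a reporting table in BigQuery.
        build_daily_report = BigQueryInsertJobOperator(
            task_id="build_daily_report",
            configuration={
                "query": {
                    "query": (
                        "CREATE OR REPLACE TABLE `PROJECT.reporting.daily_revenue` AS "
                        "SELECT order_date, SUM(amount) AS revenue "
                        "FROM `PROJECT.raw.orders` "
                        "GROUP BY order_date"
                    ),
                    "useLegacySql": False,
                }
            },
        )

In Cloud Composer, deploying a DAG like this amounts to uploading the file to the environment's DAGs bucket, an upload typically handled by a CI/CD pipeline.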

Preferred Skills:

  • Experience with other cloud platforms such as AWS or Azure.
  • Knowledge of Spark, Hadoop, or other big data technologies.
  • Familiarity with Kubernetes or containerized applications.
  • Background in machine learning or data science.

Soft Skills:

  • Excellent problem-solving skills and a proactive approach to challenges.
  • Strong communication skills to explain technical concepts to non-technical stakeholders.
  • Ability to work in a fast-paced environment with changing priorities.
  • Team-oriented mindset with a passion for knowledge sharing.

Education

Any graduate (a bachelor's degree in any discipline)