Description

Responsibilities :

  • Design, implement, and maintain robust, scalable, and secure data pipelines on GCP.
  • Build data architectures that efficiently process structured and unstructured data from multiple sources including APIs, event streams, and databases.
  • Leverage GCP services such as BigQuery, Dataflow, Pub/Sub, and Cloud Storage to architect and deploy data solutions.
  • Develop ETL/ELT processes to transform raw data into clean, structured, and consumable datasets for data science and analytics teams.
  • Optimize existing workflows for cost, performance, and scalability across large datasets using distributed computing tools like Apache Spark and Hadoop.
  • Collaborate closely with data analysts, data scientists, and business stakeholders to understand data requirements and deliver high-quality solutions.
  • Implement and maintain data quality checks, lineage tracking, and metadata management.
  • Ensure compliance with data security and privacy standards throughout all data engineering operations.
  • Contribute to the continuous improvement of data engineering practices using Agile methodologies.

Skills & Qualifications :

  • Minimum 7 years of hands-on experience in Data Engineering, with proven success in cloud-based environments.
  • Bachelor's degree in Computer Science, Information Technology, Engineering, or a related technical discipline.
  • Strong expertise in Google Cloud Platform (GCP), particularly in working with BigQuery, Dataflow, Cloud Composer, and Cloud Storage.
  • Proficient in SQL for data manipulation and transformation.
  • Solid experience with Python or Scala for data processing and scripting.
  • Experience with big data technologies like Apache Spark, Hadoop, and related tools.
  • Deep understanding of data modeling, data warehousing concepts, and best practices.
  • Familiarity with CI/CD pipelines, version control (Git), and orchestration tools like Airflow or Cloud Composer.
  • Strong analytical and problem-solving skills, with the ability to work with complex datasets and extract meaningful insights.
  • Excellent communication skills to collaborate effectively with cross-functional teams.
  • Working knowledge of data security, compliance, and privacy best practices.

Nice to Have :

  • Experience with real-time data streaming using Apache Kafka or GCP Pub/Sub.
  • Exposure to DevOps practices and Infrastructure as Code (IaC) using tools like Terraform or Deployment Manager.
  • Familiarity with Agile/Scrum methodologies and collaborative project tracking tools (e.g., Jira, Confluence).

Education

Bachelor's degree