Design, implement, and maintain robust, scalable, and secure data pipelines on GCP.
Build data architectures that efficiently process structured and unstructured data from multiple sources, including APIs, event streams, and databases.
Leverage GCP services such as BigQuery, Dataflow, Pub/Sub, and Cloud Storage to architect and deploy data solutions (a minimal pipeline sketch follows this list).
Develop ETL/ELT processes to transform raw data into clean, structured, and consumable datasets for data science and analytics teams.
Optimize existing workflows for cost, performance, and scalability across large datasets using distributed computing tools like Apache Spark and Hadoop.
Collaborate closely with data analysts, data scientists, and business stakeholders to understand data requirements and deliver high-quality solutions.
Implement and maintain data quality checks, lineage tracking, and metadata management.
Ensure compliance with data security and privacy standards throughout all data engineering operations.
Contribute to the continuous improvement of data engineering practices using Agile methodologies.
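For concreteness, here is a minimal sketch of the kind of streaming pipeline described above, written in Python with Apache Beam (the Dataflow SDK): it reads JSON events from Pub/Sub and appends them to a BigQuery table. The project, topic, table, and field names are hypothetical placeholders for illustration, not details of this role.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Hypothetical identifiers; substitute real project, topic, and table names.
TOPIC = "projects/example-project/topics/raw-events"
TABLE = "example-project:analytics.clean_events"


def parse_event(message: bytes) -> dict:
    """Decode a Pub/Sub message into the row shape the BigQuery table expects."""
    event = json.loads(message.decode("utf-8"))
    return {
        "user_id": event["user_id"],
        "event_type": event["event_type"],
        "occurred_at": event["timestamp"],
    }


def run() -> None:
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(topic=TOPIC)
            | "ParseJSON" >> beam.Map(parse_event)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                TABLE,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            )
        )


if __name__ == "__main__":
    run()
```

A production pipeline would also carry the schema validation, dead-letter routing, and data-quality checks the responsibilities above call for; this sketch shows only the core read-transform-write shape.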
Skills & Qualifications:

Minimum 7 years of hands-on experience in Data Engineering, with proven success in cloud-based environments.
Bachelor's degree in Computer Science, Information Technology, Engineering, or a related technical discipline.
Strong expertise in Google Cloud Platform (GCP), particularly in working with BigQuery, Dataflow, Cloud Composer, and Cloud Storage.
Proficiency in SQL for data manipulation and transformation.
Solid experience with Python or Scala for data processing and scripting.
Experience with Apache Spark, Hadoop, and related big data tools.
Deep understanding of data modeling, data warehousing concepts, and best practices.
Familiarity with CI/CD pipelines, version control (Git), and orchestration tools like Airflow or Cloud Composer (a sample DAG sketch appears after this list).
Strong analytical and problem-solving skills, with the ability to work with complex datasets and extract meaningful insights.
Excellent communication skills to collaborate effectively with cross-functional teams.
Working knowledge of data security, compliance, and privacy best practices.
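To illustrate the orchestration skills listed above, here is a minimal Airflow DAG sketch of the kind that would run on Cloud Composer, scheduling a single daily BigQuery transformation. The DAG ID, dataset, table, and column names are hypothetical, and the SQL assumes a raw.orders source table that exists only for this example.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryInsertJobOperator,
)

# Hypothetical SQL: raw.orders and analytics.daily_orders_* are placeholders.
# {{ ds }} and {{ ds_nodash }} are standard Airflow template variables for the
# logical run date, so each run produces an idempotent daily snapshot table.
TRANSFORM_SQL = """
CREATE OR REPLACE TABLE analytics.daily_orders_{{ ds_nodash }} AS
SELECT order_id, customer_id, total_amount
FROM raw.orders
WHERE DATE(created_at) = '{{ ds }}'
"""

with DAG(
    dag_id="daily_orders_elt",          # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                  # Airflow 2.4+ parameter name
    catchup=False,
) as dag:
    transform = BigQueryInsertJobOperator(
        task_id="transform_orders",
        configuration={"query": {"query": TRANSFORM_SQL, "useLegacySql": False}},
    )
```

Keying both the table name and the filter to the logical run date keeps reruns idempotent, which matters for backfills and for the data-quality work described above.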
Nice to Have:

Experience with real-time data streaming using Apache Kafka or GCP Pub/Sub (a short publisher sketch follows at the end of this section).
Exposure to DevOps practices and Infrastructure as Code (IaC) using tools like Terraform or Deployment Manager.
Familiarity with Agile/Scrum methodologies and collaborative project tracking tools (e.g., Jira, Confluence).
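As a small illustration of the streaming item above, here is a Python sketch that publishes a single JSON event to a Pub/Sub topic with the google-cloud-pubsub client; the project and topic IDs are hypothetical placeholders.

```python
from google.cloud import pubsub_v1

# Hypothetical project and topic IDs; substitute your own.
PROJECT_ID = "example-project"
TOPIC_ID = "raw-events"

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(PROJECT_ID, TOPIC_ID)

# publish() returns a future that resolves to the server-assigned message ID;
# blocking on result() confirms Pub/Sub accepted the event.
future = publisher.publish(topic_path, b'{"user_id": 42, "event_type": "click"}')
print(future.result())
```

On the consuming side, this topic would typically feed a subscription pulled by a subscriber client or by a streaming Dataflow pipeline like the one sketched earlier.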