Description

Key Responsibilities:

Design and architect end-to-end data solutions using GCP services such as BigQuery, Dataflow, Pub/Sub, Cloud Storage, and Dataproc.

Build and optimize scalable ETL/ELT pipelines to support data ingestion, transformation, and processing.

Collaborate with data scientists, analysts, and business stakeholders to understand data needs and deliver reliable data solutions.

Ensure data governance, quality, security, and compliance across the architecture.

Implement data modeling best practices for structured and unstructured data sources.

Automate infrastructure using Terraform or Deployment Manager for consistent and repeatable deployments.

Monitor and troubleshoot data pipeline issues and optimize performance and cost-efficiency.

Lead the evaluation and adoption of new tools, frameworks, and practices in the GCP ecosystem.

Mentor junior data engineers and guide architectural decisions.

Required Qualifications:

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.

6+ years of experience in data engineering with 2+ years in a cloud architecture role.

Deep expertise in GCP services including BigQuery, Dataflow, Pub/Sub, Cloud Functions, and GCS.

Strong proficiency in SQL, Python, and/or Java for data pipeline development.

Experience with Apache Beam, Apache Spark, or other distributed data processing frameworks.

Solid understanding of data warehousing, data lakes, and streaming/batch processing paradigms.

Hands-on experience with CI/CD tools and Infrastructure as Code (IaC) tools such as Terraform.

Strong knowledge of data security and compliance practices (e.g., GDPR, HIPAA).

Education

Bachelor's or Master's degree