Key Responsibilities
- Design, build, and optimize end-to-end data pipelines using GCP-native services such as Dataflow, Dataproc, and Pub/Sub.
- Implement data ingestion, transformation, and processing workflows using Apache Beam, Apache Spark, and scripting in Python.
- Manage and optimize data storage using BigQuery, Cloud Storage, and Cloud SQL to ensure performance, scalability, and cost-efficiency.
- Enforce enterprise-grade data security and access controls using GCP IAM and Cloud Security Command Center.
Required Skills and Qualifications
- 12+ years of overall IT experience, with deep specialization in data engineering.
- 8+ years of hands-on experience designing, building, and maintaining data pipelines in enterprise environments.
- 5+ years of recent experience with Google Cloud Platform (GCP), specifically within a major U.S. bank or brokerage firm (required; no exceptions).
- Strong expertise in:
  - GCP services: Dataflow, Dataproc, Pub/Sub, BigQuery, Cloud Storage, Cloud SQL.
  - Data processing frameworks: Apache Beam and Apache Spark.
  - Scripting and automation: Advanced proficiency in Python and SQL for data manipulation, transformation, and querying.
- Proven experience implementing GCP IAM policies and managing data access/security at scale.
- Demonstrated ability to deliver low-latency, high-throughput data systems through performance tuning and best practices.