Description

Provide Technical Leadership: Provide technical leadership to keep ongoing projects aligned and facilitate collaboration across teams to solve complex data engineering challenges.

Build and Maintain Data Pipelines: Design, build, and maintain scalable, efficient, and reliable data pipelines that support data ingestion, transformation, and integration across diverse sources and destinations, using tools such as Kafka and Databricks.
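
For a concrete flavor of this work, here is a minimal, illustrative ingestion loop; the kafka-python client, broker address, and "events" topic are assumptions made for the sketch, not specifics of the role:

    # Hypothetical example: consume JSON events from Kafka, apply a light
    # transformation, and hand each record to a downstream sink.
    import json
    from kafka import KafkaConsumer  # assumes the kafka-python package

    consumer = KafkaConsumer(
        "events",                            # hypothetical source topic
        bootstrap_servers="localhost:9092",  # assumed local broker
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )

    for message in consumer:
        record = message.value
        # Transform: keep only the fields downstream consumers need.
        row = {"id": record.get("id"), "ts": record.get("timestamp")}
        # Load: a real pipeline would write to a warehouse, lake, or topic.
        print(row)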

Drive Digital Innovation: Leverage innovative technologies and approaches to modernize and extend core data assets, including SQL-based, NoSQL-based, cloud-based, and real-time streaming data platforms.

Implement Feature Engineering: Develop and manage feature engineering pipelines for machine learning workflows, utilizing tools like Vertex AI, BigQuery ML, and custom Python libraries.
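
As an illustration of this kind of feature work (shown in plain Python rather than Vertex AI or BigQuery ML; the column names are hypothetical):

    # Hypothetical example: derive model-ready features from raw columns.
    import numpy as np
    import pandas as pd

    raw = pd.DataFrame({
        "amount": [10.0, 250.0, 40.0],
        "signup_date": pd.to_datetime(["2024-01-05", "2024-03-20", "2024-02-11"]),
    })

    features = pd.DataFrame({
        # Log-scale a skewed monetary value.
        "log_amount": np.log1p(raw["amount"]),
        # Turn a date into a numeric tenure feature.
        "tenure_days": (pd.Timestamp("2024-06-01") - raw["signup_date"]).dt.days,
    })
    print(features)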

Implement Automated Testing: Design and implement automated unit, integration, and performance testing frameworks to ensure data quality, reliability, and compliance with organizational standards.
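
For example, a unit test at roughly this level of granularity (pytest style; the dedupe function is a hypothetical pipeline step):

    # Hypothetical example: a pure transformation and its unit test,
    # runnable with pytest.
    def dedupe(rows):
        """Remove duplicate records while preserving first-seen order."""
        seen, out = set(), []
        for r in rows:
            if r not in seen:
                seen.add(r)
                out.append(r)
        return out

    def test_dedupe_preserves_order_and_removes_duplicates():
        assert dedupe([3, 1, 3, 2, 1]) == [3, 1, 2]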

Optimize Data Workflows: Optimize data workflows for performance, cost efficiency, and scalability across large datasets and complex environments.

Mentor Team Members: Mentor team members in data principles, patterns, and processes to promote best practices and strengthen team capabilities.

Draft and Review Documentation: Draft and review architectural diagrams, interface specifications, and other design documents to ensure clear communication of data solutions and technical requirements.

Cost/Benefit Analysis: Present opportunities with cost/benefit analysis to leadership, guiding sound architectural decisions for scalable and efficient data solutions.

Support ML Platform Workflows: Support data flows for an ML platform in GCP, working closely with data science and understanding ML concepts well enough to translate them into requirements the data must meet.

Requirements

4+ years of professional data development experience.

4+ years of experience with SQL and NoSQL technologies.

3+ years of experience building and maintaining data pipelines and workflows.

Data engineering experience with GCP services such as Vertex AI, Cloud Storage, AutoMLOps, and Dataflow.

Experience developing with Python.

Experience with PySpark and Spark development.

Experience with CI/CD pipelines and processes.

Experience with automated unit, integration, and performance testing.

Experience with version control software such as Git.

Strong understanding of Agile principles (Scrum).

Education

Any Graduate