We are seeking a Data Engineer to design and implement a cloud-native data processing and API integration system. The solution will ingest identity data from upstream sources, detect record-level changes, and synchronize user metadata to a downstream system via API. This is a hands-on role focused on scalable data handling, automation, and fault-tolerant service deployment on GCP.
Responsibilities:
- Solution Design & Development: Build modular Python applications that ingest identity data from files or APIs and sync it to target platforms.
- Data Staging & Processing: Stage identity metadata in BigQuery using defined schemas and implement change detection logic (create/update/delete).
- API Integration: Design and implement logic to call RESTful APIs to maintain target user repositories (e.g., user attributes, roles).
- Workflow Orchestration: Use GCP Pub/Sub, Composer, and/or Cloud Run to manage asynchronous workflows and ensure event-driven processing.
- Infrastructure as Code: Deploy and manage services using Terraform with a focus on security, idempotency, and configuration as code.
- Observability & Resilience: Implement logging, retry logic, and incident handling to ensure system reliability and traceability.
- Testing & Validation: Build automated test coverage for critical processing logic and API interactions.
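The change-detection responsibility above (create/update/delete) can be illustrated as a simple snapshot diff. This is a minimal sketch, not a prescribed implementation: it assumes identity records are keyed by a unique user ID and that two staged snapshots (previous and current) are available for comparison.

```python
from typing import Any


def detect_changes(
    previous: dict[str, dict[str, Any]],
    current: dict[str, dict[str, Any]],
) -> dict[str, list]:
    """Diff two identity snapshots keyed by user ID.

    Returns the creates, updates, and deletes needed to bring a
    downstream system in line with the current snapshot.
    """
    # Keys present only in the current snapshot are new records.
    creates = [current[k] for k in sorted(current.keys() - previous.keys())]
    # Keys present only in the previous snapshot have been removed.
    deletes = sorted(previous.keys() - current.keys())
    # Keys in both snapshots are updates only if any attribute changed.
    updates = [
        current[k]
        for k in sorted(current.keys() & previous.keys())
        if current[k] != previous[k]
    ]
    return {"create": creates, "update": updates, "delete": deletes}
```

In practice the same diff is often pushed down into BigQuery SQL (e.g., joining the staged snapshot against the prior one), but the record-level semantics are the same as this in-memory version.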
Required Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or equivalent work experience
- 6+ years in backend development or data engineering roles focused on identity, security, or metadata systems
- Strong Python engineering for data processing and backend development
- Advanced experience with GCP services: BigQuery, Cloud Run, Cloud Functions, Cloud Composer, Pub/Sub, Cloud Storage, Secret Manager, Cloud Scheduler
- Experience interacting with REST APIs, including OAuth2 or token-based authentication
- Terraform for cloud infrastructure automation
- Proficiency with SQL for data transformation and validation
- Strong understanding of CI/CD, containers (Docker), Git workflows
- Comfortable working with structured metadata, user roles, and directory-style data
- Able to work independently and meet delivery milestones
- Strong documentation and debugging skills
- Must adhere to enterprise security and change control practices
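The retry logic and token-based API interaction called out above can be sketched with a generic backoff wrapper. This is an illustrative sketch only: `with_retries` is a hypothetical helper name, and the retryable exception types, attempt count, and delays are placeholder values a real deployment would tune.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    retryable: tuple[type[Exception], ...] = (ConnectionError, TimeoutError),
    attempts: int = 4,
    base_delay: float = 0.5,
) -> T:
    """Run `call`, retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except retryable:
            if attempt == attempts:
                raise  # out of attempts: surface the failure for incident handling
            # 0.5s, 1s, 2s, ... between attempts
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")
```

A typical use would wrap the actual REST call, e.g. a `requests.patch(...)` carrying an `Authorization: Bearer <token>` header obtained via an OAuth2 flow, so that transient network failures are retried while the final failure is still logged and raised.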
Preferred Qualifications:
- Experience integrating with IAM or identity systems (e.g., LDAP, Okta, custom directories)
- Background working in regulated or high-security environments
- Experience handling large-scale user datasets (millions of records)
- Familiarity with hybrid data processing (batch + streaming)
- GCP Certifications
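For the large-scale user datasets mentioned above (millions of records), one common pattern is to stream records to the downstream API in fixed-size batches rather than materializing everything in memory. A minimal sketch, with an arbitrary placeholder batch size:

```python
from collections.abc import Iterable, Iterator
from itertools import islice


def batched(records: Iterable[dict], size: int = 500) -> Iterator[list[dict]]:
    """Yield fixed-size batches so a large dataset never sits fully in memory."""
    it = iter(records)
    # islice consumes the iterator lazily; the loop ends on an empty batch.
    while batch := list(islice(it, size)):
        yield batch
```

The same shape works whether records come from a BigQuery result iterator or a Cloud Storage file stream, which is also how batch and streaming paths can share one processing core.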