Job Description
This role requires extensive experience in GCP, DBT, Python, Java, SQL, and Terraform.
Key Responsibilities
- Design, develop, and maintain end-to-end data pipelines to process large-scale data efficiently.
- Build and optimize data transformation processes using DBT, SQL, and Python.
- Leverage Google Cloud Platform (GCP) services including BigQuery, Dataflow, Cloud Composer, and Cloud Functions to develop scalable data solutions (illustrated by the sketch after this list).
- Manage containerized deployments using Docker and handle secure credential management with HashiCorp Vault.
- Ensure data reliability, scalability, and performance optimization while handling structured and unstructured data.
- Collaborate with data scientists, analysts, and software engineers to support data-driven decision-making.
- Drive best practices for data governance, security, and compliance.
- Monitor and troubleshoot data pipelines, ensuring system reliability and efficiency.
- Implement CI/CD pipelines (Codefresh, GitHub Actions) and infrastructure automation with Terraform.
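
For illustration only (not a requirement of this posting): a minimal Python sketch of the kind of BigQuery transformation step these pipelines involve, using the google-cloud-bigquery client. The project, dataset, and table names are hypothetical placeholders.

```python
from google.cloud import bigquery

# Placeholder project, dataset, and table names; substitute your own.
client = bigquery.Client(project="example-project")

sql = """
    CREATE OR REPLACE TABLE `example-project.analytics.daily_user_events` AS
    SELECT user_id, DATE(event_ts) AS event_date, COUNT(*) AS event_count
    FROM `example-project.raw.events`
    GROUP BY user_id, event_date
"""

# Submit the transformation as a BigQuery query job and wait for completion.
query_job = client.query(sql)
query_job.result()
print(f"Job {query_job.job_id} finished with state {query_job.state}")
```

In practice, a step like this would typically be expressed as a DBT model and scheduled via Cloud Composer rather than run as a standalone script.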
Required Skills & Experience
- Extensive experience with GCP services: BigQuery, Dataflow, Cloud Composer, Cloud Functions
- Strong expertise in DBT for data transformation and modelling
- Proficiency in programming languages: Python, Java, SQL
- Infrastructure as Code (IaC) experience: Terraform
- Experience in CI/CD tools: Codefresh, GitHub Actions
- Knowledge of containerization and DevOps tools: Docker, HashiCorp Vault (see the sketch after this list)
- Hands-on experience handling large data volumes, optimizing transformations, and improving performance
- End-to-end experience in building scalable data applications and pipelines
- Proven ability to work independently and manage multiple priorities as an individual contributor
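
Likewise for illustration only: a minimal sketch of reading a credential from HashiCorp Vault with the hvac Python client. The environment variables, secret path, and key name are assumptions, not details of this role.

```python
import os
import hvac

# Illustrative only: VAULT_ADDR/VAULT_TOKEN env vars and the secret path
# "data-pipeline/db" are assumed names, not part of this posting.
client = hvac.Client(url=os.environ["VAULT_ADDR"], token=os.environ["VAULT_TOKEN"])

# Read a password from a KV v2 secrets engine mounted at the default "secret/" path.
secret = client.secrets.kv.v2.read_secret_version(path="data-pipeline/db")
db_password = secret["data"]["data"]["password"]
```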
Bonus Skills
- Experience with AI, Large Language Models (LLMs), and Machine Learning
- Familiarity with ML frameworks (TensorFlow, PyTorch, Scikit-learn)
- Knowledge of real-time data streaming technologies