Design and implement scalable data architectures on GCP, including data lakes, warehouses, and real-time processing systems.
Develop and optimize ETL/ELT pipelines using Python, SQL, and streaming technologies like Kafka and Apache Beam.
Model and manage knowledge graphs using Neo4j and GraphQL to capture complex relationships and power intelligent applications.
Collaborate cross-functionally with data scientists, ML engineers, and business stakeholders in Agile teams to deliver data-driven solutions.
Automate infrastructure and deployments using Terraform, Jenkins, and CI/CD best practices to ensure reliability and scalability.
Ensure data quality, governance, and observability through robust validation, monitoring, and documentation practices.
Your Skills and Experience
Bachelor’s degree in Computer Science, Engineering, or a related technical field.
8-15 years of experience in data engineering, with strong expertise in cloud-native architectures (preferably GCP) and distributed systems.
Proficiency in Python and SQL, with a proven track record of building and optimizing large-scale ETL/ELT pipelines and real-time data streaming solutions (e.g., Kafka, Apache Beam).
Hands-on experience with knowledge graphs, especially Neo4j, and familiarity with GraphQL for modeling complex data relationships.
Strong understanding of MLOps practices, including integrating data pipelines with ML workflows and deploying infrastructure using tools like Terraform and Jenkins.
Excellent communication and presentation skills, with the ability to collaborate across technical and non-technical teams in Agile environments.
Retail or e-commerce experience is a strong plus, particularly in areas like customer segmentation, personalization, or supply chain analytics.