Key Responsibilities:
• Design and implement robust, scalable, and efficient data pipelines using GCP services such as BigQuery, Dataflow, Pub/Sub, and Cloud Storage (an illustrative pipeline sketch follows this list).
• Build and maintain data models and ETL processes to support data analytics and reporting needs.
• Optimize data storage and retrieval for performance and cost-effectiveness.
• Collaborate with data scientists, analysts, and other engineers to understand data requirements and deliver solutions.
• Monitor and troubleshoot production pipelines to ensure data quality and system reliability.
• Implement security best practices to ensure data integrity and compliance with regulations.
• Create and maintain comprehensive documentation for data workflows, architecture, and systems.
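For illustration only, the sketch below shows the kind of streaming pipeline this role owns: a minimal Apache Beam (Dataflow) job that reads JSON events from a Pub/Sub subscription and appends them to a BigQuery table. All project, subscription, and table names are placeholders and do not refer to actual systems.

# Illustrative sketch only: a minimal streaming Beam/Dataflow pipeline.
# The project, subscription, and table names below are hypothetical.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions


def parse_event(message: bytes) -> dict:
    # Decode a Pub/Sub message payload into a BigQuery-ready row.
    return json.loads(message.decode("utf-8"))


def run() -> None:
    options = PipelineOptions(project="example-project", region="us-central1")
    options.view_as(StandardOptions).streaming = True

    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadEvents" >> beam.io.ReadFromPubSub(
                subscription="projects/example-project/subscriptions/events-sub")
            | "ParseJson" >> beam.Map(parse_event)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                table="example-project:analytics.events",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                # Assumes the destination table already exists.
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            )
        )


if __name__ == "__main__":
    run()

In practice, a job like this would be deployed to the Dataflow runner and orchestrated and monitored alongside batch workflows in Cloud Composer, as described in the responsibilities above.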
Qualifications:
Must-Have Skills:
• 5+ years of experience in data engineering or related fields.
• Proficiency with GCP services, including but not limited to BigQuery, Dataflow, Cloud Storage, Cloud Composer (Airflow), and Pub/Sub.
• Strong programming skills in Python and Java.
• Hands-on experience with SQL for querying and transforming data (see the example query after this list).
• Knowledge of data modeling, data warehousing, and building scalable ETL/ELT pipelines.
• Familiarity with CI/CD pipelines for deploying and managing data workflows.
• Solid understanding of distributed computing and cloud-based architecture.
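As a rough example of the SQL and ELT work referenced above, the snippet below uses the google-cloud-bigquery Python client to run a simple rollup transform; the project, dataset, and table names are hypothetical.

# Illustrative sketch only: an ELT-style transform run in BigQuery via the
# google-cloud-bigquery client. Dataset and table names are placeholders.
from google.cloud import bigquery


def daily_order_rollup() -> None:
    client = bigquery.Client(project="example-project")

    sql = """
        CREATE OR REPLACE TABLE `example-project.analytics.daily_orders` AS
        SELECT
            DATE(order_ts) AS order_date,
            COUNT(*)       AS order_count,
            SUM(amount)    AS total_amount
        FROM `example-project.raw.orders`
        GROUP BY order_date
    """

    # client.query() submits the job; result() blocks until it completes.
    client.query(sql).result()


if __name__ == "__main__":
    daily_order_rollup()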
Preferred Skills:
• Experience with other cloud platforms such as AWS or Azure.
• Knowledge of Spark, Hadoop, or other big data technologies.
• Familiarity with Kubernetes or containerized applications.
• Background in machine learning or data science.
Soft Skills:
• Excellent problem-solving skills and a proactive approach to challenges.
• Strong communication skills to explain technical concepts to non-technical stakeholders.
• Ability to work in a fast-paced environment with changing priorities.
• Team-oriented mindset with a passion for knowledge sharing.
Education:
• Bachelor's degree in Computer Science.