Mandatory skills: GBQ, GCP, Python and ETL
Job Description:
Professional experience in data engineering using Google BigQuery (GBQ) and Google Cloud Platform (GCP), including working with data sets and building data pipelines.
- Hands-on, in-depth experience with Google data products (e.g. BigQuery, Dataflow, Dataproc, Dataprep, and Cloud Composer with Airflow DAGs).
- Expert-level Python programming, including PySpark and Pandas.
- Experience with Airflow (creating DAGs, configuring Airflow variables, scheduling).
- Experience with big data technologies and solutions (Spark, Hadoop, Hive, MapReduce) and with multiple scripting and configuration languages (YAML, Python).
- Experience using dbt to create data lineage in GCP (optional).
- Experience working in a DevSecOps (CI/CD) environment.
- Ability to design and develop ETL/ELT frameworks using BigQuery, with expertise in BigQuery concepts such as nested queries, clustering, and partitioning.
- Experience with data integration, data transformation, data quality, and data lineage tools.
- Ability to automate data loads from BigQuery using APIs or a scripting language.
- End-to-end (E2E) data engineering and lifecycle management, including non-functional requirements and operations.
- E2E solution design skills: prototyping, usability testing, and data visualization literacy.
- Experience with modern SQL and NoSQL data stores.
Education: Any graduate