Role & Responsibilities:
In-depth knowledge of and experience with GCP data services: BigQuery, Dataproc, Cloud Composer, Pub/Sub, Dataflow, GCS, Bigtable.
Must have proficient experience with GCP databases: Bigtable, Spanner, Cloud SQL, AlloyDB.
Solid understanding of relational database concepts and technologies such as SQL, MySQL, PostgreSQL, or Oracle.
Hands-on experience with other cloud platforms and services such as AWS RDS or Azure SQL Database.
Experience with NoSQL databases such as MongoDB, Scylla, Cassandra, or DynamoDB is a plus.
Familiarity with database performance tuning, optimization, and troubleshooting techniques is a plus.
Strong working experience with one or more Big Data/Hadoop distributions or ecosystems, such as Cloudera/Hortonworks, MapR, Azure HDInsight, IBM Open Platform, Kafka, Hive, and Spark.
Good understanding of the following AWS data services: Redshift, RDS, Athena, or SQS/Kinesis.
Good understanding of native and external tables with different file formats (Avro, ORC, Parquet); a minimal illustrative sketch follows this list.
Experience building CI/CD pipelines for data workloads using Cloud Build, Artifact Registry, and Terraform.
Experience designing data governance solutions using GCP governance tooling (Dataplex, Data Catalog).
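To illustrate the native/external-table point above, here is a minimal sketch, assuming the google-cloud-bigquery Python client, of defining a BigQuery external table over Parquet files in GCS; the project, dataset, table, and bucket names are hypothetical placeholders, not names from this role.

# Minimal sketch: BigQuery external table over Parquet files in GCS.
# "my-project", "analytics.events_ext", and the bucket path are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# External data configuration pointing at Parquet files under a GCS prefix.
external_config = bigquery.ExternalConfig("PARQUET")
external_config.source_uris = ["gs://my-bucket/events/*.parquet"]

table = bigquery.Table("my-project.analytics.events_ext")
table.external_data_configuration = external_config

# Creates the external table; the data stays in GCS and is read at query time,
# unlike a native table, which stores the data inside BigQuery itself.
client.create_table(table, exists_ok=True)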
Skills required:
GCP Dataflow, Pub/Sub, Cloud Composer, Cloud Workflows, BigQuery, Cloud Run, Cloud Build
Must have: programming knowledge and willingness to be hands-on in Python and Java.
Specialization in streaming technologies such as Pub/Sub, Kafka, or equivalent (a minimal sketch follows this list).
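For the streaming point above, the snippet below is a minimal sketch, assuming the google-cloud-pubsub Python client, of consuming messages from a Pub/Sub subscription via streaming pull; the project and subscription names are hypothetical placeholders.

# Minimal Pub/Sub streaming-pull sketch; "my-project" and "events-sub" are hypothetical.
from concurrent.futures import TimeoutError

from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "events-sub")

def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    # Process the payload, then ack so Pub/Sub does not redeliver it.
    print(f"Received: {message.data!r}")
    message.ack()

streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)

with subscriber:
    try:
        # Block for a short demo window; a real pipeline would run indefinitely.
        streaming_pull_future.result(timeout=30)
    except TimeoutError:
        streaming_pull_future.cancel()
        streaming_pull_future.result()  # wait for shutdown to complete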
Good to have:
Experience with BigQuery, Presto, or equivalent.
Experience with the open-source ecosystem and distributions such as Hadoop, Spark, and Cloudera/Hortonworks, and with frameworks such as Oozie, Kafka, and HBase.
Understanding of and experience with NoSQL databases such as HBase, MongoDB, and Cassandra.
Knowledge of cloud databases such as Spanner, Bigtable, and Cloud SQL, and of database migrations.
GCP Data Engineer certification is good to have.
Any Graduate