7+ years of Big Data platform (data lake) and data warehouse engineering experience demonstrated through prior work experience. Preferably with the Hadoop stack: HDFS, Hive, SQL, Spark, Spark Streaming, Spark SQL, HBase, Kafka, Sqoop, Atlas, Flink, Cloudera Manager, Airflow, Impala, Tez, Hue, and a variety of source data connectors. A short illustrative sketch follows.
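For illustration only, a minimal PySpark sketch of the kind of data lake work this requirement describes; the HDFS path and table names are hypothetical, not part of the role:

    # Illustrative sketch only; paths and table names are hypothetical.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("daily-event-rollup")
             .enableHiveSupport()
             .getOrCreate())

    # Read raw events from HDFS, aggregate with Spark SQL, publish to Hive.
    events = spark.read.parquet("hdfs:///data/raw/events")  # hypothetical path
    events.createOrReplaceTempView("events")
    daily = spark.sql("""
        SELECT event_date, COUNT(*) AS event_count
        FROM events
        GROUP BY event_date
    """)
    daily.write.mode("overwrite").saveAsTable("analytics.daily_event_counts")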
3+ years of hands-on experience building modern, resilient, and secure data pipelines, including the movement, collection, integration, and transformation of structured/unstructured data, with built-in automated data controls, logging/monitoring/alerting, and pipeline orchestration managed to operational SLAs. Preferably using Airflow, DAGs, and connector plugins; a minimal orchestration sketch follows.
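As a small, non-authoritative example of the orchestration pattern referenced above, assuming Airflow 2.4+ and hypothetical task callables:

    # Illustrative sketch, not a production pipeline. Assumes Airflow 2.4+.
    from datetime import datetime, timedelta
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        # Hypothetical placeholder: pull data from a source connector.
        pass

    def load():
        # Hypothetical placeholder: write transformed data to the warehouse.
        pass

    with DAG(
        dag_id="example_pipeline",  # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
        catchup=False,
    ) as dag:
        # Retries plus a fixed schedule support the SLA-managed operation
        # the requirement describes.
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        load_task = PythonOperator(task_id="load", python_callable=load)
        extract_task >> load_task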
1+ years of experience with Google Cloud data services such as Cloud Storage, Dataproc, Dataflow, and BigQuery (see the sketch below).
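For illustration, a minimal sketch using the google-cloud-bigquery client library; the project, dataset, and query are hypothetical:

    # Illustrative sketch; the project, dataset, and query are hypothetical.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses application default credentials
    query = """
        SELECT event_date, COUNT(*) AS event_count
        FROM `my-project.analytics.events`
        GROUP BY event_date
    """
    for row in client.query(query).result():
        print(row.event_date, row.event_count)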
5+ years of strong Python skills and experience with functional programming. A brief illustrative example follows.
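As a small illustration of the functional style in Python referenced above, using hypothetical data:

    # Illustrative only: functional-style Python (pure transforms, map/filter/reduce).
    from functools import reduce

    records = [{"amount": 120}, {"amount": -5}, {"amount": 300}]  # hypothetical data

    valid = filter(lambda r: r["amount"] > 0, records)   # drop invalid records
    amounts = map(lambda r: r["amount"], valid)          # project the field
    total = reduce(lambda a, b: a + b, amounts, 0)       # fold to a sum
    print(total)  # 420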