Description

Must have skills:

5+ years of hands-on experience implementing ETL and data quality automation testing

Strong hands-on experience with SQL, Python, PySpark, and Airflow

Cloud implementation experience with GCP services (BigQuery, Dataproc, Google Cloud Storage, Cloud Composer, Looker) or equivalent services on AWS or Azure

Handling file formats: CSV, ORC, Parquet, JSON, XML

Data profiling skills using Python libraries such as pandas, NumPy, SciPy, Matplotlib, Plotly, and seaborn

Good to have experience with:

Implementing data quality validations on data lakes and data warehouses

Data profiling and data science libraries such as Great Expectations, ydata-profiling, Lux, DataProfiler, and scikit-learn

Other technologies such as Druid, Hive SQL, HDFS, Flink, and NoSQL databases such as MongoDB or Cassandra

Data consumption via APIs

Logging, monitoring, and notification tools such as New Relic, Grafana, and Prometheus

DevOps and CI/CD using Git and Jenkins

Telecom domain knowledge

Education

Any Graduate