Must-have skills:
5+ years of hands-on experience implementing ETL and Data Quality automation testing (see the illustrative sketch after this list)
Strong hands-on experience with SQL, Python, PySpark, and Airflow
Cloud implementation experience with GCP services (BigQuery, Dataproc, Google Cloud Storage, Composer, Looker) or equivalent services on AWS/Azure
Experience handling file formats: CSV, ORC, Parquet, JSON, XML
Data profiling skills using Python libraries such as pandas, NumPy, SciPy, Matplotlib, Plotly, and seaborn
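A minimal sketch of the kind of data-quality automation described above, assuming a hypothetical Parquet dataset at /data/orders with an order_id key column (the path, column name, and checks are illustrative only, not part of the role description):

# Data-quality check sketch; the dataset path and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
df = spark.read.parquet("/data/orders")  # hypothetical source path

# Row count, per-column null counts, and duplicate-key count.
row_count = df.count()
null_counts = df.select(
    [F.sum(F.col(c).isNull().cast("int")).alias(c) for c in df.columns]
).first().asDict()
duplicate_keys = df.groupBy("order_id").count().filter(F.col("count") > 1).count()

print(f"rows={row_count}, nulls={null_counts}, duplicate keys={duplicate_keys}")

# Failing fast lets an orchestrator (e.g. an Airflow task) mark the run as failed.
assert duplicate_keys == 0, "duplicate primary keys found"

Checks like these are typically wrapped in pytest cases or an Airflow task so they run automatically after each load.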
Advantageous to have experience with:
Implementing data quality validations on data lakes and data warehouses
Data profiling and data science libraries such as Great Expectations, ydata-profiling, lux, DataProfiler, and scikit-learn (see the profiling sketch after this list)
Other technologies such as Druid, Hive SQL, HDFS, Flink, and NoSQL stores such as MongoDB or Cassandra
Data consumption via APIs
Logging, monitoring, and notification tools such as New Relic, Grafana, and Prometheus
DevOps and CI/CD using Git and Jenkins
Telecom domain knowledge
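A minimal data-profiling sketch using ydata-profiling, assuming a local CSV file named sample.csv (the file name and contents are placeholders):

# Profiling sketch; the input file is a placeholder.
import pandas as pd
from ydata_profiling import ProfileReport

df = pd.read_csv("sample.csv")

# Build an exploratory report (distributions, missing values, correlations)
# and write it out as a standalone HTML file.
profile = ProfileReport(df, title="Sample data profile", minimal=True)
profile.to_file("sample_profile.html")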
Any Graduate