Description

Job Description

  • Responsible for requirement gathering, analysis, and design of the data-pipeline architecture from source to target.
  • Develop Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation across multiple file formats (see the first sketch after this list).
  • Validate source and target data and write Spark jobs using transformations and actions.
  • Design NiFi workflows to move data from source systems into Hadoop and Amazon S3.
  • Develop Kafka producers and consumers for publishing to and subscribing from Kafka topics (see the second sketch after this list).
  • Develop Spark SQL queries to load JSON data, define schemas, write the results into Hive tables, and handle structured data with Spark SQL.
  • Analyse datasets and perform logical analysis to deep-dive into the data, debug data-quality issues, cleanse and transform data, and create reports to share findings across teams.
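
As a rough illustration of the PySpark and Spark SQL work described above, the sketch below loads JSON data, runs an aggregation with Spark SQL, and writes the result to a Hive table. The application name, S3 path, column names, and table name are hypothetical placeholders, not details from this posting.

  from pyspark.sql import SparkSession

  spark = (
      SparkSession.builder
      .appName("json-to-hive")   # hypothetical application name
      .enableHiveSupport()       # needed to write managed Hive tables
      .getOrCreate()
  )

  # Load JSON data; Spark infers the schema from the files.
  raw = spark.read.json("s3://example-bucket/events/")  # hypothetical source path

  # Register a temp view and aggregate with Spark SQL.
  raw.createOrReplaceTempView("raw_events")
  daily_counts = spark.sql("""
      SELECT event_type,
             to_date(event_ts) AS event_date,
             COUNT(*)          AS event_count
      FROM raw_events
      GROUP BY event_type, to_date(event_ts)
  """)

  # Persist the aggregated result as a Hive table.
  daily_counts.write.mode("overwrite").saveAsTable("analytics.daily_event_counts")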
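
Similarly, a minimal sketch of a Kafka producer and consumer, here assuming the kafka-python library; the broker address, topic name, and message fields are hypothetical placeholders.

  import json
  from kafka import KafkaProducer, KafkaConsumer

  # Producer: publish JSON-encoded messages to a topic.
  producer = KafkaProducer(
      bootstrap_servers="localhost:9092",                       # hypothetical broker
      value_serializer=lambda v: json.dumps(v).encode("utf-8"),
  )
  producer.send("events", {"user_id": 42, "action": "login"})   # hypothetical topic and payload
  producer.flush()

  # Consumer: subscribe to the same topic and decode each message.
  consumer = KafkaConsumer(
      "events",
      bootstrap_servers="localhost:9092",
      auto_offset_reset="earliest",
      value_deserializer=lambda v: json.loads(v.decode("utf-8")),
  )
  for message in consumer:
      print(message.value)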

Skills

  • Hadoop (PySpark, MapReduce, Hive), shell scripting, Sqoop, Oozie, ZooKeeper, Snowflake, Databricks

 

Education: Looking for someone with a Bachelor's degree or equivalent in Computer Science.

Education

Bachelor's

Salary

INR 00 - 00