Description

Key Responsibilities

  • Familiarity with SQL and relational databases (PostgreSQL, MySQL, etc.).
  • Understanding of ETL processes and data pipelines.
  • Ability to write optimized Python code with efficient memory and CPU usage.
  • Good problem-solving skills and attention to detail.
  • Knowledge of distributed computing principles and performance tuning.

Required Skills

  • 6+ years of experience in Python development.
  • Strong knowledge of Pandas, NumPy, and other data manipulation libraries.
  • Experience with data cleaning, transformation, and visualization using Pandas.
  • Hands-on experience working with large datasets and performance tuning in Pandas.
  • Experience with version control systems like Git.
  • Strong hands-on experience with HDFS, Hive, Spark, HBase, and YARN.
  • Experience writing optimized SQL queries for Big Data analytics.
  • Understanding of Kafka, Flume, or Sqoop for data ingestion.
  • Understanding of the AutoSys job scheduling tool.

Education

Any Graduate