Job Description

Key Responsibilities:

Develop and maintain data pipelines using Apache Spark, Python, and SQL

Work with data warehousing systems and contribute to scalable data architecture

Apply Python libraries (pandas, NumPy, scikit-learn) for data manipulation and ML workflows

Collaborate on data science projects, from use-case framing to model deployment

Understand big data tools and contribute to DevOps workflows using Docker or Singularity

Required Skills:

4+ years of experience in the Hadoop ecosystem

Strong knowledge of data warehousing, Spark, Python, and SQL

Solid understanding of statistics and ML algorithms

Experience with the full data science lifecycle

Bonus: Exposure to Docker, Singularity, and big data platforms

Education:

Any Graduate