PySpark Developer

Work with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
Analyze data (which attributes make the most sense, variable attribution).
Rewrite code in a different language.
Create Hive internal and external tables, load data, and write Hive queries, which run internally as MapReduce jobs.
Migrate Hive queries to Impala.
Create batch analysis job prototypes using Hadoop, Pig, Oozie, Hue and Hive.
Assist with data capacity planning and node forecasting.

Senior-level developer with expertise in Spark.
Expert in an object-oriented programming language such as Java or Python. Linux knowledge will be an added advantage.
Good debugging skills.
Good knowledge of Oracle SQL.
Senior-level knowledge of data science coding.
Proficiency in computer science fundamentals: object-oriented design, data structures, and algorithms.

6+ years of hands-on design and coding experience with big data, Hadoop, Spark/Scala, and Python.
Senior-level knowledge and 5+ years of recent experience with Python/Spark code development and software engineering, using data science packages in Python (pandas, NumPy, etc.), R, and Spark.
Experience with Spark and Hive queries.

Bachelor’s Degree in Computer Science, Computer Engineering or a closely related field.

Any Graduate
