Description

·        Provides support and maintenance for Hadoop and its ecosystem components, including HDFS, YARN, Hive, LLAP, Druid, Impala, Spark, Kafka, HBase, Cloudera Workbench, etc.

·        Accountable for storage, performance tuning, and volume management of Hadoop clusters and MapReduce routines.

·        Deploys Hadoop clusters, adds and removes nodes, tracks jobs, monitors critical parts of the cluster, configures NameNode high availability, and schedules, configures, and takes backups.

·        Installs and configures software, applies patches, and performs upgrades as needed.

·        Plans capacity for, and implements, new and upgraded hardware and software releases for the storage infrastructure.

·        Handles cluster design, capacity planning, cluster setup, performance tuning, monitoring, infrastructure planning, scaling, and administration.

·        Communicates with other development, administration, and business teams, including infrastructure, application, network, database, and business intelligence teams.

·        Responsible for data lake and data warehouse design and development.

·        Collaborates with various technical and non-technical resources, such as infrastructure and application teams, on project work, POCs (proofs of concept), and troubleshooting exercises.

·        Configures Hadoop security, specifically Kerberos integration, and implements it as needed.

·        Creates and maintains job and task scheduling and administers jobs.

·        Responsible for data movement in and out of Hadoop clusters and for data ingestion using Sqoop and/or Flume (a brief illustrative sketch follows this list).

·        Reviews Hadoop environments and determines compliance with industry best practices and regulatory requirements.

·        Performs data modeling, design, and implementation based on recognized standards.

·        Serves as a key contact for vendor escalations.

·        Participates in an on-call rotation to support a 24/7 environment and is expected to work outside business hours to support corporate needs.
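For illustration only: duties such as Kerberos-authenticated data ingestion with Sqoop are commonly automated with small scripts. The Python sketch below is a minimal example under stated assumptions; the keytab path, principal, JDBC URL, table name, and HDFS target directory are all hypothetical, while the kinit, Sqoop, and HDFS command-line options shown are standard.

    import subprocess

    # Hypothetical values for illustration; a real cluster supplies its own.
    KEYTAB = "/etc/security/keytabs/etl.keytab"
    PRINCIPAL = "etl@EXAMPLE.COM"
    JDBC_URL = "jdbc:mysql://db.example.com/sales"

    def run(cmd):
        # Run a command, raise on a non-zero exit code, and return its stdout.
        return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

    # Obtain a Kerberos ticket before touching the secured cluster.
    run(["kinit", "-kt", KEYTAB, PRINCIPAL])

    # Ingest one table into HDFS with Sqoop (standard import options;
    # JDBC credential handling is omitted for brevity).
    run(["sqoop", "import",
         "--connect", JDBC_URL,
         "--table", "orders",
         "--target-dir", "/data/raw/orders",
         "--num-mappers", "4"])

    # Sanity-check cluster capacity and DataNode health after the load.
    print(run(["hdfs", "dfsadmin", "-report"]))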

 

Minimum Qualifications:

·        Bachelor's degree in Information Systems, Engineering, Computer Science, or related field from an accredited university.

·        Intermediate experience in a Hadoop production environment.

·        Must have intermediate experience and expert knowledge with at least 4 of the following:

o    Hands-on experience with Hadoop administration in Linux and virtual environments.

o    Well-versed in installing and managing Hadoop distributions (Cloudera).

o    Expert knowledge of and hands-on experience with Hadoop ecosystem components, including HDFS, YARN, Hive, LLAP, Druid, Impala, Spark, Kafka, HBase, Cloudera Workbench, etc.

o    Thorough knowledge of the overall Hadoop architecture.

o    Experience using and troubleshooting open-source technologies, including configuration management and deployment.

o    Data lake and data warehouse design and development.

o    Experience reviewing existing database and Hadoop infrastructure and determining areas of improvement.

o    Implementing software lifecycle methodology to ensure supported release and roadmap adherence.

o    Configuring NameNode high availability (see the illustrative sketch after this list).

o    Scheduling and taking backups for the Hadoop ecosystem.

o    Data movement in and out of Hadoop clusters. 

o    Good hands-on scripting experience in a Linux environment.

o    Experience with project management concepts, tools (MS Project), and techniques.

o    A record of working effectively with application and infrastructure teams.

·        Strong ability to organize information, manage tasks, and use available tools to contribute effectively to a team and the organization.
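For illustration only: checking NameNode HA state and pulling a namespace checkpoint, two of the qualification areas above, might be scripted as in the Python sketch below. The HA service IDs "nn1"/"nn2" and the backup directory are hypothetical placeholders; the hdfs haadmin and dfsadmin subcommands shown are standard.

    import subprocess

    def hdfs(*args):
        # Thin wrapper over the hdfs CLI; raises on a non-zero exit code.
        return subprocess.run(["hdfs", *args], check=True,
                              capture_output=True, text=True).stdout

    # Query each NameNode in an HA pair. The service IDs "nn1"/"nn2" are
    # conventional defaults; a real cluster defines its own in hdfs-site.xml.
    for nn in ("nn1", "nn2"):
        print(nn, hdfs("haadmin", "-getServiceState", nn).strip())

    # Download the latest fsimage as a simple namespace metadata backup.
    # /backups/fsimage is a hypothetical local destination directory.
    hdfs("dfsadmin", "-fetchImage", "/backups/fsimage")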

Education

Bachelor's degree