Description

Roles and responsibilities:

  • Lead the design, development, and testing of data ingestion pipelines; perform end-to-end validation of the ETL process for the various datasets being ingested into the big data platform.
  • Perform data migration and conversion validation activities across different applications and platforms.
  • Provide technical leadership on data profiling, discovery, and analysis; assess the suitability and coverage of data; and identify the data types, formats, and data quality issues that exist within a given data source.
  • Contribute to the development of transformation logic, interfaces, and reports as needed to meet project requirements.
  • Participate in discussions on technical architecture, data modeling, and ETL standards; collaborate with Product Managers, Architects, and Senior Developers to establish the physical application framework (e.g., libraries, modules, execution environments).
  • Lead the design and development of a validation framework and integrated automated test suites to validate end-to-end data pipeline flow, data transformation rules, and data integrity.
  • Develop tools to measure data quality and visualize anomaly patterns in source and processed data.
  • Assist the Manager with project planning and validation strategy development.
  • Support user acceptance testing and production validation activities.
  • Provide technical recommendations for selecting data validation tools and recommend new technologies to improve the validation process.
  • Evaluate existing methodologies and processes and recommend improvements.
  • Work with stakeholders, Product Management, Data and Design, Architecture teams, and executives to call out issues and to guide and contribute to resolution discussions.


Must have:

  • 8+ years of software development and testing experience.
  • 4+ years of working experience with tools such as Spark, HBase, Hive, Sqoop, Impala, Kafka, Flume, Oozie, MapReduce, etc.
  • 4+ years of programming experience in Scala, Java, or Python.
  • Experience technically leading and mentoring teams.
  • Experience developing and testing ETL, real-time data processing, and analytics application systems.
  • Strong knowledge of Spark SQL and Scala development in a big data Hadoop environment, and/or BI/DW development experience.
  • Strong knowledge of shell scripting.
  • Experience in web services and API development and testing.
  • Experience with development and automation frameworks in a CI/CD environment.
  • Experience with cloud environments (AWS or GCP) is a plus.
  • Knowledge of Git/Jenkins and pipeline automation is a must.
  • A solid understanding of common software development practices and tools.
  • Strong analytical skills with a methodical approach to problem solving in the big data domain.
  • Good organizational skills and strong written and verbal communication skills.


Nice to have:

  • Working experience on large migration projects is a big plus.
  • Working experience on Google Cloud Platform is a big plus.
  • Experience developing tools and utilities for monitoring and alerting.
  • Familiarity with project management and bug-tracking tools, e.g., JIRA or a similar tool.



Education

Any Graduate