Roles and responsibilities:
- Lead the design, development, and testing of data ingestion pipelines; perform end-to-end validation of ETL processes for the various datasets being ingested into the big data platform.
- Perform data migration and conversion validation activities across different applications and platforms.
- Provide technical leadership on data profiling, discovery, and analysis, assess the suitability and coverage of data, and identify the data types, formats, and data quality issues that exist within a given data source.
- Contribute to the development of transformation logic, interfaces, and reports as needed to meet project requirements.
- Participate in discussions on technical architecture, data modeling, and ETL standards; collaborate with Product Managers, Architects, and Senior Developers to establish the physical application framework (e.g., libraries, modules, execution environments).
- Lead the design and development of a validation framework and integrated automated test suites to validate end-to-end data pipeline flow, data transformation rules, and data integrity.
- Develop tools to measure data quality and visualize anomaly patterns in source and processed data.
- Assist the Manager in project planning and validation strategy development.
- Provide support for user acceptance testing and production validation activities.
- Provide technical recommendations on data validation tools and recommend new technologies to improve the validation process.
- Evaluate existing methodologies and processes and recommend improvements.
- Work with stakeholders, Product Management, the Data and Design and Architecture teams, and executives to call out issues and to guide and contribute to resolution discussions.
Must have:
- 8+ years of software development and testing experience.
- 4+ years of working experience with tools such as Spark, HBase, Hive, Sqoop, Impala, Kafka, Flume, Oozie, MapReduce, etc.
- 4+ years of programming experience in Scala, Java, or Python.
- Experience in technically leading and mentoring teams.
- Experience developing and testing ETL, real-time data-processing, and analytics application systems.
- Strong knowledge of Spark SQL and Scala development in a big data Hadoop environment, and/or BI/DW development experience.
- Strong knowledge of shell scripting.
- Experience in web services and API development and testing.
- Experience with development and automation frameworks in a CI/CD environment.
- Experience with cloud environments (AWS or GCP) is a plus.
- Knowledge of Git/Jenkins and pipeline automation is a must.
- A solid understanding of common software development practices and tools.
- Strong analytical skills with a methodical approach to problem solving, applied to the big data domain.
- Good organizational skills and strong written and verbal communication skills.
Nice to have:
- Working experience on large migration projects is a big plus.
- Working experience on Google Cloud Platform is a big plus.
- Experience developing tools and utilities for monitoring and alerting.
- Familiarity with project management and bug tracking tools, e.g., JIRA or a similar tool.