Description

Position Overview

We are looking for an experienced DataHub developer with committer experience to join our team and contribute to the design, development, and optimization of enterprise metadata management and data lineage solutions. The ideal candidate has strong expertise in data cataloging, data lineage, and data governance, along with hands-on experience with DataHub, Spark-based frameworks, and machine learning for anomaly detection. The role combines open-source contribution, technical problem-solving, and metadata management expertise.

 

Required Qualifications

Experience:
5+ years of experience in metadata management, data lineage, or data governance roles.
Proven track record as a committer or active contributor to the DataHub open-source project.
Technical Skills:
Proficiency in Java, Python, and REST API development.
Strong experience with Apache Spark for ETL pipeline design and custom framework development.
Expertise in metadata ingestion from systems like data lakes, databases, and ETL tools.
Hands-on experience with AWS services and cost optimization strategies.
Familiarity with machine learning techniques for anomaly detection.
Other Skills:
Strong analytical and problem-solving skills.
Excellent communication and collaboration abilities.


Preferred Qualifications

Knowledge of data governance regulations like GDPR, CCPA, or HIPAA.
Experience with infrastructure-as-code tools such as Terraform or Helm.
Familiarity with other metadata management tools like Amundsen, Collibra, or Alation.
Understanding of version control, CI/CD pipelines, and open-source development practices.
 

Education

Bachelor's degree in any discipline.