Description

Collaborating with, and leading part of, a cross-functional Agile team to create and enhance software for a data ingestion and entity resolution platform

Expertise in application, data, and infrastructure architecture disciplines

Working with large, complex data sets from a variety of sources

Participating in the rapid development of user-driven prototypes to identify technical options and inform multiple architectural approaches

Building efficient storage and search functions over structured and unstructured data

Using programming languages (Python, Java, Scala) along with relational and NoSQL databases

Learning newer technologies for entity resolution, such as the Quantexa platform

Proven track record of at least 4 years in management, in a space with a strong focus on large-scale data processing and instrumentation.

Strong coding background, ideally in Java / Python / Scala

Strong working knowledge of engineering best practices and the big data ecosystem.

Experience in at least one big data product: Databricks, Elasticsearch, Snowflake

Experience building batch / real-time data pipelines for production systems.

Experience with relational and non-relational databases such as DB2 and MongoDB

Experience with various data formats: Parquet, CSV, JSON, XML, Relational Data

Strong familiarity with Kafka, Spark, Hadoop, Iceberg, Airflow, Data Modeling, relational databases, columnar databases

Previous working experience with large-scale distributed systems.

Strong familiarity with software engineering principles, including object-oriented and functional programming paradigms, design patterns, and code quality practices.

Excellent communication skills, with the ability to effectively collaborate with cross-functional teams and explain technical concepts to non-technical stakeholders.

Experience with REST-based applications

Experience with Databricks / Delta Lake

Experience with client reference data sourcing from vendors

Big Data Products - Databricks, Elasticsearch, Snowflake

Coding - Java/Python/Scala

Relational and Non-Relational DBs like DB2, MongoDB

Kafka, Spark, Airflow, Hadoop


Education

Any Graduate