Collaborating with and leading part of a cross-functional Agile team to create and enhance software for a data ingestion and entity resolution platform
Expertise in application, data and infrastructure architecture disciplines
Working with large, complex data sets from a variety of sources
Participating in the rapid development of user-driven prototypes to identify technical options and inform multiple architectural approaches
Building efficient storage and search functions over structured and unstructured data
Utilizing programming languages such as Python, Java, and Scala, along with relational and NoSQL databases
Learning newer technologies for entity resolution, such as the Quantexa platform
Basic Qualifications
Proven track record of at least 4 years in management, with a strong focus on large-scale data processing and instrumentation
Strong coding background, ideally in Java, Python, or Scala
Strong working knowledge of engineering best practices and the big data ecosystem
Experience with at least one big data product, e.g., Databricks, Elasticsearch, or Snowflake
Experience building batch and real-time data pipelines for production systems
Experience with relational and non-relational databases such as DB2 and MongoDB
Experience with various data formats: Parquet, CSV, JSON, XML, and relational data
Strong familiarity with Kafka, Spark, Hadoop, Iceberg, Airflow, Data Modeling, relational databases, columnar databases
Previous working experience with large-scale distributed systems
Strong familiarity with software engineering principles, including object-oriented and functional programming paradigms, design patterns, and code quality practices.
Excellent communication skills, with the ability to effectively collaborate with cross-functional teams and explain technical concepts to non-technical stakeholders.
Desired Qualifications
Experience with REST-based applications
Experience with Databricks and Delta Lake
Experience with client reference data sourcing from vendors