• Lead the architecture, design, delivery & deployment of core data platforms, data warehouse and data modeling needs.
• Exceptional understanding of data engineering topics, e.g. building data pipelines, data modeling, and data warehousing.
• Understanding of and experience with cloud software development, specifically on AWS.
• Contribute to the design and execution of data governance and data quality frameworks.
• Have a passion for, and attention to detail in, all aspects of data, from ingestion and validation/quality to transformation, modeling, and storage.
• Interface with various teams, including product, laboratory, web services, and data science.
Minimum Requirements:
• B.S. / M.S. in a quantitative field (e.g. Computer Science, Engineering, Mathematics, Physics, Computational Biology) with at least 6 years of related industry experience, or Ph.D. with at least 4 years of related industry experience
• Substantial experience in architecting and delivering secure, scalable cloud-based data warehouses / data lakes on AWS, Azure, or GCP
• Exceptional experience with data modeling principles, patterns and industry trends.
• Very comfortable designing and reorganizing fact and dimension tables, complex data models, SCDs, etc.
• Solid object-oriented and/or functional programming experience, specifically in Python and Go.
• Expert with data pipelining and workflow engines such as Apache Airflow and Spark, with a proven ability to choose the right frameworks and tools for the requirements.
• Experience with provisioning on AWS Cloud, e.g. with Terraform or CloudFormation, and with leveraging CI in a cloud environment for automation.
• In-depth experience with relational databases, query authoring, and performance tuning.
• Ability to take a high-level requirement and decompose it into clear engineering objectives, which can be further evolved into detailed specifications.
• High emotional quotient to work with potential ambiguity, ask the right questions and engage to drive resolution from requirements to solutions.
The following are highly welcome:
• Proven track record of building and operating scalable data infrastructure, managing data models for hundreds to thousands of tables.
• Experience with various data products, involvement in build vs. buy decisions, and designing solutions with limited resources and/or timelines.
• Experience with DevOps, e.g. CI/CD pipelines, containerized deployment, infrastructure as code, Terraform.
• Experience building microservices and web applications.
• Experience with supporting data science / machine learning data pipelines
Any Graduate