Description

Responsibilities

• Architecting and implementing robust, high-performance data pipelines that integrate various data sources (databases, APIs, web services) to efficiently process and transform large datasets.

• Enabling and ensuring data collection from various sources and loading it into a data lake, data warehouse, or other form of storage.

• Utilizing cloud platforms (Snowflake, AWS, GCP) to build scalable data processing solutions, including managing data lakes and data warehouses.

• Developing and implementing data quality checks and monitoring systems to ensure data accuracy and reliability throughout the data pipeline. 

• Contributing to data documentation and governance strategies, including data access controls, data lineage tracking, and data retention policies.

• Collaborating closely with data analysts, data scientists, and business stakeholders to understand data needs, translate requirements into technical solutions, and deliver actionable insights. 

• Applying strong analytical and problem-solving skills, working both independently and in a team environment.

Qualifications:

• 7+ years of experience with data warehouse technical architectures, ETL/ELT, reporting/analytics tools, and scripting.

• Extensive knowledge and understanding of data modeling, schema design, and data lakes.

• 5+ years of data modeling experience, proficiency in writing advanced SQL and tuning query performance on Snowflake as well as Oracle, and SQL optimization experience with columnar databases.

• Experience with AWS services including S3, Lambda, Data Pipeline, and other data technologies.

• Experience implementing machine learning algorithms for data quality, anomaly detection, and continuous monitoring.

Required skills and experience:

• Strong proficiency in Python and SQL for data manipulation and processing.

• Experience with data warehouse solutions such as Snowflake, BigQuery, and Databricks.

• Ability to design and implement efficient data models for data lakes and warehouses.

• Familiarity with CI/CD pipelines and automation tools to streamline data engineering workflows.

• Deep understanding of data warehousing and cloud architecture principles for building efficient, scalable data systems.

• Experience with Apache Airflow and/or AWS MWAA (Managed Workflows for Apache Airflow).

• Experience with Snowflake’s distinctive features, including its multi-cluster architecture and data sharing capabilities.

• Expertise in distributed processing frameworks such as Apache Spark or other big data technologies is a plus.

Requirements:

• Bachelor’s degree in Computer Science, Information Technology, or a related field.

• Proven experience as an ETL Engineer or similar role in data engineering.

• Strong understanding of ETL concepts, tools, and best practices.

• Proficiency in programming languages such as SQL and Python.

• Experience with ETL tools/platforms such as Databricks and IBM DataStage is a major plus.

• Knowledge of data warehousing concepts and architecture.

• Ability to work collaboratively in a team and manage multiple tasks effectively.

Education

Bachelor's degree