Responsibilities:
• Architecting and implementing robust, high-performance data pipelines across various data sources (databases, APIs, web services) to efficiently process and transform large datasets.
• Enabling and ensuring data collection from various sources and loading it into a data lake, data warehouse, or other form of storage.
• Proficiently utilizing cloud platforms such as Snowflake, AWS, and GCP to build scalable data processing solutions, including managing data lakes and data warehouses.
• Developing and implementing data quality checks and monitoring systems to ensure data accuracy and reliability throughout the data pipeline (a minimal check sketch follows this list).
• Contributing to data documentation and governance strategies, including data access controls, data lineage tracking, and data retention policies.
• Collaborating closely with data analysts, data scientists, and business stakeholders to understand data needs, translate requirements into technical solutions, and deliver actionable insights.
• Applying strong analytical and problem-solving skills, working effectively both independently and within a team.
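As an illustration of the data quality checks mentioned above, here is a minimal sketch assuming a pandas DataFrame batch; the column names, key columns, and thresholds are hypothetical and would be tuned per pipeline, not a prescribed implementation.

```python
import pandas as pd

def run_quality_checks(df: pd.DataFrame, key_cols: list[str], min_rows: int = 1) -> list[str]:
    """Return a list of human-readable data quality issues found in a batch.

    Thresholds and column names here are illustrative assumptions, not fixed rules.
    """
    issues = []

    # Volume check: an empty or unexpectedly small batch often signals an upstream failure.
    if len(df) < min_rows:
        issues.append(f"row count {len(df)} below minimum {min_rows}")

    # Completeness check: flag columns with more than 5% missing values (assumed threshold).
    null_ratios = df.isna().mean()
    for col, ratio in null_ratios.items():
        if ratio > 0.05:
            issues.append(f"column '{col}' is {ratio:.1%} null")

    # Uniqueness check: business keys should not repeat within a batch.
    dupes = df.duplicated(subset=key_cols).sum()
    if dupes:
        issues.append(f"{dupes} duplicate rows on key {key_cols}")

    return issues


if __name__ == "__main__":
    batch = pd.DataFrame({"order_id": [1, 2, 2], "amount": [10.0, None, 7.5]})
    for problem in run_quality_checks(batch, key_cols=["order_id"]):
        print("QUALITY ISSUE:", problem)
```

In practice, checks like these typically run as a pipeline step before loading, with failures routed to alerting rather than silently dropped.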
Qualifications:
• 7+ years of experience with data warehouse technical architectures, ETL/ELT, reporting/analytic tools, and scripting.
• Extensive knowledge and understanding of data modeling, schema design, and data lakes.
• 5+ years of data modeling experience, with proficiency in writing advanced SQL and tuning query performance on Snowflake as well as Oracle and columnar databases.
• Experience with AWS services including S3, Lambda, Data Pipeline, and other data technologies.
• Experience implementing machine learning algorithms for data quality, anomaly detection, and continuous monitoring.
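To illustrate the continuous-monitoring point above, the following is a minimal sketch of statistical anomaly detection on a pipeline metric such as daily row counts; the z-score threshold and sample values are assumptions for illustration only, not a recommended production setting.

```python
from statistics import mean, stdev

def is_anomalous(history: list[float], latest: float, z_threshold: float = 3.0) -> bool:
    """Flag the latest metric value if it deviates strongly from recent history.

    A simple z-score test; the 3.0 threshold is an illustrative assumption.
    """
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > z_threshold


if __name__ == "__main__":
    daily_row_counts = [10_120, 9_980, 10_050, 10_210, 9_940, 10_005]
    print(is_anomalous(daily_row_counts, latest=4_300))   # True: sharp drop in volume
    print(is_anomalous(daily_row_counts, latest=10_090))  # False: within normal range
```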
Required skills and experience:
• Strong proficiency in Python and SQL for data manipulation and processing.
• Experience with data warehouse solutions such as Snowflake, BigQuery, and Databricks.
• Ability to design and implement efficient data models for data lakes and warehouses.
• Familiarity with CI/CD pipelines and automation tools to streamline data engineering workflows.
• Deep understanding of data warehousing and cloud architecture principles for building efficient, scalable data systems.
• Experience with Apache Airflow and/or AWS MWAA (Managed Workflows for Apache Airflow); a minimal DAG sketch appears after this list.
• Experience with Snowflake’s distinctive features, including multi-cluster warehouses and data sharing.
• Expertise in distributed processing frameworks such as Apache Spark or other big data technologies is a plus.
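As a point of reference for the Apache Airflow item above, the sketch below shows a minimal daily ETL DAG (assuming Airflow 2.x); the DAG id, task names, and callables are hypothetical placeholders, not part of this role's actual pipelines.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

# Placeholder callables -- in a real pipeline these would pull from a source,
# transform the data, and load it into the warehouse.
def extract(): print("extracting from source")
def transform(): print("transforming records")
def load(): print("loading into warehouse")

with DAG(
    dag_id="example_daily_etl",          # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Linear dependency chain: extract -> transform -> load
    extract_task >> transform_task >> load_task
```

The same DAG definition runs unchanged on AWS MWAA, since MWAA is a managed deployment of Apache Airflow.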
Requirements:
• Bachelor’s degree in Computer Science, Information Technology, or a related field.
• Proven experience as an ETL Engineer or similar role in data engineering.
• Strong understanding of ETL concepts, tools, and best practices.
• Proficiency in programming languages such as SQL and Python.
• Experience with ETL tools/platforms such as Databricks and IBM DataStage is a major plus.
• Knowledge of data warehousing concepts and architecture.
• Ability to work collaboratively in a team and manage multiple tasks effectively.