To be successful in this role, a strong interest in and solid knowledge of data engineering are needed, both in developing data pipelines and in designing data models.
Technical skills and core competencies
Strong understanding of data architecture and data models, and experience leading data-driven projects.
Solid expertise in, and well-founded opinions on, data modelling paradigms such as Kimball, Inmon, data marts, Data Vault, and Medallion.
Strong experience with cloud-based data strategies and big data technologies (AWS preferred). The ability to build backend services in Python that enable the data pipelines is required.
Demonstrated experience designing data platforms on AWS for batch and stream processing pipelines.
Hands-on experience with AWS managed services and other big data services such as EMR, Glue, S3, Kinesis, DynamoDB, and ECS is a must.
Strong understanding of how Apache Spark works is a must.
Strong understanding of data lake/lakehouse storage formats such as Delta, Iceberg, and Hudi.
Experience designing a data lakehouse using the Medallion architecture is desirable (see the first sketch after this list).
Solid experience designing ETL data pipelines, with expert knowledge of ingestion, transformation, and data quality, is a must.
Hands-on experience in SQL is a must.
Expertise designing ETL pipelines that combine Python and SQL is required (see the second sketch after this list).
Understanding of Python data manipulation libraries such as Pandas, Polars, and DuckDB is desirable.
Experience designing data visualizations with tools such as Tableau and Power BI is desirable.
Working knowledge of other data platforms such as Azure, Databricks, and Snowflake is desirable but not a must.
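As a rough illustration of the lakehouse work described above, here is a minimal sketch of a Medallion-style flow in PySpark. It assumes the delta-spark package is installed, and the S3 paths, column names, and quality rules are hypothetical examples, not specifics of this role.

```python
# A minimal Medallion sketch: bronze (raw) -> silver (cleaned) -> gold (aggregated).
# All paths and column names are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("medallion-sketch")
    # These two settings enable Delta Lake (assumes delta-spark is installed).
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Bronze: raw events landed as-is by the ingestion layer.
bronze = spark.read.format("delta").load("s3://example-lake/bronze/events")

# Silver: deduplicated records with a basic data-quality filter.
silver = (
    bronze
    .dropDuplicates(["event_id"])
    .filter(F.col("event_ts").isNotNull())
)
silver.write.format("delta").mode("overwrite").save("s3://example-lake/silver/events")

# Gold: an aggregated, analytics-ready table for downstream consumers.
gold = silver.groupBy("customer_id").agg(F.count("*").alias("event_count"))
gold.write.format("delta").mode("overwrite").save("s3://example-lake/gold/events_by_customer")
```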
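And here is a second sketch of the Python-plus-SQL style of ETL mentioned above, using DuckDB to run SQL directly over a Pandas DataFrame. The file names, columns, and quality rule are assumptions for illustration only.

```python
# A minimal Python + SQL ETL step: extract with Pandas, transform with SQL in
# DuckDB, load to Parquet. Inputs and columns are hypothetical.
import duckdb
import pandas as pd

# Extract: load raw orders (file name is an assumption).
raw = pd.read_csv("orders.csv")

# Transform: DuckDB can query the local DataFrame `raw` by name.
con = duckdb.connect()
clean = con.execute(
    """
    SELECT order_id,
           customer_id,
           CAST(order_ts AS TIMESTAMP) AS order_ts,
           amount
    FROM raw
    WHERE amount > 0  -- simple data-quality rule
    """
).df()

# Load: write the curated result for downstream consumers (needs pyarrow).
clean.to_parquet("orders_clean.parquet")
```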
Responsibilities
Participate in designing and developing features in the existing data warehouse.
Provide leadership in establishing connections between the engineering, product, and analytics/data science teams.