Job Description
Working in the Delivery Squad for DAP Core, leading the data engineering work to implement technical initiatives
Assist the Delivery Manager in pulling together requirements, project plans, technical documentation, and end-user documentation
Assist the Delivery Manager in producing the weekly progress pack and RAID log, explaining the technical reasons why deliverables are rebaselined
The primary focus for this role will be to collaborate with the Product Owner on the Planview initiative, specifically the transition of DAP Core from a tier 3 service to a tier 2 service. This is a critical change, especially given the new lines of business migrating to the AWS DAP Platform. Additionally, the Data Engineer will need to support the introduction of Apache Iceberg as the standard table format for data lake storage. This involves partitioning and organising data across multiple nodes, which helps distribute the workload and accelerate data processing. Apache Iceberg offers several features to optimise query performance, including hidden partitioning, predicate pushdown driven by file-level metadata, and schema evolution, and it is typically backed by compressed columnar file formats such as Parquet.
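As a rough illustration of why partitioning and predicate pushdown accelerate queries, the sketch below simulates partition pruning in plain Python. This is not Iceberg's actual API; the record layout, function names, and pruning logic are invented purely to show the idea of skipping whole data files that cannot match a filter.

```python
# Illustrative sketch (not Iceberg's real implementation): partitioning
# data into per-partition "files" and pruning files that cannot match
# a query predicate, so far fewer rows are actually scanned.
from collections import defaultdict

def partition_files(records, partition_key):
    """Group records into per-partition 'files', as a table format might."""
    files = defaultdict(list)
    for rec in records:
        files[rec[partition_key]].append(rec)
    return dict(files)

def query_with_pruning(files, wanted_value):
    """Scan only the files whose partition value satisfies the predicate."""
    scanned = 0
    results = []
    for value, file_records in files.items():
        if value != wanted_value:      # prune: skip files that cannot match
            continue
        scanned += len(file_records)   # rows actually read from storage
        results.extend(file_records)
    return results, scanned

events = [
    {"day": "2024-01-01", "user": "a"},
    {"day": "2024-01-01", "user": "b"},
    {"day": "2024-01-02", "user": "c"},
    {"day": "2024-01-03", "user": "d"},
]
files = partition_files(events, "day")
rows, scanned = query_with_pruning(files, "2024-01-02")
print(len(rows), scanned)  # only 1 of the 4 rows is read
```

In a real Iceberg table the same effect comes from partition values and column statistics stored in manifest metadata, which the query engine consults before opening any data files.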
The Data Engineer will also conduct a thorough review of our environments to scale down the use of pre-production environments, while enhancing code quality. This initiative is expected to contribute to a projected £500k saving over the year.
Craft:
Data Engineer
Skills:
1. AWS Services: In-depth knowledge and hands-on experience with various AWS services relevant to data engineering, such as Amazon S3, Amazon Redshift, AWS Glue, Amazon EMR, AWS Lambda, Amazon Kinesis, Amazon Athena, and Amazon DynamoDB.
2. Data Warehousing: Proficiency in designing, implementing, and optimising data warehousing solutions using AWS services like Amazon Redshift, including schema design, data modelling, performance tuning, and ETL/ELT processes.
3. Big Data Technologies: Familiarity with big data technologies and frameworks like Apache Hadoop, Apache Spark, Apache Kafka, and related AWS services like Amazon EMR, Amazon Kinesis, and AWS Glue for processing and analysing large-scale data sets.
4. Data Pipeline and ETL/ELT: Experience in building scalable data pipelines and performing Extract, Transform, Load (ETL) or Extract, Load, Transform (ELT) processes using AWS services like AWS Glue, AWS Data Pipeline, or custom-built solutions using AWS Lambda and other serverless technologies.
5. SQL and NoSQL Databases: Proficiency in working with SQL-based databases like Amazon Redshift, Amazon RDS, and NoSQL databases like Amazon DynamoDB, as well as knowledge of data modelling, query optimisation, and database design principles.
6. Data Integration: Ability to integrate data from various sources, both on-premises and cloud-based, using AWS services like AWS Glue, AWS Database Migration Service (DMS), or custom-built solutions.
7. Data Security and Governance: Understanding of data security best practices, compliance requirements, and privacy regulations pertaining to data engineering on AWS, including data encryption, access controls, and auditing.
8. Scripting and Programming Languages: Proficiency in scripting languages like Python, as well as experience with programming languages like Java or Scala for developing data processing applications using AWS SDKs and frameworks.
9. Monitoring and Performance Optimisation: Knowledge of monitoring tools and techniques for AWS services, including Amazon CloudWatch, AWS CloudTrail, and other third-party monitoring solutions, to ensure optimal performance, scalability, and reliability of data engineering workflows.
10. Cloud Architecture and Infrastructure: Understanding of cloud architecture principles, AWS infrastructure components, and networking concepts to design robust and scalable data engineering solutions.
11. Collaboration and Communication: Strong collaboration and communication skills to work effectively with cross-functional teams, stakeholders, and data scientists/analysts, and to articulate complex technical concepts to non-technical stakeholders.
12. Continuous Learning: Eagerness to stay updated with the latest AWS services, technologies, and trends in the data engineering domain, and a commitment to continuous learning and professional development.
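To make the ETL/ELT pattern in skill 4 concrete, here is a minimal sketch using only the Python standard library as a stand-in for managed services such as AWS Glue or Lambda. The table name, CSV layout, and transform rule are invented for illustration only.

```python
# Hedged sketch of an extract-transform-load pipeline. In production this
# shape maps onto AWS Glue jobs or Lambda functions; here sqlite3 stands
# in for the warehouse and an inline CSV for the source system.
import csv
import io
import sqlite3

RAW_CSV = """order_id,amount_pence
1,1250
2,300
3,9900
"""

def extract(raw):
    """Extract: parse the raw source feed into dict rows."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows):
    """Transform: cast types and convert pence to pounds."""
    return [(int(r["order_id"]), int(r["amount_pence"]) / 100) for r in rows]

def load(rows, conn):
    """Load: write the transformed rows into a warehouse-style table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount_gbp REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
total = conn.execute("SELECT SUM(amount_gbp) FROM orders").fetchone()[0]
print(total)  # 114.5
```

The same extract/transform/load separation applies whether the stages are custom code or managed services; only the connectors and the orchestration change.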
Education:
Any Graduate