Job Description
Python/PySpark
- Extensive hands-on expertise with Python and experience working as a Python data engineer
- Strong expertise with Git and CI/CD using Azure Pipelines
- Strong expertise in testing, testing frameworks, debugging tools, and in particular test automation
- Strong analytical skills, including working with IT and business documentation, policies, and data
- Ideally a background in Finance & Risk, especially analytical models and modelling in general
- Strong communication and stakeholder-alignment skills, and the ability to perform well under pressure
Last but not least, consider the possibility of an API layer through which Databricks jobs can be invoked, most likely built with Azure App Services and Python; a rough sketch follows below.
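A minimal sketch of what such an API layer could look like, assuming a small Flask app hosted on Azure App Services that triggers runs through the Databricks Jobs REST API; the host, token, and job ID below are illustrative placeholders, not project values:

    # Minimal sketch: a Flask endpoint (e.g. hosted on Azure App Services) that
    # invokes an existing Databricks job via the Jobs 2.1 REST API.
    # DATABRICKS_HOST, DATABRICKS_TOKEN and the job id are placeholders.
    import os
    import requests
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    DATABRICKS_HOST = os.environ["DATABRICKS_HOST"]    # e.g. the workspace URL
    DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]  # access token from app settings

    @app.route("/jobs/<int:job_id>/run", methods=["POST"])
    def run_job(job_id: int):
        """Trigger the given Databricks job and return the run information."""
        resp = requests.post(
            f"{DATABRICKS_HOST}/api/2.1/jobs/run-now",
            headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
            json={"job_id": job_id, "notebook_params": request.get_json(silent=True) or {}},
            timeout=30,
        )
        resp.raise_for_status()
        return jsonify(resp.json()), 202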
In addition, the text below captures both our standard and our more company-specific requirements:
Job Description:
We are seeking a highly skilled Python Data Engineer with expertise in the Finance and Risk domain to join our dynamic team. The ideal candidate will have extensive experience in analyzing business requirements, designing technical solutions, implementing complex business logic, and ensuring a seamless transition and optimization of activities within the migration process. The position requires strong proficiency in Python, PySpark, and Databricks, experience with analytical models, and a solid understanding of data structures and data quality. The candidate will oversee end-to-end development, from requirement analysis to deployment, while ensuring robust testing and validation processes.
Key Responsibilities:
- Requirement analysis:
  - Analyse and interpret business documents related to the Finance and Risk domain, especially corporate credit risk analytical models
  - Collaborate with stakeholders to translate business needs into clear technical requirements.
- Data analysis and understanding: Analyse and interpret data, understand data structures and concepts, assess data quality, identify inconsistencies, and resolve issues as needed.
- Data pipeline development: Design, build, and maintain scalable data pipelines and workflows using Python, PySpark, and Databricks, and enhance performance by leveraging the capabilities of Python and PySpark (see the pipeline sketch after this list).
- Testing and Validation:
  - Understand the idea, methodology, scope, and target of each implementation
  - Conduct thorough testing and validation of implementations to ensure they meet performance and functionality standards, with comprehensive test coverage
  - Perform reviews, debugging, troubleshooting, and issue resolution
- Deployment and Maintenance:
  - Deploy solutions using Azure Pipelines
  - Monitor and maintain deployed solutions
- Code and Data Conversion: Convert the SAS code of analytical assets to Python or PySpark, ensuring the accuracy and efficiency of the new programs, and work with the SAS team on this reverse engineering (see the conversion example after this list).
- Migration Planning: Contribute to developing a comprehensive migration plan, including timelines, resource allocation, and risk management.
- Ensure all work is completed in compliance with internal Privacy and Security Policies and Procedures and with migration project practices.
- Meet timelines and milestones by monitoring deliverables; identify, report, and help resolve potential risks and issues.
- Documentation: Maintain detailed documentation of migration processes, including code changes, testing procedures, and performance metrics.
- Collaboration: Work closely with business analysts, the modelling team, SAS developers, data scientists, data engineers, and other stakeholders to ensure successful migration and integration of programs.
- Training and Support: Provide training and support to team members on the new Databricks and Python/PySpark-based programs.
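
As a rough illustration of the pipeline development work listed above, the sketch below shows a single PySpark transformation step of the kind such pipelines are built from; the table and column names are invented for the example and do not refer to actual project assets:

    # Illustrative only: a small PySpark step of the kind the pipelines above
    # are built from. Table and column names are made up for the example.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()  # on Databricks the session already exists

    exposures = spark.read.table("raw.credit_exposures")  # hypothetical source table

    enriched = (
        exposures
        .filter(F.col("reporting_date") == "2024-12-31")
        .withColumn("ead_eur", F.col("exposure_at_default") * F.col("fx_rate_to_eur"))
        .groupBy("counterparty_id")
        .agg(F.sum("ead_eur").alias("total_ead_eur"))
    )

    enriched.write.mode("overwrite").saveAsTable("curated.counterparty_ead")  # hypothetical target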
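
Likewise, the following example only hints at the kind of SAS-to-PySpark conversion mentioned under Code and Data Conversion; the SAS snippet and all table and column names are invented for illustration:

    # Illustrative SAS-to-PySpark conversion using an invented snippet.
    # Original SAS (hypothetical):
    #   data work.high_risk;
    #     set risk.loans;
    #     where pd > 0.05;
    #     ead_eur = ead * fx_rate;
    #   run;
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    loans = spark.read.table("risk.loans")  # hypothetical table mirroring the SAS library

    high_risk = (
        loans
        .where(F.col("pd") > 0.05)
        .withColumn("ead_eur", F.col("ead") * F.col("fx_rate"))
    )

    high_risk.createOrReplaceTempView("high_risk")  # rough equivalent of the SAS WORK dataset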