We are looking for a passionate and skilled Python Data Engineer to join our Singapore team.
The role covers the following areas:
- Collaborate with BAs to clarify requirements in a fast-paced environment.
- Develop scalable Python and PySpark scrapers integrated with existing Databricks frameworks.
- Ingest structured and unstructured data from websites (HTML, PDF, Excel, APIs, CSV, etc.).
- Build and test scrapers in Databricks (DBX), following GM design patterns and using shared libraries.
- Use ADF for orchestration and integrate with GM frameworks for logging, error handling, and translation.
- Write clean, reusable Python code for data processing, automation, and transformation.
- Monitor, debug, and maintain data pipelines to ensure reliability and fast issue resolution.
- Review technical specifications and work closely with FO Analysts for validation and clarification.
- Ensure scrapers support Japanese websites that publish power market data and handle their HTML/API nuances.
- Follow IaC and AZD standards, as per the GM setup, for deployment and infrastructure.
- Document pipelines and approaches based on existing standards; explain solutions clearly to users.
- Present work regularly to the Data Engineering Manager, wider team, and Head of Data when needed.
- Validate solutions against functional and non-functional requirements.
- Demonstrate a proactive, problem-solving, and delivery-focused mindset.
Technologies:
- Strong expertise in Python programming and SQL.
- Hands-on experience with web scraping and industry best practices.
- Familiarity with Python libraries for language translation (nice to have).
- Knowledge of modern cloud-based data architectures, including Data Lakehouse on Databricks.
- Experience with Databricks and Azure is highly desirable.
- Good understanding of Big Data frameworks like Spark and file formats like Parquet.
Software engineering and delivery:
- Source code management (e.g., Azure DevOps, Git).
- Agile delivery methodologies such as Scrum or Kanban.
- Knowledge and work management tools (e.g., JIRA, Confluence).
- Certification in Data Engineering, Azure, or Python.