Description

What you'll do

Write clean Python code to extract data from websites, ensuring efficiency, accuracy, and adherence to best practices.

Handle structured and unstructured data from various sources, cleaning and transforming it into usable formats.

Identify and resolve issues related to website changes, access restrictions, and performance bottlenecks.

Work with other developers, data scientists, and stakeholders to understand data requirements and ensure proper documentation of scraping processes.

Continuously learn about new tools and techniques for web scraping and adapt to changes in website structures.

Implement testing procedures for Python-based web scraping scripts.

Adhere to web scraping best practices and legal standards to avoid issues like CAPTCHAs and IP blocking.

What experience you need

Bachelor’s degree in Computer Science, Software Engineering, Information Technology, or a related subject.  

1-2 years of commercial experience in Python coding and scripting.

1+ years of development experience with JavaScript, HTML, and CSS, including knowledge of HTML structure for entity extraction.

Any experience with web scraping, or a conceptual understanding of it, will be a plus.

English proficiency at level B1+ or above.

What could set you apart

Web crawling/scraping experience.

Proficiency in Google Cloud Platform (GCP) services or equivalent cloud platforms.

Understanding of, or experience with, network traffic.

Proficiency in Git and experience with both relational and non-relational databases.

Experience working with diverse data sources and formats.

Familiarity with CI/CD pipelines.

Education

Any Graduate