Description

  • Develop and maintain data pipelines that move data from one system to another (MS SQL, Snowflake, email archives) while ensuring data quality and integrity
  • Write efficient Python scripts to automate data processing, data cleaning, and data visualization tasks
  • Analyze large datasets to identify trends, opportunities, and challenges in retail operations
  • Optimize data processes and SQL queries to overcome challenges with massive datasets (billions of rows)
  • Use large language models (LLMs) via APIs to build text classifiers, extract information from email conversations, and perform other natural language processing tasks
  • Collaborate with data scientists to develop predictive models and identify opportunities for model improvement
  • Effectively communicate complex data insights and recommendations to non-technical stakeholders

 

The Must-Haves

  • Associate degree (or higher) in Computer Science, Mathematics, Statistics, Physics or a related field
  • 2+ years of experience as a Data Analyst, Data Engineer, or in a similar role
  • Strong SQL skills, with experience in querying and manipulating large datasets
  • Proficiency in Python, with experience in data analysis and automation
  • Experience with Airflow or a similar open-source workflow orchestrator
  • Experience with Airbyte or a similar open-source integration tool
  • Knowledge of machine learning algorithms and experience implementing them in a production environment
  • Ability to work in a fast-paced environment and meet deadlines
  • Strong analytical, problem-solving, and communication skills

 

The Nice-to-Haves

  • Experience with Snowflake
  • Experience working with retail datasets such as purchase orders (PO), invoices, and sales at the supplier and item level
  • Experience with containerization using Docker and orchestration using Kubernetes
  • Certification in data analysis or a related field

Education

Any Graduate