Develop and maintain data pipelines that move data from one system to another (MS SQL, Snowflake, email archives) while ensuring data quality and integrity
Write efficient Python scripts to automate data processing, data cleaning, and data visualization tasks
Analyze large datasets to identify trends, opportunities, and challenges in retail operations
Optimize data processes and SQL queries to overcome challenges with massive datasets (billions of rows)
Use large language models (LLMs) via APIs to build text classifiers, extract information from email conversations, and perform other natural language processing tasks
Collaborate with data scientists to develop predictive models and identify opportunities for model improvement
Effectively communicate complex data insights and recommendations to non-technical stakeholders
The Must-Haves
Associate degree (or higher) in Computer Science, Mathematics, Statistics, Physics or a related field
2+ years of experience as a Data Analyst, Data Engineer, or in a similar role
Strong SQL skills, with experience in querying and manipulating large datasets
Proficiency in Python, with experience in data analysis and automation
Experience with Airflow or a similar open-source workflow orchestrator
Experience with Airbyte or a similar open-source data integration tool
Knowledge of machine learning algorithms and experience implementing them in a production environment
Ability to work in a fast-paced environment and meet deadlines
Strong analytical, problem-solving, and communication skills
The Nice-to-Haves
Experience with Snowflake
Experience working with retail datasets such as purchase orders (PO), invoices, and sales at the supplier and item level
Experience with containerization using Docker and orchestration using Kubernetes