Create and maintain optimal data pipeline architecture.
Assemble large, complex data sets that meet functional and non-functional business requirements.
Identify, design and implement internal process improvements: automating manual processes, optimizing data delivery, and redesigning infrastructure for greater scalability.
Work with a team of product and program managers, engineering leaders and business leaders to build data architectures and platforms that support the business.
Design, develop, and operate highly scalable, high-performance, low-cost, and accurate data pipelines on distributed data processing platforms.
Recognize and adopt best practices in data processing, reporting and analysis: data integrity, test design, analysis, validation, and documentation.
Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
Create data tools for analytics and data science team members that help them build and optimize our product into an innovative industry leader.
Required Skills
Expertise in SQL, SQL tuning, and ETL development.
Ability to handle multiple tasks and priorities.
Ability to collaborate with team members to meet project deadlines and milestones.
Ability to work well remotely, respond promptly, communicate clearly, and work in a fast-paced environment.
Quick learner who can rapidly understand complex business needs.
Proficiency with Linux/Unix systems.
Required Experience
(Sr-level) Strong programming experience with object-oriented/object function scripting languages: Scala.
5+ years of experience (Mid-level). Experience with big data tools: Hadoop, Apache Spark, etc.
1+ years of strong technical experience with AWS cloud services (S3, IAM, EC2, EMR, RDS, Redshift, CloudWatch) and DevOps engineering (Docker, Kubernetes, GitHub, Jenkins, CI/CD).
Experience with stream-processing systems: Python, Spark Streaming, etc. (Nice to have).
1+ years of experience with relational SQL databases (such as Postgres or Snowflake) and NoSQL databases (such as Cassandra).
Education Requirements
Bachelor’s Degree in Computer Science, Computer Engineering or a closely related field.