Description

Python, SQL, PySpark, Pandas, OpenShift, EKS, ECS, Databricks
Job Description: 
>> Create and set best practices for data ingestion, integration, and access patterns to support both real-time and batch-based consumer data needs
>> Assist with design and lead development of scalable, high-performance data architecture solutions that support both the consumer side of the business and analytic use cases
>> Create comprehensive documentation of designs and processes to support ongoing maintenance and knowledge sharing for both GMP and non-GMP solutions
>> Drive continuous data transformation to minimize technical debt
>> Create test protocols, test scripts, and other validation deliverables
>> Provide technical support to local end users on the data pipelines and Advanced Analytics Solutions developed
Requirements: 
>> Demonstrated experience in designing and implementing complex data systems from the ground up
>> Strong experience with programming languages and frameworks such as Python, SQL, and Spark
>> Experience building batch and streaming pipelines using complex SQL, PySpark, Pandas, and similar frameworks
>> Ability to develop, refine, and optimize Advanced Analytics Solutions using machine learning models to extract insights from complex data sources
>> Experience transforming data using SQL, NoSQL, and Python, and visualizing data using a diverse tool set including, but not limited to, Python and R
>> Experience with cloud services in AWS and/or Microsoft Azure

Education

Any Graduate