Description

MUST HAVES:
5+ years of experience with AWS services (Glue, DynamoDB, Lambda, Redshift, Elasticsearch) on large-scale, enterprise-level deployments
5+ years of experience with APIs and strong knowledge of data integration patterns (e.g., streaming, batch, error and replay)
5+ years of experience with Python, PySpark, and Apache Spark

Skills:
Software development experience with Python, PySpark and Apache Spark
Proficiency in SQL, relational and non-relational databases, query optimization, and data modelling.
Strong knowledge of data integration patterns (e.g., streaming, batch, error and replay) and data analysis techniques.
Experience with GitHub, Jenkins, and Terraform.
Experience with Teradata (Vantage) or other RDBMS and ETL tools.
Experience designing and developing data pipelines for data ingestion and transformation using Spark.
Excellent at troubleshooting performance and data-skew issues.
Working knowledge of implementing data lake ETL using AWS Glue, Databricks, etc.
Experience with large scale distributed relational and NoSQL database systems.
Excellent communication skills.
Ability to provide recommendations and design optimal configurations for large-scale deployments.

Required Skills: 
Python
SQL
AWS
Healthcare
Data Analytics
GIT

Education

Any Graduate