SUMMARY:
∙Develop and implement a strategic data analytics roadmap for the healthcare payer business, aligned
with overall business objectives.
∙Design and execute complex data analysis projects focused on areas like risk rating, claims
adjudication, and enrollment optimization.
∙Conduct statistical analysis and modeling to identify trends, patterns, and key insights from
healthcare payer data.
∙ Minimum 5 years of experience in healthcare payer analytics, with a proven track record of success
in leading and delivering impactful projects.
∙Strong understanding of risk adjustment methodologies (e.g., Hierarchical Condition Category (HCC)
coding) and their impact on healthcare payer reimbursement (a simplified illustration follows this list).
∙In-depth knowledge of healthcare claims and enrollment data structures and processes.
∙Proven experience using big data technologies such as Hadoop or Spark on cloud platforms such as
AWS.
∙Proficiency in programming languages like Scala, Python, or R for data manipulation and analysis.
∙Excellent communication, presentation, and interpersonal skills with the ability to effectively
translate technical findings to a non-technical audience.
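For illustration, the kind of risk adjustment calculation referenced above can be sketched in a few lines of Python: a member's risk score is the sum of a demographic factor and the factors of the HCCs assigned to that member. The factor values, age/sex cell, and HCC numbers below are invented placeholders, not figures from any published CMS-HCC model, and real models also apply condition hierarchies and interaction terms.

    # Hedged sketch: all factor values are invented for demonstration only.
    DEMOGRAPHIC_FACTORS = {("F", "70-74"): 0.45}      # (sex, age band) -> factor
    HCC_FACTORS = {18: 0.30, 85: 0.33, 96: 0.27}      # HCC number -> factor

    def risk_score(sex, age_band, hccs):
        """Sum the demographic factor and the factor of each assigned HCC.

        Real CMS-HCC models also apply hierarchies (a more severe condition
        suppresses a less severe one in the same family) and interaction
        terms, which are omitted here for brevity.
        """
        score = DEMOGRAPHIC_FACTORS.get((sex, age_band), 0.0)
        score += sum(HCC_FACTORS.get(h, 0.0) for h in hccs)
        return round(score, 3)

    # Example: a 72-year-old female member coded with HCC 18 and HCC 85.
    print(risk_score("F", "70-74", {18, 85}))         # 0.45 + 0.30 + 0.33 = 1.08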
KEY DUTIES AND RESPONSIBILITIES:
∙Design, develop, and maintain robust data pipelines using Python and PySpark to process large
volumes of healthcare data efficiently in a multitenant analytics platform.
∙Collaborate with cross-functional teams to understand data requirements, implement data models,
and ensure data integrity throughout the pipeline.
∙Optimize data workflows for performance and scalability, considering factors such as data volume,
velocity, and variety.
∙Implement best practices for data ingestion, transformation, and storage in AWS services such as S3,
Glue, EMR, and Redshift.
∙Model data in relational databases (e.g., PostgreSQL, MySQL) and file-based databases to support
data processing requirements.
∙Design and implement ETL processes using Python and PySpark to extract, transform, and load data
from various sources into target databases (see the sketch following this list).
∙Troubleshoot and enhance existing ETL jobs and processing scripts to improve the efficiency and
reliability of data pipelines.
∙Develop monitoring and alerting mechanisms to proactively identify and address data quality issues
and performance bottlenecks.
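As a rough illustration of the pipeline, ETL, and monitoring duties above, the PySpark sketch below reads raw claims from S3, applies basic cleansing, runs a simple data quality gate, and writes partitioned output. The bucket paths, column names, and quality rule are hypothetical placeholders chosen for the example, not details taken from this role.

    from pyspark.sql import SparkSession, functions as F

    # Hypothetical locations; substitute the platform's actual buckets and prefixes.
    CLAIMS_PATH = "s3://example-bucket/raw/claims/"
    OUTPUT_PATH = "s3://example-bucket/curated/claims/"

    spark = SparkSession.builder.appName("claims-etl-sketch").getOrCreate()

    # Extract: read raw claim records (Parquet assumed for this sketch).
    claims = spark.read.parquet(CLAIMS_PATH)

    # Transform: basic typing, cleansing, and de-duplication on assumed key columns.
    curated = (
        claims
        .withColumn("service_date", F.to_date("service_date", "yyyy-MM-dd"))
        .withColumn("service_year", F.year("service_date"))
        .withColumn("paid_amount", F.col("paid_amount").cast("decimal(12,2)"))
        .dropDuplicates(["claim_id", "claim_line"])
    )

    # Simple data quality gate: fail fast (and alert) if a key field is missing.
    null_members = curated.filter(F.col("member_id").isNull()).count()
    if null_members > 0:
        raise ValueError(f"{null_members} claim lines have a null member_id")

    # Load: write partitioned output for downstream consumers (e.g., Glue, Redshift).
    curated.write.mode("overwrite").partitionBy("service_year").parquet(OUTPUT_PATH)

In practice such a job would typically run on EMR or Glue, with the quality check feeding a monitoring or alerting channel rather than only raising an exception.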
EDUCATION AND EXPERIENCE:
∙Minimum of 5 years of experience in data engineering, with a focus on building and optimizing data
pipelines.
∙Expertise in Python programming and hands-on experience with PySpark for data processing and
analysis.
∙Proficiency in Python frameworks and libraries for scientific computing (e.g., NumPy, pandas, SciPy,
PyTorch, PyArrow).
∙Strong understanding of AWS services and experience in deploying data solutions on cloud platforms.
∙Experience working with healthcare data, including but not limited to eligibility, claims, payments,
and risk adjustment datasets.
∙Expertise in modeling data in relational databases (e.g., PostgreSQL, MySQL) and file-based
databases, as well as in ETL processes and data warehousing concepts.
∙Proven track record of designing, implementing, and troubleshooting ETL processes and processing
scripts using Python and PySpark.
∙Excellent problem-solving skills and the ability to work independently as well as part of a team.
∙Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
∙Relevant certifications in AWS or data engineering would be a plus.
KNOWLEDGE, SKILLS AND ABILITIES:
∙Expertise in Python programming language for data processing and analysis.
∙Expertise in PySpark for building scalable data pipelines.
∙In-depth knowledge of AWS services such as S3, Glue, EMR, and Redshift for data storage and
processing.
∙Familiarity with relational databases (e.g., PostgreSQL, MySQL) and file-based databases for data
modeling and storage.
∙Understanding of data modeling, ETL processes, and data warehousing concepts.
∙Knowledge of best practices in data engineering and experience in optimizing data workflows for
performance and scalability.
∙Experience in healthcare data domains, including eligibility, claims, payments, and risk adjustment
datasets.
∙Up-to-date knowledge of emerging technologies and trends in data engineering.
∙Strong problem-solving skills and the ability to troubleshoot and optimize data pipelines and ETL
processes.
∙Excellent communication and collaboration skills to work effectively with cross-functional teams.
∙Proficient in designing, implementing, and maintaining data pipelines for processing large volumes of
data.
∙Ability to model data in relational and file-based databases to support data processing requirements.
∙Skill in developing monitoring and alerting mechanisms to ensure data quality and pipeline reliability.
∙Experience in deploying data solutions on cloud platforms and utilizing AWS services for data
processing.
∙Proficiency in writing efficient and maintainable code for data processing tasks.
∙Ability to stay organized, prioritize tasks, and meet project deadlines effectively.
∙Ability to work independently and in a team-oriented, collaborative environment.
∙Strong analytical skills to identify and address data quality issues and performance bottlenecks.
∙Capability to innovate and recommend solutions for continuous improvement in data engineering
processes.
∙Ability to communicate complex technical concepts to non-technical stakeholders effectively.
∙Strong attention to detail and commitment to delivering high-quality work.
∙Ability to deal with problems involving several concrete variables in standardized situations.
∙Ability to interact politely, tactfully and firmly with a wide range of people and personalities.
∙Ability to work in an environment with potential interruptions.
∙Ability to manage multiple simultaneous tasks with individual timeframes and priorities.