Description

We are seeking a highly skilled Databricks Pipeline Developer with a strong background in Epic EMR (Electronic Medical Records) integration to design, develop, and optimize data pipelines that support healthcare analytics and operational reporting. The ideal candidate will have hands-on experience working with Databricks and Apache Spark, and will be familiar with healthcare datasets, especially those sourced from Epic EMR systems.

This role is ideal for someone who thrives in a data-driven healthcare environment, has a passion for improving patient outcomes through data, and understands the complexities of working with regulated healthcare data.

 

Key Responsibilities:

  • Design, develop, and deploy end-to-end data pipelines using Databricks and Apache Spark to extract, transform, and load (ETL) data from Epic EMR and other healthcare systems.
  • Build reusable, scalable, and secure pipelines to support analytics, dashboards, and real-time reporting.
  • Collaborate with data engineers, data analysts, clinical informaticists, and other stakeholders to define and refine data requirements.
  • Create and maintain documentation for pipeline design, data models, data dictionaries, and technical specifications.
  • Work with FHIR, HL7, Clarity, Caboodle, and other Epic data sources to ensure high-quality data integration.
  • Optimize performance of data processing jobs and tune Spark clusters in Databricks for efficiency and cost management.
  • Ensure data governance, privacy, and security requirements are met according to HIPAA and other healthcare regulations.
  • Participate in Agile development cycles, including story grooming, estimation, development, testing, and deployment.
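As a rough illustration of the kind of extract-transform-load step described in the responsibilities above, the sketch below shows a minimal, plain-Python transform that normalizes raw encounter rows before loading. All field names (patient_id, diagnosis_code, encounter_date) and validation rules are hypothetical assumptions for illustration only, not actual Epic Clarity or Caboodle schema; a production pipeline would typically express this logic in PySpark on Databricks.

```python
import re
from datetime import datetime

# Hypothetical ICD-10 code shape check (illustrative, not a full validator).
ICD10_PATTERN = re.compile(r"^[A-TV-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?$")

def transform_encounters(raw_rows):
    """Normalize raw encounter rows: drop rows missing a patient
    identifier, uppercase and shape-check diagnosis codes, and
    standardize encounter dates to ISO format.

    Field names here are assumptions for the example, not real
    Epic Clarity column names.
    """
    cleaned = []
    for row in raw_rows:
        if not row.get("patient_id"):
            continue  # example governance rule: require a patient identifier
        code = (row.get("diagnosis_code") or "").strip().upper()
        if not ICD10_PATTERN.match(code):
            continue  # skip rows with malformed diagnosis codes
        cleaned.append({
            "patient_id": row["patient_id"],
            "diagnosis_code": code,
            "encounter_date": datetime.strptime(
                row["encounter_date"], "%Y-%m-%d").date().isoformat(),
        })
    return cleaned

# Small demo with made-up rows: the second row lacks a patient ID and the
# third has a malformed code, so only the first survives.
rows = [
    {"patient_id": "P001", "diagnosis_code": "e11.9", "encounter_date": "2024-03-01"},
    {"patient_id": "",     "diagnosis_code": "I10",   "encounter_date": "2024-03-02"},
    {"patient_id": "P003", "diagnosis_code": "XYZ",   "encounter_date": "2024-03-03"},
]
print(transform_encounters(rows))
```

In a real Databricks pipeline, the same cleaning rules would usually run as Spark DataFrame transformations and land in a governed table, but the shape of the logic (filter, validate, standardize, load) is the same.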

 

Required Skills & Experience:

Years  Requirement
5+     Strong experience in building data pipelines using Databricks and Apache Spark
3+     Hands-on experience working with Epic EMR data (Clarity, Caboodle, FHIR APIs, HL7, etc.)
5+     Proficiency in PySpark or Scala, SQL, and notebook development in Databricks
3+     Experience with healthcare data formats, terminology (ICD, CPT, LOINC, etc.), and compliance
2+     Experience with cloud platforms (Azure, AWS, or GCP) for data pipeline deployments
2+     Familiarity with DevOps practices, CI/CD, and version control (Git, Azure DevOps)
--     Strong understanding of ETL frameworks, data modeling, and performance optimization

 

Preferred Qualifications:

  • Epic certification (Clarity, Caboodle, or Bridges) is a strong plus
  • Experience with Delta Lake, Unity Catalog, or Databricks SQL
  • Knowledge of real-time streaming data pipelines using Structured Streaming
  • Experience with Databricks Lakehouse architecture
  • Familiarity with dbt (data build tool) or other transformation frameworks
  • Prior experience working in a HIPAA-regulated environment
  • Strong communication and documentation skills

Education

Any Graduate