Description

We are seeking a Senior Data Engineer with strong expertise in migrating PostgreSQL workloads to Databricks on AWS, specifically within the healthcare domain. The ideal candidate will be Databricks-certified, with a deep understanding of healthcare-specific data formats and compliance requirements. This role is critical to driving cloud modernization efforts for our healthcare clients, enabling secure, scalable, and high-performance data pipelines.
Key Responsibilities:
· Lead the migration of AWS-hosted PostgreSQL workloads to a Databricks Lakehouse architecture.
· Design and implement scalable ETL/ELT pipelines using Databricks (PySpark, SQL, Delta Lake) in the AWS ecosystem.
· Work with complex healthcare data formats, including X12 837 claims, EBCDIC files, and other structured/unstructured formats.
· Implement data masking, profiling, and governance solutions to ensure HIPAA-compliant data handling across ingestion, transformation, and consumption layers.
· Optimize data pipelines for performance, reliability, and cost-efficiency within the AWS ecosystem.
· Collaborate with data architects, analysts, and compliance teams to enforce data security best practices.
· Drive data quality monitoring and reconciliation processes in high-volume, sensitive data environments.
Must-Have Qualifications:
· 8+ years of hands-on data engineering experience with a strong focus on AWS + Databricks environments.
· Proven track record of migrating AWS-hosted PostgreSQL workloads to modern lakehouse platforms.
· Databricks Certified Data Engineer (Associate or Professional).
· Experience with healthcare data and regulatory formats such as X12 837, HL7, and EBCDIC files.
· Deep expertise in data masking, governance frameworks, and working in HIPAA-regulated environments.
· Working knowledge of CI/CD processes and practices in a cloud ecosystem.
· Proficiency in PySpark, SQL, Delta Lake, and data transformation in Databricks.
· Solid understanding of data ingestion from AWS services (S3, Glue, Lambda) into Databricks.
Preferred Skills:
· Familiarity with FHIR and other clinical data standards.
· Experience with Unity Catalog and Lakehouse governance frameworks.
· Knowledge of Agile methodologies and CI/CD practices for data pipelines.
· Prior experience working with healthcare payers, providers, or clearinghouses is highly desirable, as this role is healthcare domain-focused.
 

Education

Any Graduate