Description

We are looking for a highly skilled and experienced Senior Software Engineer with expertise in Databricks and ETL processes, along with one or more programming languages such as Python, particularly within a highly regulated sector such as healthcare. As a key member of our new team, you will play a pivotal role in shaping and driving the development of our data infrastructure. You will work on data-heavy projects, leveraging Azure technologies and Databricks for large-scale data processing, transformation, and automation. Your experience in building and maintaining robust backend architectures and working with healthcare data (claims, pharma, etc.) will be essential to the success of our projects.

Required Skills -    

Python Development, Databricks, ETL Automation, Azure Cloud

Job Duties -    

Collaborate effectively with data architects, DBAs, product owners, and other stakeholders to ensure alignment on project goals, requirements, and solutions.
Design, develop, and maintain scalable data pipelines in Azure using Python and Databricks, focusing on efficient ETL processes for large datasets; a minimal pipeline sketch follows this group of duties.
Build and optimize data lakes and backend architectures, ensuring smooth data extraction, transformation, and loading (ETL) across systems.
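
For illustration, here is a minimal sketch of the kind of ETL step these duties involve, assuming a Databricks/PySpark environment; the storage path, column names, and table name are hypothetical placeholders, not references to our actual systems.

    # Minimal Databricks-style ETL step (PySpark); all names are placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Databricks notebooks provide a SparkSession as `spark`;
    # getOrCreate() keeps this sketch runnable outside Databricks too.
    spark = SparkSession.builder.getOrCreate()

    # Extract: read raw claims files from an Azure Data Lake landing zone.
    raw = spark.read.json("abfss://landing@exampleaccount.dfs.core.windows.net/claims/")

    # Transform: deduplicate, enforce types, and drop invalid rows.
    clean = (
        raw.dropDuplicates(["claim_id"])
        .withColumn("claim_date", F.to_date("claim_date"))
        .filter(F.col("claim_amount") > 0)
    )

    # Load: append to a curated Delta table for downstream analytics.
    clean.write.format("delta").mode("append").saveAsTable("curated.claims")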

Work with healthcare data (claims, pharma, PHI/PII), ensuring compliance with data privacy regulations and implementing security measures like tokenization; an illustrative tokenization sketch follows below.
Leverage Azure cloud technologies, particularly Databricks, to implement large-scale data processing and analytics solutions.
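
As an illustration of one common tokenization approach (keyed hashing), here is a minimal Python sketch; the key handling and field name are assumptions for illustration only, and a production system would typically use a vault-backed tokenization service (for example, with keys held in Azure Key Vault) rather than a key in code.

    # Illustrative PHI/PII tokenization via keyed hashing (HMAC-SHA256).
    # The key and field name are hypothetical placeholders.
    import hashlib
    import hmac

    SECRET_KEY = b"replace-with-key-from-a-secret-store"

    def tokenize(value: str) -> str:
        # Deterministic, non-reversible token; equal inputs map to equal
        # tokens, so tokenized columns can still be joined across tables.
        return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

    # Example: tokenize a member ID before it leaves the trusted boundary.
    print(tokenize("member-12345"))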

Integrate middleware solutions like MuleSoft for data processing between SAP and Databricks.
Move towards Infrastructure as Code (IaC) using Bicep, and reduce manual coding by utilizing Copilot for Python code generation.

Ensure infrastructure reliability and scalability, taking responsibility for reporting, monitoring, and backup system management.
Use ServiceNow for CMDB with auto-discovery and implement web application firewalls (e.g., Curva) to enhance security.
Drive data transformation and automation tasks, focusing on scalable and efficient solutions within the engineering team.

Desired Skills & Experience -

8+ years of experience in Python development and 5+ years of experience in Databricks and ETL development, designing, building, and optimizing robust data pipelines and ETL automation.
Extensive experience with Python, Databricks, and ETL processes for large-scale data environments.
Familiarity with ML code generation tools (e.g., Copilot) to enhance development efficiency.
Familiarity with healthcare or pharma-related data, with expertise in handling sensitive data (PHI/PII) and implementing security measures such as tokenization.
Strong background in cloud-based data architectures, middleware solutions, and DevOps principles, including the use of Bicep and ServiceNow CMDB.
Strong communication and collaboration skills to work within a dynamic, cross-functional team.
Ability to solve complex technical challenges with innovative solutions.
Strong analytical and problem-solving mindset with a focus on data optimization and scalability.

Education

Any Graduate