Description

About the Role
We are seeking a self-motivated, experienced ETL Engineer with deep expertise in Databricks, Apache Spark, and cloud-based data engineering. The ideal candidate will bring strong hands-on experience building scalable data pipelines, a solid background in Python, and proven knowledge of AWS services and complex data formats such as JSON and XML. Experience with Domino Data Lab is also required.

Key Responsibilities
• Design, build, and maintain scalable ETL pipelines using Databricks and Spark SQL.
• Work with large-scale structured and semi-structured data, including complex JSON and XML.
• Optimize Spark jobs for performance and scalability within the Databricks environment.
• Leverage in-depth knowledge of Databricks internals to troubleshoot and improve pipeline reliability.
• Integrate and manage data using AWS services such as S3, IAM, and Secrets Manager.
• Build and manage ETL workflows with AWS Glue.
• Deploy and operate containerized workloads using AWS ECS and/or EKS.
• Write clean, efficient, and testable Python code to support data engineering workflows.
• Collaborate with data scientists, analysts, and engineers to meet business needs.
• Work within Domino Data Lab for model development and collaboration workflows.

Required Qualifications
• 4+ years of experience working in Databricks, including an understanding of platform internals and advanced configuration.
• 5+ years of experience with Apache Spark, with expert-level skills in Spark SQL.
• Proficiency in Python for ETL and data pipeline development.
• Strong hands-on experience working with complex JSON and XML structures.
• 3+ years of experience with Domino Data Lab, using the platform for data science and engineering collaboration.
• Solid working knowledge of AWS cloud services, especially:
◦ S3 for data storage and access
◦ IAM for security and permissions
◦ Secrets Manager for credential management
◦ AWS Glue for orchestrating and running ETL jobs
◦ ECS/EKS for containerized workload deployment
• Excellent troubleshooting and debugging skills for large-scale data systems.
• Self-motivated, proactive, and able to work independently with minimal supervision.

Preferred Qualifications
• 10+ years of experience in Java development is a strong plus.
• Familiarity with CI/CD tools and version control (e.g., Git).

Education

Graduate in any discipline.