Description

About the Role

We are seeking a Data Engineer with strong experience in Python or Java and hands-on expertise in Databricks.

Key Responsibilities
Design, develop, and maintain ETL/ELT pipelines for structured and unstructured data.
Write efficient, maintainable code in Python, Java, or other modern programming languages.
Work with Apache Spark (preferably Databricks) to process large datasets.
Integrate data from multiple sources, ensuring data quality and consistency.
Collaborate with data scientists, analysts, and business stakeholders to deliver reliable datasets.
Optimize data workflows for performance, scalability, and cost efficiency.
Troubleshoot and resolve issues in data pipelines and infrastructure.


Required Skills & Experience
Strong programming skills in Java and/or Python (other languages a plus).
Solid understanding of data engineering concepts — ETL/ELT, data modeling, and data warehousing.
Hands-on experience with Apache Spark (preferably Databricks).
Familiarity with SQL and working with relational databases.
Experience with cloud platforms (AWS/Azure/GCP) is a plus.
Strong problem-solving skills and attention to detail.


Preferred Qualifications
Experience with Delta Lake or Iceberg.
Exposure to workflow orchestration tools (Airflow, Dagster, etc.).
Knowledge of data governance and quality frameworks.
Background in big data and distributed systems.

Education

Any Graduate