About the Role:
We are seeking a talented Data Engineer with strong hands-on experience in Databricks, Azure, Apache Spark, and Python. You will be responsible for designing, building, and optimizing scalable data pipelines and engineering solutions as workloads move from development to production. The role requires close collaboration with product management and cross-functional teams, along with direct customer engagement, to deliver high-impact data solutions.
Key Responsibilities:
Design, develop, and maintain scalable, maintainable data pipelines using Databricks, Azure, Apache Spark, and Python.
Optimize and monitor data workflows as they transition from development to production environments.
Collaborate with multiple stakeholders, including product managers and customers, to understand requirements and deliver tailored solutions.
Orchestrate and automate data workflows on cloud platforms, primarily Azure.
Write efficient SQL queries and perform data transformations on large and complex datasets.
Apply best practices in database engineering and design.
Leverage PySpark and Python for data processing and analytics.
Utilize APIs and containerization and orchestration tools (e.g., Docker, Kubernetes) as needed.
Implement and maintain CI/CD pipelines for data engineering workflows.
(Nice to have) Integrate Teradata and other data sources as required.
Required Skills & Experience:
Strong hands-on experience with Databricks and Apache Spark.
Expertise in the Azure cloud platform and Azure Data Factory.
Proficiency in Python and PySpark.
Solid SQL skills and experience with large, complex datasets.
Experience orchestrating workloads on cloud platforms.
Ability to work in ambiguous environments and manage multiple stakeholders.
Experience working cross-functionally with product management and customers.
Familiarity with APIs, containerization, and orchestration (Docker/Kubernetes) is a plus.
Understanding of CI/CD practices for data engineering workflows.
(Nice to have) Experience with Teradata.
Education: Bachelor's degree in any discipline.