Job Duties:
- Design, develop, and optimize scalable, robust ETL data pipelines using Azure Databricks and Python to enable efficient extraction, transformation, and loading of large-scale structured and semi-structured datasets from various sources into cloud-based data lakes and analytics platforms.
- Build and maintain automated, modular ETL workflows within Azure to support batch and near real-time data integration, leveraging technologies such as Azure Data Factory, Databricks notebooks, and Delta Lake for high-performance data processing and storage.
- Implement data solutions aligned with enterprise data strategies.
- Develop and implement best practices for data ingestion and processing from external systems, including integration with SAP systems using middleware solutions such as MuleSoft for seamless data transfer and harmonization.
- Design and optimize backend data lake architectures to support analytics and reporting requirements, ensuring data lineage, traceability, and reusability across multiple downstream applications.
- Implement scalable data transformation and automation processes in Databricks using PySpark and SQL, ensuring performance optimization through effective use of cluster configurations, partitioning strategies, and caching techniques.
- Utilize Azure services including Azure Data Lake Storage, Azure Data Factory, Azure Event Hubs, and Azure Key Vault in combination with Databricks to create secure, scalable, and compliant data ecosystems.
- Ensure data security and privacy compliance, particularly with PHI/PII in healthcare datasets, by applying techniques such as encryption, tokenization, and role-based access control within Azure and Databricks.
- Establish monitoring, alerting, and error-handling mechanisms using tools such as Azure Monitor, Log Analytics, and custom Python scripts to track pipeline performance, failures, and metrics in real time.
- Participate in code reviews, architectural discussions, and agile ceremonies, contributing to technical decision-making, performance tuning, and continuous improvement of data engineering practices and processes.
All of the responsibilities listed above are in line with the professional background described and require, at an absolute minimum, a Bachelor's degree in computer science, computer information systems, or information technology, or a combination of education and experience equating to the U.S. equivalent of a Bachelor's degree in one of the aforementioned fields.