Key Responsibilities:
ETL Development:
Design, develop, and maintain ETL pipelines to move, transform, and load data from multiple sources into data warehouses or data lakes.
Work with large datasets and ensure data quality and integrity throughout the ETL process.
Azure Data Factory Implementation:
Leverage Azure Data Factory to build, orchestrate, and automate complex data workflows and pipelines.
Integrate various data sources (on-premises and cloud) using Azure Data Factory, ensuring scalability and performance.
SQL Development and Data Management:
Develop complex SQL queries to extract, transform, and load data efficiently.
Optimize SQL queries for performance and ensure data consistency across systems.
Manage and maintain relational databases to ensure reliable data storage and retrieval.
Collaboration and Data Integration:
Collaborate with data scientists, analysts, and other teams to understand data needs and ensure data is structured and available for analytical purposes.
Integrate data from different sources (external APIs, third-party applications, etc.) into centralized data storage systems.
Data Pipeline Optimization:
Continuously monitor, optimize, and troubleshoot data pipelines to ensure high performance and reliability.
Implement error handling, logging, and monitoring mechanisms to ensure smooth operations.
Documentation and Best Practices:
Create and maintain clear documentation for ETL processes, data pipelines, and data models.
Follow industry best practices in data engineering and ensure compliance with data governance and security standards.
Requirements:
Experience:
Minimum of 5 years of professional experience in data engineering, with a strong focus on ETL processes and data pipeline development.
At least 3 years of hands-on experience with Azure Data Factory.
Technical Skills:
Strong proficiency in SQL for querying and managing data from relational databases.
Expertise in designing and implementing ETL workflows using Azure Data Factory.
Experience working with large datasets and ensuring high-performance data processing.
Familiarity with cloud data services, particularly Azure-based solutions (e.g., Azure SQL Database, Azure Data Lake, Azure Blob Storage).
Knowledge of data warehousing and data lake concepts.
Soft Skills:
Excellent problem-solving and analytical skills.
Strong communication and interpersonal abilities, with the ability to work effectively with cross-functional teams.
Ability to work independently and manage multiple tasks effectively in a remote work environment.
Strong attention to detail with a focus on delivering high-quality solutions.
Mandatory skills:
Advanced proficiency in T-SQL, including complex queries, stored procedures, triggers, and functions, plus experience with SQL Server tools (e.g., SSMS, Profiler, SSIS).
Solid experience with ETL processes, data warehousing, and reporting services (SSRS).
Knowledge of performance tuning and query optimization techniques.
Familiarity with version control (e.g., Git) and Agile methodologies.
Experience with cloud databases (e.g., Azure SQL, AWS RDS) is an advantage.
Proficiency in Python, Java, or similar languages.
Proficiency in SQL, NoSQL, and NewSQL databases.
Preferred Qualifications:
Experience with other Azure services (e.g., Azure Databricks, Azure Synapse Analytics).
Familiarity with data governance and security best practices.
Knowledge of Python, Spark, or other languages and frameworks for data processing.
Bachelor's degree in Computer Science.