Bachelor’s degree in Computer Science, Computer Engineering, or related technical discipline.
4-5 years of experience working on ETL projects.
Azure Data Factory (ADF): Candidate should know how to create data pipelines, schedule activities, and manage data movement and transformation using ADF.
Azure Databricks: Candidate should be adept at using Databricks for data engineering tasks such as data ingestion, transformation, and analysis.
Oracle Database: Candidate should be proficient in using it for data storage, retrieval, and basic manipulation.
SQL: Very strong SQL programming skills.
Python/PySpark: Proficiency in programming with Python and PySpark.
Azure DevOps: Knowledge of Azure DevOps is valuable for implementing Continuous Integration and Continuous Deployment (CI/CD) pipelines for data engineering solutions.
Monitoring and Optimization: Understanding how to monitor the performance of data engineering solutions and optimize them for better efficiency is crucial.
Data Quality and Data Cleaning: Knowing how to ensure data quality and perform data cleaning operations to maintain reliable data is important for data engineers.
Data Modeling and ETL/ELT: Candidate should be skilled in data modeling techniques and Extract, Transform, Load (ETL) or Extract, Load, Transform (ELT) processes for data integration.
Apache Hive (good to have): Knowledge of Hive, a data warehousing tool that offers a SQL-like query language as an abstraction over Hadoop MapReduce.
Apache Spark (good to have): Understanding of Spark and its various components.