Description

Must-have key skills:

  • ETL tools (IICS, Talend, Hadoop, etc.)
  • Azure, with a basic understanding of GCP and AWS
  • JIRA and Confluence
  • MS Visio (SQL / PL/SQL / Oracle)
  • PySpark
  • Shell scripting and Power BI

Data Cleaning Tools & Libraries: Proficiency with tools and libraries for cleaning and preprocessing data, for example:

  • Python
  • SQL
  • Excel, with emphasis on familiarity with data cleaning functions, filters, and pivot tables

Good-to-have skills:

  • Knowledge of R
  • Data Management & Analysis Skills
      • Data Validation & Consistency: Ability to identify data quality issues such as duplicates, missing values, outliers, and inconsistencies.
      • Data Transformation: Experience in transforming raw data into usable formats, including reshaping, aggregating, or normalizing data.
      • Handling Missing Data: Familiarity with imputation techniques or ways to deal with incomplete datasets.
      • Data Normalization & Standardization: Ensuring uniformity in data formats, units of measurement, and naming conventions.
      • Data Aggregation: Summarizing or grouping data for analysis and ensuring that it is consistent across all sources.
  • Knowledge of Data Quality
      • Data Integrity: Understanding the importance of maintaining accurate and consistent data over time.
      • Data Profiling: Identifying patterns, anomalies, and key characteristics of the dataset.
      • Error Detection: Ability to find and correct errors within datasets by checking for outliers, misclassifications, or missing values.
  • Soft Skills
      • Attention to Detail: The ability to identify small inconsistencies and issues within large datasets.
      • Problem-Solving: Being resourceful in resolving data issues and proposing solutions.
      • Critical Thinking: Analyzing data in-depth and understanding its implications.
      • Communication: Ability to explain data issues and cleaning steps to non-technical stakeholders.
  • Experience with Data Formats
      • Structured Data: Familiarity with both structured (tables, databases) and unstructured data.
      • Data Sources: Ability to clean data from various sources such as spreadsheets, databases, APIs, and logs.

Education

Any Graduate