Needs:
- Hadoop
- AWS
- Spark
- PySpark
- Java
- Python
- Scala
Key Skills:
- Big Data Technologies: Hadoop, Spark, HDFS, Hive, Cloudera, Hortonworks
- Cloud Platforms: AWS (Glue, Lambda, Redshift, S3, CloudWatch)
- ETL/ELT Tools: AWS Glue, Python, PySpark, Databricks
- Programming Languages: Python, Java, Scala, SQL, HiveQL
- Data Integration & Migration: Hadoop, Kafka, data lakes, real-time streaming
- Data Modeling & Transformation: Dimensional data models, structured/unstructured data processing
- CI/CD & Automation: Jenkins, Git, Autosys, Airflow