Description

Job Description:

 In-depth knowledge of Apache Spark and its APIs (Spark SQL, DataFrame API, Spark Structured Streaming); knowledge of Apache Flink, streaming and batch processing modes, caching, and performance optimization. 

· Design and develop analytics workloads using Apache Spark and Scala for processing big data 

· Create and optimize data transformation pipelines using Spark or Apache Flink 

· Proficiency in performance tuning and optimization of Spark jobs 

· Experience migrating existing analytics workloads from cloud platforms to open-source Apache Spark 

· Expertise in data modeling and optimization techniques for large-scale datasets 

· Extensive experience with real Spark production instances 

· Strong understanding of data lakes, big data, ETL processes, and data warehousing concepts

· Experience with Spark infrastructure running on Kubernetes 

· Good understanding of lakehouse storage technologies such as Delta Lake and Apache Iceberg; AWS knowledge

Other skills 

· Technical Leadership: Lead and mentor a team of data engineers, analysts, and architects. Provide guidance on best practices and architecture 

· Collaboration: Work closely with cross-functional teams including data scientists, business analysts, and developers to ensure seamless integration 

· Communication: Excellent verbal and written communication skills with the ability to convey complex technical concepts to non-technical stakeholders

 

Education

Any Graduate