Description

JD

Technical Skills

•            In-depth knowledge of Apache Spark and its APIs (Spark SQL and DataFrame APIs, Spark Structured Streaming, and Spark MLlib for analytics) and Kafka; able to code in Scala or Java.

•            Knowledge of Apache Flink, including streaming and batch modes, caching, and performance optimization.

•            Design and develop analytics workloads using Apache Spark and Scala for big data processing

•            Create and optimize data transformation pipelines using Spark or Apache Flink

•            Proficiency in performance tuning and optimization of Spark jobs

•            Experience migrating existing analytics workloads from cloud platforms to open-source Apache Spark infrastructure running on Kubernetes.

•            Expertise in data modeling and optimization techniques for large-scale datasets

•            Extensive hands-on experience running Spark in real production environments.

•            Strong understanding of Data Lake, Big Data, ETL processes, and data warehousing concepts

•            Good understanding of lakehouse storage technologies like Delta Lake and Apache Iceberg

•            AWS knowledge

Other Skills

•            Technical Leadership: Lead and mentor a team of data engineers, analysts, and architects. Provide guidance on best practices and architectural decisions.

•            Collaboration: Work closely with cross-functional teams including data scientists, business analysts, and developers to ensure seamless integration of data solutions.

•            Communication: Excellent verbal and written communication skills with the ability to convey complex technical concepts to non-technical stakeholders.

 

Skills:

Spark, Scala, Java, Data Modeling, Data Lake, Big Data, ETL

Education

Any Graduate