Description

Skills Required:
Kafka and Spark Streaming; proficiency in at least one programming language, preferably Java, Scala, or Python.


Education/Qualification:
Bachelor's Degree in Computer Science, Engineering, Technology or related field

Desirable Skills:
Kafka and Spark Streaming; proficiency in at least one programming language, preferably Java, Scala, or Python.

Years Of Exp:
More than 1 year

About the Role:
We are looking for Data Engineers to build data analytics solutions that address increasingly complex business questions. A Data Engineer is a hands-on person responsible for designing, prototyping, and implementing data products that support a wide variety of data processing, data science, and analytics needs. Data Engineers work closely with data scientists, product managers, and the data platform team to understand functional data requirements and leverage the underlying tech stack to build scalable, robust data applications that can crunch terabytes of data in real time. The data products and applications you build will enable data-driven decision making across business, analytics, and operations.


About the team:
What the team is all about (FDP – Flipkart Data Platform):
● The goal of the FDP team is to democratize data access, processing & intelligence.
● The FDP team is building an Internet-scale, multi-tenant data platform as a cloud service.
● It enables teams to focus on building data applications instead of building & managing data infrastructure.
● FDP ingests and processes terabytes of data every day and works with a modern big-data technology stack, including Hadoop 2.0, Storm, Spark, and Cassandra.
● The FDP team has a strong relationship with the open-source community and has many open-source committers on the team.


You are Responsible for:
 

● You should have good hands-on experience designing, implementing, and operating stable, scalable solutions that flow data from production systems into the analytical data platform (big-data tech stack + MPP) and into end-user-facing applications, for both real-time and batch use cases.

● You should be able to work with business customers in a fast-paced environment, understanding business requirements and implementing analytical solutions.

● You should have good experience in the design, creation, management, and business use of large datasets. You will do high-level design with guidance, functional modelling, and break-down of modules, thinking in terms of platforms and reuse.

● Build and execute data modeling projects across multiple tech stacks (big data, MPP, OLAP) using agile development techniques. Build and integrate robust data processing pipelines for enterprise-level business analytics.

● Challenge the status quo and propose innovative ways to process, model, and consume data, whether in tech-stack choices or design principles. As needed, assist other staff with reporting, debugging data-accuracy issues, and other related functions.

● Bring a strong engineering mindset: build automated monitoring, alerting, and self-healing (restartability/graceful failure) features into the consumption pipelines you build. Translate business requirements into technical specifications (facts/dimensions/filters/derivations/aggregations).

● An ideal candidate will have excellent communication skills, working with engineering, product, and business owners to develop and define key business questions and to build datasets that answer those questions. You should bring a passion for working with huge datasets and for bringing them together to answer business questions and drive change.
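To make the "translate business requirements into facts/dimensions/aggregations" responsibility above concrete, here is a minimal, stdlib-only Python sketch of grouping fact rows by dimension attributes and summing measures. All field names and values are illustrative, not an actual Flipkart schema:

```python
from collections import defaultdict

# Toy fact rows: each row carries dimension attributes (city, category)
# and additive measures (gmv, units). All names are illustrative.
fact_sales = [
    {"city": "Bangalore", "category": "mobiles", "gmv": 12000, "units": 2},
    {"city": "Bangalore", "category": "fashion", "gmv": 3000,  "units": 5},
    {"city": "Delhi",     "category": "mobiles", "gmv": 18000, "units": 3},
]

def aggregate(rows, dims, measures):
    """Group fact rows by the given dimension columns and sum each measure."""
    out = defaultdict(lambda: {m: 0 for m in measures})
    for row in rows:
        key = tuple(row[d] for d in dims)
        for m in measures:
            out[key][m] += row[m]
    return dict(out)

by_category = aggregate(fact_sales, dims=["category"], measures=["gmv", "units"])
# by_category[("mobiles",)] == {"gmv": 30000, "units": 5}
```

In a production pipeline the same fact/dimension/aggregation specification would be expressed in SQL or a Spark job over an MPP or big-data store rather than in-memory Python.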


To succeed in this role – you should have the following:
3–5 years' experience with a Bachelor's Degree in Computer Science, Engineering, Technology, or a related field is required, including 2–3 years of relevant software development experience with sound skills in database modeling (relational, multi-dimensional) and optimization, and in data architecture for databases such as Vertica.

● Good understanding of streaming technologies such as Kafka and Spark Streaming. Proficiency in at least one programming language, preferably Java, Scala, or Python. Good knowledge of Agile and SDLC/CI-CD practices and tools, with a good understanding of distributed systems.

● Experience with Enterprise Business Intelligence platform / data platform sizing, tuning, optimization, and system-landscape integration in large-scale enterprise deployments.

● Must have proven experience with Hadoop, MapReduce, Hive, Spark, and Scala programming. Must have in-depth knowledge of performance-tuning/optimizing data processing jobs and debugging time-consuming jobs.

● Proven experience in the development of conceptual, logical, and physical data models for Hadoop, relational, EDW (enterprise data warehouse), and OLAP database solutions.

● Experience working extensively in multi-petabyte DW environments, and experience engineering large-scale systems in a product environment.
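A core pattern behind the streaming requirements above (Kafka + Spark Streaming) is windowed aggregation over an event stream. The toy, stdlib-only Python sketch below illustrates tumbling-window counting semantics; a real deployment would use Spark Structured Streaming reading from Kafka, and all names here are illustrative:

```python
from collections import Counter

# Toy event stream: (event_time_seconds, key). In a real job these would
# arrive as Kafka records; here we simulate them to show the semantics.
events = [(1, "click"), (3, "click"), (7, "view"), (9, "click"), (12, "view")]

def tumbling_window_counts(stream, window_size):
    """Count events per key within fixed, non-overlapping time windows."""
    counts = Counter()
    for ts, key in stream:
        # Each event falls into exactly one window [window_start, window_start + window_size).
        window_start = (ts // window_size) * window_size
        counts[(window_start, key)] += 1
    return counts

counts = tumbling_window_counts(events, window_size=5)
# counts[(0, "click")] == 2  -- the events at t=1 and t=3
```

Spark Structured Streaming expresses the same idea declaratively (e.g. grouping by `window(event_time, "5 seconds")`), and adds watermarking to bound state for late-arriving events.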

 

Education

Any Graduate