What you’ll do:
- We seek Software Engineers with experience building and scaling services in on-premises and cloud environments.
- As a Principal Software Engineer in the Epsilon Attribution/Forecasting Product Development team, you will design, implement, and optimize data processing solutions using Scala, Spark, and Hadoop.
- Collaborate with cross-functional teams to deploy big data solutions on our on-premises and cloud infrastructure, and build, schedule, and maintain workflows.
- Perform data integration and transformation, troubleshoot issues, document processes, communicate technical concepts clearly, and continuously enhance our attribution and forecasting engines.
- Strong written and verbal communication skills in English are required to facilitate work across multiple countries and time zones, along with a good understanding of Agile methodologies such as Scrum.
Qualifications
- Strong experience (5-8 years) in the Scala programming language and extensive experience with Apache Spark for big data processing, covering the design, development, and maintenance of scalable on-premises and cloud environments, primarily on AWS and, as needed, on GCP.
- Proficiency in performance tuning of Spark jobs, including optimizing resource usage, shuffling, partitioning, and caching for maximum efficiency in big data environments.
- In-depth understanding of the Hadoop ecosystem, including HDFS, YARN, and MapReduce.
- Expertise in designing and implementing scalable, fault-tolerant data pipelines with end-to-end monitoring and alerting.
- Hands-on experience with Python, which we use to develop infrastructure modules.
- Solid grasp of database systems (RDBMS/warehouse) and the ability to write efficient SQL queries that handle terabytes of data.
- Familiarity with design patterns and best practices for efficient data modelling, partitioning strategies, and sharding in distributed systems, plus experience building, scheduling, and maintaining DAG workflows.
- End-to-end ownership of defining, developing, and documenting software objectives, business requirements, deliverables, and specifications in collaboration with stakeholders.
- Experience working with Git (or equivalent source control) and a solid understanding of unit and integration test frameworks.
- Ability to collaborate with stakeholders and teams to understand requirements and develop working solutions, and to work within tight deadlines, effectively prioritizing and executing tasks in a high-pressure environment.
- Must be able to mentor junior staff.
Advantageous to have experience with the following:
- Hands-on experience with Databricks for unified data analytics, including Databricks Notebooks, Delta Lake, and Catalogues.
- Proficiency in using the ELK (Elasticsearch, Logstash, Kibana) stack for real-time search, log analysis, and visualization.
- Strong background in analytics, including the ability to derive actionable insights from large datasets and support data-driven decision-making.
- Experience with data visualization tools like Tableau, Power BI, or Grafana.
- Familiarity with Docker for containerization and Kubernetes for orchestration.