Description

What you’ll do:

  • We seek Software Engineers with experience building and scaling services in on-premises and cloud environments.
  • As a Principal Software Engineer in the Epsilon Attribution/Forecasting Product Development team, you will design, implement, and optimize data processing solutions using Scala, Spark, and Hadoop.
  • Collaborate with cross-functional teams to deploy big data solutions on our on-premises and cloud infrastructure, and build, schedule, and maintain workflows.
  • Perform data integration and transformation, troubleshoot issues, document processes, communicate technical concepts clearly, and continuously enhance our attribution and forecasting engines.
  • Strong written and verbal communication skills in English are required to facilitate work across multiple countries and time zones, along with a good understanding of Agile methodologies, particularly Scrum.

Qualifications

  • Strong experience (5–8 years) with the Scala programming language and extensive experience with Apache Spark for big data processing, including designing, developing, and maintaining scalable pipelines in on-premises and cloud environments, primarily on AWS and, as needed, GCP.
  • Proficiency in performance tuning of Spark jobs, including optimizing resource usage, shuffling, partitioning, and caching for maximum efficiency in big data environments.
  • In-depth understanding of the Hadoop ecosystem, including HDFS, YARN, and MapReduce.
  • Expertise in designing and implementing scalable, fault-tolerant data pipelines with end-to-end monitoring and alerting.
  • Hands-on experience with Python for developing infrastructure modules.
  • Solid grasp of database systems (RDBMS/warehouse) and the ability to write efficient SQL queries over terabytes of data.
  • Familiarity with design patterns and best practices for efficient data modelling, partitioning strategies, and sharding in distributed systems, along with experience building, scheduling, and maintaining DAG workflows.
  • End-to-end ownership of the definition, development, and documentation of software objectives, business requirements, deliverables, and specifications in collaboration with stakeholders.
  • Experience working with Git (or equivalent source control) and a solid understanding of unit and integration test frameworks.
  • Must be able to collaborate with stakeholders and teams to understand requirements and develop working solutions, and to prioritize and execute tasks effectively under tight deadlines in a high-pressure environment.
  • Must be able to mentor junior staff.

Advantageous to have experience with the following:

  • Hands-on experience with Databricks for unified data analytics, including Databricks Notebooks, Delta Lake, and Unity Catalog.
  • Proficiency in using the ELK (Elasticsearch, Logstash, Kibana) stack for real-time search, log analysis, and visualization.
  • Strong background in analytics, including the ability to derive actionable insights from large datasets and support data-driven decision-making.
  • Experience with data visualization tools like Tableau, Power BI, or Grafana.
  • Familiarity with Docker for containerization and Kubernetes for orchestration.

Education

Any Graduate