Description

Skills and Key Responsibilities:

  • Design and implement scalable, fault-tolerant streaming data platforms using Apache Kafka and Apache Flink.
  • Lead architectural decisions and establish best practices for real-time data processing and delivery.
  • Develop and maintain self-service infrastructure patterns and tools to enable internal teams to consume, process, and produce streaming data efficiently.
  • Optimize performance, reliability, and observability within a Kubernetes-based environment.
  • Drive infrastructure-as-code practices and automate deployment workflows using tools such as Terraform, Helm, and CI/CD pipelines.
  • Collaborate with data and engineering teams to support analytics, machine learning, and operational use cases.
  • Champion platform reliability, scalability, and cost-efficiency across public cloud platforms (AWS, GCP, or Azure).
  • Mentor junior engineers and contribute to shaping the platform’s technical roadmap.

Required Qualifications:

  • 7+ years of backend/platform engineering experience, with a strong focus on distributed systems.
  • Deep expertise in Apache Kafka (Kafka Streams, Kafka Connect) and Apache Flink (DataStream API, state management, complex event processing).
  • Hands-on experience running and managing workloads in Kubernetes.
  • Solid experience with cloud-native technologies and services on AWS, GCP, or Azure.
  • Strong programming skills in Java, Scala, or Python.
  • Proficiency with observability tools such as Prometheus, Grafana, and OpenTelemetry, and strong debugging skills for distributed systems.
  • Familiarity with infrastructure-as-code tools like Terraform, Pulumi, or similar.
  • Excellent communication skills and the ability to drive technical initiatives across teams.

Education

Bachelor's degree in any discipline