Description

Job Description:

  • Build reliable, scalable, and secure backend platform services using Java (preferred) or Python, focused on internal developer workflows (e.g., CI/CD, provisioning, bootstrapping).
  • Build core Platform Components of our Internal Developer Platform (IDP), driving self-service capabilities and reducing toil across engineering teams.
  • Design and build Cloud Infrastructure using Terraform and modern AWS services (e.g., EKS, IAM, Lambda, EC2, S3), following infrastructure-as-code and GitOps best practices.
  • Embed Reliability & Observability by integrating monitoring, tracing, and alerting tools (e.g., CloudWatch, Datadog, ELK) into platform foundations.
  • Contribute to technical design documents and review pull requests.
  • Drive Operational Excellence through proactive incident response, on-call participation, and postmortem leadership to improve platform resilience.
  • Collaborate Cross-Functionally with product, security, and operations teams to prioritize platform capabilities that accelerate innovation and reduce cognitive load for developers.
  • Evaluate, upgrade, and implement new tools, frameworks, and language versions (e.g., Java).
  • Assist with creating training and onboarding programs for new customers.
  • Lead Agile/Kanban workflows and team process work.
  • Troubleshoot customer issues as needed, tracking all details in JIRA and the PagerDuty on-call system.


Required Skills:

  • 10+ years of software engineering experience, with:
    • 6+ years in backend development with Java (preferred) or Python.
    • 6+ years working with AWS and building cloud-native infrastructure.
    • 3+ years deploying and operating Kubernetes workloads in EKS or similar environments.
  • Deep experience in:
    • DevOps & Automation: Building CI/CD pipelines using Terraform, GitLab Runner, or Jenkins.
    • Security: Implementing role-based access, IAM policies, AD integration, and using security tools for vulnerability scanning and remediation.
    • Monitoring & Observability: Using CloudWatch, Datadog, and Elasticsearch.
  • Expertise in distributed systems, networking, and systems reliability principles.
  • Track record of technical ownership in complex engineering environments.
  • Comfortable navigating ambiguity and rapidly evolving requirements, with a strong bias for action.

Education

Any Graduate