Key Skills: GCP, Kafka, Python, Kubernetes, Terraform
Roles and Responsibilities:
- Design, build, and maintain scalable, reliable Kafka solutions using KRaft, Strimzi, ZooKeeper, etc.
- Adopt KRaft for cluster coordination and leader election (replacing ZooKeeper), and use Strimzi to operate Kafka on Kubernetes.
- Automate the entire lifecycle of platform components from provisioning through decommissioning.
- You'll help solve problems related to building a global Kafka architecture, considering scalability, replication, schema registries, and self-service IaC workflows, for use cases such as high-traffic telemetry (logs and metrics), business-critical events, and data processing.
- Ensure observability is an integral part of the infrastructure platforms, providing adequate visibility into their health, utilization, and cost.
- Build cloud-native CI/CD pipelines, tools, and automation that enable developer autonomy and improve developer productivity.
- Build tools that predict saturation and failures and take preventive action through automation.
- Collaborate extensively with cross-functional teams to understand their requirements; educate them through documentation and training, and improve adoption of the platforms and tools.
- Provide technical leadership for your peers and help them deliver the best solutions to the rest of the company.
- As an expert in your area, you will help set the tone for how your team operates, championing modern, rigorous software development practices that emphasize testability, repeatability, and self-service automation.
- You'll conduct code reviews and mentor more junior developers. You'll openly collaborate with other teams' leads and help raise the bar of engineering excellence across the entire organization.
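To illustrate the kind of preventive automation described above (a minimal sketch, not an existing tool; the function name, inputs, and thresholds are assumptions for illustration): fit a line to recent utilization samples, e.g. broker disk usage, and report how long until projected saturation.

```python
# Minimal sketch: predict time-to-saturation for a resource (e.g., broker
# disk) by fitting a least-squares line to recent utilization samples.
# All names and parameters here are illustrative assumptions.

def predict_saturation(samples, capacity, horizon):
    """samples: list of (timestamp_sec, used) pairs, oldest first.
    Returns seconds until projected saturation, or None if usage is not
    trending up or saturation lies beyond `horizon` seconds."""
    if len(samples) < 2:
        return None
    n = len(samples)
    t0 = samples[0][0]
    xs = [t - t0 for t, _ in samples]          # time relative to first sample
    ys = [used for _, used in samples]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    denom = sum((x - mean_x) ** 2 for x in xs)
    if denom == 0:
        return None                             # all samples at the same instant
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / denom
    if slope <= 0:
        return None                             # flat or decreasing: no saturation
    intercept = mean_y - slope * mean_x
    t_saturate = (capacity - intercept) / slope # where the fitted line hits capacity
    remaining = t_saturate - xs[-1]             # seconds from the latest sample
    return remaining if 0 <= remaining <= horizon else None
```

In a real platform the output would feed an automated action (expand the volume, rebalance partitions, page an operator) rather than just a number; the linear fit is the simplest possible model and would typically be replaced by something trend-aware.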
Skills Required:
- 10+ years of experience building large-scale distributed systems in an always-available production environment.
- 5+ years of experience building infrastructure platforms and CI/CD pipelines on a major public cloud provider (GCP preferred).
- Strong hands-on experience designing integration solutions using Kafka, with awareness of integration best practices for data-streaming solutions.
- Strong skills in in-memory applications and data integration.
- Strong awareness of the Kafka ecosystem: KRaft, Strimzi, ZooKeeper, clusters, brokers, producers, consumers, connectors, topics, the various APIs, etc.
- Strong hands-on experience with connectors such as the SQL and HTTP connectors.
- Strong knowledge of the APIs Kafka exposes for integration, data streaming, and data management.
- Strong awareness of deployment architecture design for Kafka solutions.
- Solid hands-on experience with Kafka deployment and configuration.
- Experience with production deployments: running Kafka components as background processes, configuration, troubleshooting, and environment maintenance.
- In-depth knowledge of Kubernetes and its ecosystem: containerd/Docker, Helm/Pulumi, service mesh, Terraform/Terragrunt, Ansible, Argo CD/Argo Workflows, Tekton, etc.
- Hands-on experience with observability platforms/tools like ELK, Fluentd/Fluent Bit, Grafana/Prometheus/OpenTelemetry, Cortex/InfluxDB, etc.
- Experience in Python, Golang, or a similar language.
- Proven experience in leading major initiatives from requirements, design, and implementation to ongoing lifecycle management.
- Experience in coaching and mentoring junior engineers; strong verbal and written communication skills.
- Excellent problem-solving skills, with the ability to troubleshoot complex issues and implement effective solutions.
- Strong communication and collaboration skills, with the ability to work effectively in a cross-functional team environment.
Education: Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent work experience.