Job Description

  • 15+ years of experience in infrastructure, with 5+ years as a Kafka Administrator or in a similar role managing Kafka clusters (a brief cluster-administration sketch follows this list).
  • 5+ years of experience as a GCP cloud architect.
  • Strong understanding of distributed systems, data streaming, and event-driven architectures.
  • Experience with Linux/Unix operating systems and shell scripting.
  • Proficiency in GCP; experience with containerization technologies (Docker, Kubernetes) is a plus.
  • Experience with network security principles and practices.
  • Proficiency in Terraform for infrastructure as code.
  • Experience with GitLab CI/CD pipelines is a plus.
  • Strong client communication skills to help expand existing engagements and secure additional work with clients.
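
For illustration only (not part of the client's requirements), the sketch below shows the kind of routine cluster inspection this role involves, using the confluent-kafka Python client. The broker address and timeout are assumed placeholder values, not details from this posting.

    # Hypothetical sketch: inspect a Kafka cluster with the confluent-kafka AdminClient.
    from confluent_kafka.admin import AdminClient

    admin = AdminClient({"bootstrap.servers": "localhost:9092"})  # assumed local test broker
    metadata = admin.list_topics(timeout=10)                      # fetch cluster metadata

    print(f"Brokers ({len(metadata.brokers)}):")
    for broker in metadata.brokers.values():
        print(f"  id={broker.id} host={broker.host}:{broker.port}")

    print(f"Topics ({len(metadata.topics)}):")
    for name, topic in sorted(metadata.topics.items()):
        print(f"  {name}: {len(topic.partitions)} partition(s)")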

Job Responsibilities

  • Cluster Management:
      • Install, configure, and maintain Kafka clusters across various environments (development, testing, production).
      • Perform upgrades and patching of Kafka and related components (e.g., ZooKeeper).
      • Ensure optimal performance and reliability of Kafka clusters.
  • Monitoring and Troubleshooting:
      • Monitor Kafka cluster health and performance using tools like Prometheus, Grafana, or proprietary monitoring solutions.
      • Diagnose and resolve issues with Kafka brokers, topics, partitions, and consumers.
      • Implement proactive measures to prevent potential issues (a brief health-check sketch follows this list).
  • Security and Compliance:
      • Implement and manage security protocols for Kafka, including SSL/TLS encryption, Kerberos authentication, and access control policies (a brief client-configuration sketch follows this list).
      • Ensure compliance with organizational and industry standards for data security and privacy.
      • Apply network security principles to protect Kafka infrastructure.
  • Capacity Planning and Scalability:
      • Perform capacity planning to ensure the Kafka infrastructure can handle current and future workloads.
      • Optimize Kafka configurations for performance and scalability based on application requirements.
  • Backup and Recovery:
      • Develop and maintain disaster recovery plans for Kafka environments.
      • Implement and test backup and restore procedures to ensure data integrity and availability.
  • Collaboration and Support:
      • Work closely with development teams to understand Kafka usage patterns and provide guidance on best practices.
      • Provide support for Kafka-related issues, including on-call support as needed.
      • Document Kafka infrastructure, configurations, and operational procedures.
  • Automation and Scripting:
      • Develop automation scripts for routine tasks such as cluster provisioning, monitoring, and maintenance using tools like Ansible, Puppet, or custom scripts.
      • Implement CI/CD pipelines for Kafka-related deployments and updates.
  • Client Communication and Reporting:
      • Maintain regular communication with clients to provide updates and gather feedback.
      • Prepare and present weekly and monthly status reports to stakeholders.
      • Present proof-of-concept (POC) solutions and designs to clients and internal teams.
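
As a rough illustration of the proactive monitoring responsibility above, the following sketch flags leaderless and under-replicated partitions with the confluent-kafka AdminClient; the broker address is again an assumed placeholder. In practice a check like this would typically feed an alerting stack such as Prometheus/Grafana.

    # Hypothetical health check: flag leaderless and under-replicated partitions.
    from confluent_kafka.admin import AdminClient

    admin = AdminClient({"bootstrap.servers": "localhost:9092"})  # placeholder address
    metadata = admin.list_topics(timeout=10)

    problems = []
    for name, topic in metadata.topics.items():
        for pid, partition in topic.partitions.items():
            if partition.leader == -1:
                problems.append(f"{name}[{pid}] has no leader")
            elif len(partition.isrs) < len(partition.replicas):
                problems.append(
                    f"{name}[{pid}] under-replicated: "
                    f"{len(partition.isrs)}/{len(partition.replicas)} replicas in sync"
                )

    if problems:
        print("Cluster issues detected:")
        for issue in problems:
            print(f"  - {issue}")
    else:
        print("All partitions healthy.")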
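
Similarly, the client-configuration sketch below outlines what a TLS-secured connection to a hardened cluster might look like; every hostname, file path, and setting is an assumed example rather than a project-specific value (a Kerberos setup would use SASL_SSL with additional sasl.* properties instead).

    # Hypothetical TLS (SSL) client configuration for a secured Kafka cluster.
    # Hostnames, paths, and settings are placeholders.
    from confluent_kafka.admin import AdminClient

    secure_conf = {
        "bootstrap.servers": "kafka.example.internal:9093",          # assumed TLS listener
        "security.protocol": "SSL",                                  # encrypt traffic with TLS
        "ssl.ca.location": "/etc/kafka/certs/ca.pem",                # CA that signed the broker certs
        "ssl.certificate.location": "/etc/kafka/certs/client.pem",   # client certificate (for mTLS)
        "ssl.key.location": "/etc/kafka/certs/client.key",           # client private key
    }

    admin = AdminClient(secure_conf)
    print(sorted(admin.list_topics(timeout=10).topics))              # simple connectivity check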

Education

Any Graduate