- Design, deploy, and maintain Cassandra clusters across development, staging, and production environments.
- Ensure high availability, fault tolerance, and disaster recovery capabilities in Cassandra deployments.
- Collaborate with application teams to understand data access patterns and optimize data models accordingly.
- Monitor and tune Cassandra cluster performance, including compaction strategies and read/write throughput.
- Develop and implement backup, restore, and archival strategies.
- Automate operational tasks using scripting languages (e.g., Python, Bash) and infrastructure-as-code tools (e.g., Ansible, Terraform).
- Participate in on-call rotations and provide production support for critical incidents.
- Define and enforce best practices for schema design, replication strategies, and consistency levels.
- Lead Cassandra version upgrades and patching with minimal downtime.
- Collaborate with security teams to ensure compliance with internal and external data protection standards.
- Document architecture, configurations, and operational procedures for knowledge sharing and audit readiness.
Required Skills and Experience
- Deep understanding of Cassandra internals, including repair mechanisms.
- Proficiency with NoSQL databases and experience with Cassandra client drivers (e.g., Java, Python).
- Experience with monitoring tools such as Prometheus, Grafana, and DataStax OpsCenter.
- Familiarity with multi-region deployments and hybrid cloud architectures.
- Strong troubleshooting and debugging skills for distributed systems.
- Experience with data migration and integration across heterogeneous systems.
- Excellent communication and collaboration skills to work across global teams and time zones.
Preferred Qualifications
- Bachelor’s degree in Computer Science or Information Systems, or equivalent experience.
- Experience with complementary NoSQL technologies (e.g., Redis, MongoDB) is a plus.
- Exposure to Kubernetes and Cassandra operators (e.g., Cass-Operator) is desirable.