Description

  • Develop and maintain reliable and scalable systems using Python, C, and C++ to support site reliability engineering tasks.
  • Implement DevOps practices to enhance system performance and reliability.
  • Collaborate with cross-functional teams to design and deploy continuous integration and continuous deployment (CI/CD) pipelines.
  • Utilize AWS services for infrastructure management and optimization.
  • Deploy and manage Kubernetes clusters to ensure efficient container orchestration.
  • Monitor system performance, troubleshoot issues, and implement solutions to maintain high availability.
  • Automate operational tasks and processes to streamline workflows and improve efficiency.
  • Participate in incident response and resolution to minimize downtime and impact on services.
  • Contribute to the evolution of best practices in site reliability engineering and DevOps methodologies.
  • Mentor junior team members and provide guidance on technical aspects of SRE and DevOps practices.

Education

Bachelor's or Master's degree