Job Description
The ideal candidate will have a history of successfully implementing and using tools like Terraform, Packer,
Splunk, SignalFx, and other observability/IaC tools to support systems with around-the-clock availability
requirements. In addition, the ideal candidate will possess sufficient software skills to properly scrutinize and
troubleshoot applications supporting our customers. They should have a strong aptitude for learning new
technologies and for embracing and driving solutions to challenging projects and problems. This role requires a
seasoned engineer who can collaborate across multiple cross-functional teams while exhibiting a rich
set of problem-solving skills, and who is self-motivated and has a passion for quality!
Responsibilities:
● Develop and maintain monitoring tools, alerts, and dashboards to provide visibility into system health
and performance.
● Proactively gather and analyze both metric and log data from systems and applications to perform
anomaly detection, performance tuning, capacity planning and fault isolation.
● Collaborate with development teams to implement and deploy new features and enhancements, ensuring
they meet reliability, security and performance standards.
● Partner closely with other teams on enterprise standards/best practices.
● Identify options for problem resolution and initiate corrective actions.
● Mentor junior team members, and document and share solutions.
● Collaborate cross-functionally.
Qualifications:
● Minimum 4 years’ experience in any combination of software engineering roles: SRE,
DevOps, applications, services, tools/automation, release, etc.
● Minimum 3 years’ experience with SRE/DevOps practices and automation tooling
● Experience with observability tools like Splunk, Datadog, SignalFx, etc.
● Experience deploying, maintaining and supporting software applications/services in the AWS ecosystem
● Proactive approach to identifying problems and solutions
● Experience writing code in one or more interpreted languages such as Python, PHP, Perl, Ruby, or
Linux shell
● Experience with Terraform or CloudFormation scripting
● Experience with configuration management tools like Ansible, Chef or Puppet
● Experience with standard software development best practices and tools such as code repositories (Git
preferred)
● Experience executing in an agile software development environment
● Good understanding of pricing/cost models across AWS services, especially compute, storage, and
database offerings
● Must be able to multitask and work well with changing priorities in a fast-paced, 24x7 environment
● Must be highly collaborative and be able to work in a team environment consisting of both technical and
business people
● Excellent communication, problem solving and customer service skills
● A strong ability to learn and adapt to new technologies
● Education: Bachelor’s degree in computer science, science, or engineering, or workforce-equivalent
technical certifications, preferred