Staff Site Reliability Engineer

Employment Type: Full-Time


: Engineering


RESPONSIBILITIES:
• Support the operability roadmap to improve the availability, performance, scalability, and efficiency of services by implementing automation, monitoring, redundancy, and capacity.
• Participate in the on-call rotation to address and efficiently resolve customer-impacting issues.
• Assist in implementing security mechanisms to prevent, detect, and respond to internal and external security threats.
• Collaborate, share your ideas effectively, consider the ideas of others, and generally work well as part of a distributed team. Individuals who prefer to work in silos or in isolation will not do well here.

MINIMUM QUALIFICATIONS:
• Strong Linux/Windows experience.
• Fluency with some combination of Python, Bash, Terraform, C#, .NET, or PowerShell.
• Experience with AWS and/or Azure.
• Passion for all aspects of incident management, from detection through post-mortem, with a primary focus on the customer experience.
• Experience managing large numbers of diverse systems with configuration management systems such as Salt, Chef, or Puppet.
• Ability to write, debug, and document code.
• Strong sense of ownership, customer service, and integrity, demonstrated through clear communication.
• Ability to work as part of a distributed team.
• Bachelor of Science in Computer Science, Networking, or a relevant focus, or equivalent experience.

PREFERRED QUALIFICATIONS:
• Understanding of service scalability in relation to performance, reliability, and cost.
• Experience with ELK, Datadog, Splunk, or similar.
• Knowledge of and active participation in the Agile Scrum process.
• Exposure to security and privacy certifications like FedRAMP, SOC 2, and GDPR.
