Site Reliability Engineer

Website: Shopify

Job Description:

The Resiliency team is part of the Production Engineering organization, which builds, operates, and improves the heart of Shopify’s technical platform and unlocks the power of planet-scale infrastructure for all of Shopify’s merchants, buyers, and developers.

Job Responsibilities:

  • Responding to automated alerts and executing playbooks.
  • Managing ongoing incidents, using your understanding of Shopify to involve the right teams and resolve them as quickly as possible.
  • Cleaning up the noise in our signals, ensuring we can get an understanding of the system and debug a problem easily.
  • Acting as a force multiplier across and within engineering departments.
  • Collaborating with high-calibre engineering teams across Shopify to help them create resilient systems.
  • Ensuring we never fail for the same reason twice.
  • Helping teams build tools to automate the toil of on-call duties.
  • Following up on each meaningful incident to ensure the appropriate learnings are extracted and teams know what to do next.
  • Setting standards with teams for building resilient, debuggable systems.

Job Requirements:

  • Comfort with hands-on development, navigating through multiple programming languages (Java, Python, Go, etc.), digging deep in the stack, and using cloud infrastructure (AWS, GCE, Azure, Kubernetes, Docker).
  • You understand the meaning of continuous improvement and evolving systems.
  • Strong software engineering skills, primarily in backend software development.
  • Experience working with a variety of open-source software, including NGINX, Redis, Memcached, and MySQL.
  • Familiarity with network and web protocols, from IP to HTTP.
  • A commitment and drive for quality, technical excellence and results.
  • You reject the idea that on call has to be a terrible, disruptive experience.
  • Experience handling multiple on-call shifts for mission-critical systems, and responsibility for the tools and processes used to debug and correct failures.
  • You understand how to improve difficult situations through short and iterative projects.
  • You know what good observability looks like, but more importantly, how to get there.
  • Experience with mentorship and helping teammates level up their craft and technical skills.
  • You’ve navigated more than one incident through to the retrospective process.

Job Details:

Company: Shopify

Vacancy Type: Full Time

Job Location: Cambridge, Ontario, CA

Application Deadline: N/A
