OneSignal has grown rapidly: today we serve billions of HTTP requests & send upwards of 8 billion messages daily. We achieved this scale by leveraging bare metal cloud & writing scale-sensitive components in languages like Rust & Go. This potent combination of high-performance, low-cost hardware with efficient resource utilization has given us an incredible competitive edge.
We are hiring SREs to help us continue to scale by operating & engineering the future of our infrastructure. We are maintaining 99.95% uptime today, & we are investing to ensure we maintain that as the business continues to grow & as the product evolves.
Your primary task will be software engineering with a focus on infrastructure, operations, & automation. You'll be building systems to run our product, improving internal services, & advising product teams on architecture as it relates to the operability of the service.
The systems you'll be responsible for include all of the services which power our product. These range from off-the-shelf services like haproxy, nginx, Redis, PostgreSQL, Kafka, etc. to our in-house services such as the Rails web app, various Rust backend services, & our high-performance API layer written in Go.
You'll be working with Kubernetes to automate our datacenter operations & writing operational services to automate database operations. One of the key challenges in this role is not only to understand systems well enough to operate them by hand, but also to understand them in sufficient detail to write software that automates those operations.
For some additional context on how we think about SRE, please see the introductory chapter of the Google SRE book.