Spotify’s Event Delivery system is responsible for delivering hundreds of billions of events every day. Most of the events are generated as a response to a user action, such as playing a song, following an artist or clicking on an ad. All in all, more than 300 different types of events are being collected from […]
Spotify began using Docker with a few prototype services in 2014. We upgraded and configured it many times since and have almost every time come across issues that were often hard to detect and fix. When the number of backend services running on Docker was low, the impact of these issues was small. As Docker adoption grew so did the risk and impact of faulty Docker changes until they reached unacceptable levels. In October 2016, we deployed a bad configuration change that significantly affected the user experience.
At this point we went back to the drawing board and realized we needed a new solution that deployed fleet-wide infrastructure changes gradually and with more control. This is a story of how operating Docker at Spotify inspired us to build a service that gave us more control over the rollout of infrastructure changes on thousands of servers.