Improving Critical Infrastructure Rollouts

Spotify began using Docker with a few prototype services in 2014. We have upgraded and configured it many times since, and almost every time we have come across issues that were often hard to detect and fix. When the number of backend services running on Docker was low, the impact of these issues was small. As Docker adoption grew, so did the risk and impact of faulty Docker changes, until they reached unacceptable levels. In October 2016, we deployed a bad configuration change that significantly affected the user experience.

At this point we went back to the drawing board and realized we needed a new solution that deployed fleet-wide infrastructure changes gradually and with more control. This is the story of how operating Docker at Spotify inspired us to build a service for rolling out infrastructure changes in a controlled way across thousands of servers.


Spotify’s Love/Hate Relationship with DNS

Spotify has a history of loving “boring” technologies. People don’t talk about DNS that often; when they do, it’s usually to complain. But it’s because DNS is boring that we love it so much: this post walks through how we’ve designed and manage our own DNS infrastructure, and how we push it to its limits.