The Winding Road to Better Machine Learning Infrastructure Through TensorFlow Extended and Kubeflow

When Spotify launched in 2008 in Sweden, and in 2011 in the United States, people were amazed that they could access almost the world’s entire music catalog instantaneously. The experience felt like magic, and as a result music aficionados dug in and organized that content into millions of unique playlists. Early on, our users relied […]


Spotify’s Event Delivery – Life in the Cloud

Spotify is a data-informed company, and in such a company Event Delivery is a key component. Every event containing data about users, the actions they take, or operational logs from hundreds of systems is a valuable piece of information. Without a successful Event Delivery system, we would not be able to understand our users […]


Scio 0.7: a deep dive

Large-scale data processing is a critical component of Spotify’s business model. It drives music recommendations, artist payouts based on stream counts, and insights about how users interact with Spotify. Every day we capture hundreds of terabytes of event data, in addition to database snapshots and derived datasets. It’s imperative that engineers who want to […]


Autoscaling Pub/Sub Consumers

Spotify’s Event Delivery system is responsible for delivering hundreds of billions of events every day. Most of the events are generated as a response to a user action, such as playing a song, following an artist or clicking on an ad. All in all, more than 300 different types of events are being collected from […]


Spotify’s Love/Hate Relationship with DNS

Spotify has a history of loving “boring” technologies. It’s not that often people talk about DNS; when they do, it’s usually to complain. But it’s because DNS is boring that we love it so much: this post will walk through how we’ve designed and manage our own DNS infrastructure, and how we push it to its limits.