Cassandra: Data-Driven Configuration

Spotify currently runs over 100 production-level Cassandra clusters. We use Cassandra across user-facing features, in our internal monitoring and analytics stack, paired with Storm for real-time processing, you name it. With scale come questions. “If I change my consistency level from ONE to QUORUM, how much performance am I sacrificing? What about a change to […]


Personalization at Spotify using Cassandra

By Matt Brown and Kinshuk Mishra. At Spotify we have over 60 million active users who have access to a vast music catalog of over 30 million songs. Our users can choose to follow thousands of artists and hundreds of their friends and create their own music graph. On our service they also […]


Date-Tiered Compaction in Apache Cassandra

For my master’s thesis, I developed and benchmarked an Apache Cassandra compaction strategy optimized for time series. The result, the Date-Tiered Compaction Strategy (DTCS), has recently been included in upstream Cassandra. We now use it in production at Spotify. Marcus Eriksson has written another blog post about this feature on the DataStax Developer Blog. What […]


Backend infrastructure at Spotify

In this blog post I will give an overview of how we are building our backend infrastructure at Spotify. Our backend infrastructure is very much a work in progress – in some areas we have come a long way and in others we have just started. In order to understand why we are building this infrastructure […]


In praise of “boring” technology

In this article I will explain how Spotify uses different mature and proven technologies in our backend service ecosystem and architecture, and why we do so. In addition, this article will also attempt to explain when Spotify has chosen not to use certain proven technologies, the reasons why, and the pitfalls associated with each. […]