Commoditizing Music Machine Learning: Services

Five years ago, music personalization at Spotify was a tiny team. The team read papers, developed models, wrote data pipelines and built services. Today personalization involves multiple teams in New York, Boston and Stockholm producing datasets, engineering features and serving products to users. Features like Discover Weekly and Release Radar are but the tip of […]


Personalization at Spotify using Cassandra

By Matt Brown and Kinshuk Mishra. At Spotify we have over 60 million active users who have access to a vast music catalog of over 30 million songs. Our users can choose to follow thousands of artists and hundreds of their friends and create their own music graph. On our service they also […]


How Spotify Scales Apache Storm

Spotify has built several real-time pipelines using Apache Storm for use cases like ad targeting, music recommendation, and data visualization. Each of these real-time pipelines has Apache Storm wired to different systems like Kafka, Cassandra, ZooKeeper, and other sources and sinks. Building applications for over 50 million active users globally requires perpetual thinking about scalability […]


Data Processing with Apache Crunch at Spotify

All of our lovely Spotify users generate many terabytes of data every day. All the songs that are listened to, all the playlists you make, all the people you follow, and all the music you share. Somehow we need to organise, process and aggregate all of this into meaningful information out the other side. Here […]