Autoscaling Pub/Sub Consumers

Spotify’s Event Delivery system is responsible for delivering hundreds of billions of events every day. Most of the events are generated in response to a user action, such as playing a song, following an artist, or clicking on an ad. All in all, more than 300 different types of events are being collected from […]

Big Data Processing at Spotify: The Road to Scio (Part 2)

In this part we’ll take a closer look at Scio, including basic concepts, its unique features, and concrete use cases here at Spotify. Basic Concepts: Scio is a Scala API for Apache Beam and Google Cloud Dataflow. It was designed as a thin wrapper on top of Beam’s Java SDK, while offering an easy way […]

TC4D: Data Quality By Engineers, For Engineers

Changing an engineering culture is one of the biggest challenges for any organization. It requires challenging an existing way of working and introducing compelling improvements that individuals and departments alike will adopt. That can mean tackling tech debt instead of amassing it, using new tooling and infrastructure, or, in this case, increasing test coverage. After […]

Big Data Processing at Spotify: The Road to Scio (Part 1)

This is the first part of a two-part blog series. In this series we will talk about Scio, a Scala API for Apache Beam and Google Cloud Dataflow, and how we built the majority of our new data pipelines on Google Cloud with Scio. Scio > Ecclesiastical Latin IPA: /ˈʃi.o/, [ˈʃiː.o], [ˈʃi.i̯o] > Verb: […]

Stepping Up the Cloud Security Game

TL;DR: securing our Cloud infrastructure is incredibly important. We are now taking another step forward by leveraging open source tools we developed in partnership with Google. Spotify engineering teams are fully embracing the DevOps culture: to increase development speed, every dev team is responsible for their operational pipelines. From a security perspective we are continuously […]

Thinking of State in a World of URLs

An intro to Redux-Location-State. Like most of the web community, Spotify has found the combination of React and Redux to be a super powerful tool. It has allowed us to iterate quickly, build powerful components that can be shared across development teams, and onboard new developers much faster than previously. While we generally love what […]

Improving Critical Infrastructure Rollouts

Spotify began using Docker with a few prototype services in 2014. We have upgraded and reconfigured it many times since, and almost every time we came across issues that were hard to detect and fix. When the number of backend services running on Docker was low, the impact of these issues was small. As Docker adoption grew, so did the risk and impact of faulty Docker changes, until they reached unacceptable levels. In October 2016, we deployed a bad configuration change that significantly affected the user experience.

At this point we went back to the drawing board and realized we needed a new solution that deployed fleet-wide infrastructure changes gradually and with more control. This is a story of how operating Docker at Spotify inspired us to build a service that gave us more control over the rollout of infrastructure changes on thousands of servers.

Meet our engineers – Charlie Pastuszenski

What’s your name and where are you from? My name is Charlie and I come from the US and grew up in Massachusetts. Before moving to Stockholm, I lived and worked for Spotify in New York City. What do you do at Spotify? I am a machine learning engineer. I spend my time writing data […]