Scio 0.7: a deep dive

Introduction: Large-scale data processing is a critical component of Spotify’s business model. It drives music recommendations, artist payouts based on stream counts, and insights about how users interact with Spotify. Every day we capture hundreds of terabytes of event data, in addition to database snapshots and derived datasets. It’s imperative that engineers who want to […]


(Right to Left (The Mirror World

Localization at Spotify is a big deal. Our mission is to “unlock the potential of human creativity—by giving a million creative artists the opportunity to live off their art and billions of fans the opportunity to enjoy and be inspired by it.” To achieve this mission, it’s important to communicate effectively across the various languages that reflect the diversity of our users. Recently Spotify launched in the North Africa and West Asia regions. One of the languages spoken in these regions is Arabic. Unlike English, which is read from left to right, Arabic is read from right to left. This has implications for websites that want to support Arabic.


An opinionated anthropology of the embedded programmer, its habits and habitat

Illustrations by Jonas Ekman. Registers! Oscilloscopes! Beards! Serial Ports! C! Cycle shaving! Beards! Interrupts! Assembly! Did I mention beards? If I were to say the words “embedded programmer” most people in our industry would immediately conjure up an image of a heroic character. A magnificent developer with arcane skills, an encyclopedic knowledge of the occult, […]


Whacking a million moles: Automated Incident Response Infrastructure in GCP

Incident responders want to have as much information as possible to ease the investigation and triage process. Additionally, intrusion detection engineers want to know about forensic artifacts and map server baselines (running processes, storage artifacts on disk) on a large fleet of servers in order to quickly identify anomalies. This is difficult in the context […]


Building Spotify’s New Web Player

This post tells the story of the new Spotify web player: how and why it came to be. We will focus on the steps that led to a complete rewrite and how the lessons learned influenced the experience and the tech decisions of the new web player for […]




Scalable User Privacy

At Spotify, we have a complex and diverse data processing ecosystem. Our backend infrastructure handles millions of requests per second, which are processed by over a thousand (micro)services. Our batch pipeline environment is equally complex and diverse; we run thousands of jobs written in a variety of frameworks such as Scio, BigQuery, Apache Crunch and […]