ELS: a latency-based load balancer, part 2

What to Measure? In part 1, we already mentioned a few metrics that the load balancer should consider: the success latency ℓ and success rate s of each machine, and the number of outstanding requests q between the load balancer and each machine. These are the requests that have been sent out but haven’t received a […]

ELS: latency based load balancer, part 1

Load Balancing. Most Spotify clients connect to our back-end via accesspoint, which forwards client requests to other servers. In the picture below, the accesspoint has a choice of sending each metadataproxy request to one of 4 metadataproxy machines on behalf of the end user. The client should get a quick reply from our servers, so if one machine becomes too slow, it […]

Underflow bug

All of us are familiar with overflow bugs. However, sometimes you write code that counts on overflow. This is a story where overflow was supposed to happen but didn’t, hence the name underflow bug. Round-robin. In our Java implementation of the round-robin algorithm, we store the number of connections in the variable size and then call index() % size to […]
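
The excerpt is cut off before the bug itself is described, so the following does not try to reproduce it. It is only a minimal sketch of the round-robin step the excerpt names, assuming that index() returns an ever-growing int counter. The class RoundRobinSketch, the helpers naiveSlot and safeSlot, and the String connections are hypothetical names chosen for illustration, not taken from the post. The sketch shows one well-known interaction between integer overflow and % in Java: the remainder keeps the sign of the dividend, so a counter that has wrapped around to a negative value yields a negative slot unless Math.floorMod is used.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; none of these names come from the original post.
final class RoundRobinSketch {
    private final AtomicInteger counter;
    private final List<String> connections;

    RoundRobinSketch(List<String> connections, int start) {
        this.connections = List.copyOf(connections);
        this.counter = new AtomicInteger(start);
    }

    // Assumed stand-in for the excerpt's index(): returns the counter, then increments it.
    // The int wraps from Integer.MAX_VALUE to Integer.MIN_VALUE.
    int index() {
        return counter.getAndIncrement();
    }

    // Plain remainder: the result has the sign of the index, so it can be negative.
    static int naiveSlot(int index, int size) {
        return index % size;
    }

    // Floor modulus: the result is always in [0, size) for a positive size.
    static int safeSlot(int index, int size) {
        return Math.floorMod(index, size);
    }

    // Picks the next connection in round-robin order.
    String next() {
        return connections.get(safeSlot(index(), connections.size()));
    }

    public static void main(String[] args) {
        // Start at the largest int so the second call sees a wrapped counter.
        RoundRobinSketch rr = new RoundRobinSketch(List.of("a", "b", "c"), Integer.MAX_VALUE);
        System.out.println(rr.next());
        System.out.println(rr.next());
        System.out.println("naive slot after wrap: " + naiveSlot(Integer.MIN_VALUE, 3));
        System.out.println("safe slot after wrap:  " + safeSlot(Integer.MIN_VALUE, 3));
    }
}
```

With the plain remainder, a negative slot passed to List.get would throw an IndexOutOfBoundsException. Math.floorMod avoids that, although the rotation can still skip or repeat a slot at the moment the counter wraps when size does not divide 2^32 evenly.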

How to shuffle songs?

At Spotify we take user feedback seriously. We noticed some users complaining about our shuffling algorithm playing a few songs from the same artist one right after another. The users were asking, “Why isn’t your shuffling random?” We responded, “Hey! Our shuffling is random!” So who was right? As it turns out, both we and […]