Jacob Kaplan-Moss

"Web Scale"

I wrote this post in 2010, more than 12 years ago. It may be very out of date, partially or totally incorrect. I may even no longer agree with this, or might approach things differently if I wrote this post today. I rarely edit posts after writing them, but if I have there'll be a note at the bottom about what I changed and why. If something in this post is actively harmful or dangerous please get in touch and I'll fix it.

Christophe Pettus:

What does [“web scale”] mean?

It clearly means something along the lines of, “Can handle lots of transactions per unit time,” but how many?

I mean, WordPress with WP-SuperCache is “web scale” if all that is meant is, “Can be used to implement a high volume site,” but I assume those who are touting something as “web scale” are aiming higher than that.

Anyone care to offer a quantitative definition of this term?

Since I tend to use “web scale” to describe the types of problems we try to tackle at Revsys, I figure I should take a stab at answering Christophe’s question.

Like nearly everything about our industry, there’s sadly a bit of hype in the term “web scale.” It seems that many like to use the term as a fancy synonym for “big,” and I think it’s that sloppy hyperbolic use that Christophe (and I) object to. It’s easy to say “ooh, we do a lot of traffic – we’re Web Scale!”

But I think this ignores a fundamental difference between traffic patterns on modern web sites and other sorts of traffic. Most successful web sites – and certainly the ones I’d call “web scale” – have a strong social aspect to them. These sites aren’t straightforward read/write operations, but instead exhibit strong network effects.

Why does that matter?

Two words: Reed’s law. Reed’s law states that “the utility of large networks, particularly social networks, can scale exponentially with the size of the network.” The proof’s pretty simple: if you have N people in a network, you have 2^N possible groups (i.e. subnetworks).

Now, Reed’s law talks about utility – value, roughly – but it isn’t hard to see that traffic across social networks follows similar laws. Users use networks more as they make more connections, and as the network size grows there are more possible connections to make. Each new user adds a lot more than a linear increase to traffic and resource use. Sure, it’s not literally 2^N growth, but the point is that sites exhibiting network effects see traffic grow at a rate far beyond linear.
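To get a feel for how quickly these numbers diverge, here is a rough sketch in Python. The function names are made up for illustration, and the counts are upper bounds on what could exist in a network of N users, not a model of real traffic. It compares the number of users, the number of possible one-to-one connections (N(N-1)/2), and the number of possible groups of two or more people (2^N - N - 1, which is 2^N subsets minus the empty set and the N single-person "groups").

```python
def possible_connections(n: int) -> int:
    """Distinct pairs of users: N choose 2."""
    return n * (n - 1) // 2


def possible_groups(n: int) -> int:
    """Subsets with at least two members: 2^N minus the empty set and N singletons."""
    return 2**n - n - 1


# Hypothetical network sizes, chosen only to show how fast the columns diverge.
for users in (5, 10, 20, 40):
    print(
        f"{users:>3} users: "
        f"{possible_connections(users):>4,} possible connections, "
        f"{possible_groups(users):,} possible groups"
    )
```

Going from 10 to 20 users doubles the user count, roughly quadruples the possible connections (45 to 190), and takes the possible groups from about a thousand to over a million. Real traffic lands nowhere near that last column, but it shows why planning for linear growth falls short.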

To me, then, “web scale” describes the tendency of modern sites – especially social ones – to grow at (far-)greater-than-linear rates. Tools that claim to be “web scale” are (I hope) claiming to handle rapid growth efficiently and not have bottlenecks that require rearchitecting at critical moments.

The implication for “web scale” operations engineers is that we have to understand this non-linear network effect. Network effects have a profound impact on architecture, tool choice, system design, and especially capacity planning.

So I think “web scale,” despite the hype and hyperbole, is an important concept.