Sharding: In Theory and Practice (Part Five)

Part Five: The Data Warehouse. Welcome to the final installment of this series on database sharding. Following the steps in part two (The Differences Between Algorithmic and Dynamic Sharding) and part three (What’s in a Shard?) of this series, you can achieve a scalable and fault-tolerant architecture, but one […]

Sharding: In Theory and Practice (Part Four)

Part Four: Using Memcached. Welcome back to our blog series on database sharding. As I mentioned in part one of this series, memcached was invented at LiveJournal, and its purpose is to reduce the number of redundant reads hitting their databases. LiveJournal observed that 80% of the traffic accesses only […]

Sharding: In Theory and Practice (Part Three)

Part Three: What’s in a Shard? In the first two posts of this series, I offered a perspective on the origins of database sharding and described the architectural problems with algorithmic sharding that led LiveJournal and TypePad to use dynamic sharding to scale. The next challenge of a sharded architecture […]

Sharding: In Theory and Practice (Part Two)

Part Two: The Differences Between Algorithmic and Dynamic Sharding. In my last post, I pointed to the LiveJournal model as an example of sharding on which many recent Internet companies have based their own implementations. To understand the design decisions of a sharded environment, let’s discuss the differences between […]

Sharding: In Theory and Practice (Part One)

Part One: A Brief History of Sharding. Peter Zaitsev’s keynote at Percona Live NYC 2012 contained a slide with the text, “sharding is messy.” This admission felt like a sea change to me because so many high-growth technology companies today are firmly entrenched in custom-sharded solutions into which they poured their […]