Long ago in the early days of DoubleClick (really early) we built a data center out of our mail room. We filled it to the brim. It would get very warm and we used up all the power in the building.
Obviously, this was not ideal. We then bought some hosting from a class 1 data center provider: raised floors, redundant cooling, power generators, UPS room, etc. We had hardware running in both locations.
Something strange then happened. The “real” data center had far more outages over the next year than the mail room.
Not much has changed. Data centers are not all that reliable: 4 nines (99.99% uptime) at best. Yet failures are rare enough that most software systems don’t cut over well. They are rare enough that failovers don’t get tested enough, and rare enough that one doesn’t always get around to all the extra work this would require anyway.
4 nines, yet we want more. 5 nines would be nice, but can you really get there? And at what cost? There is another option: 3 nines! That is to say, either make your data center really, really reliable, or don’t bother at all.
Let’s call this approach a “Redundant Array of Inexpensive Data Centers” (RAIDC).
- + Cheaper. Forget the generators, forget the UPS room, forget multiple egress, forget everything fancy. Just let it go down. We have more data centers, and we know how to fail over.
- + We can then actually fail over successfully. Outages happen often enough that we really have to make the failover work.
- - We do then have to build a system that actually can fail over to a new data center. That requires work and cost. But didn’t you want that anyway?
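
To put rough numbers on the nines, here is a minimal back-of-the-envelope sketch. The function names are made up for illustration, and the combined figure rests on two large assumptions: data center failures are independent of each other, and failover is instant and always works. Real deployments will fall short of both, so treat the result as an upper bound rather than a prediction.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def availability(nines: int) -> float:
    """Availability for a given number of nines, e.g. 3 -> 0.999."""
    return 1 - 10 ** -nines


def downtime_minutes_per_year(avail: float) -> float:
    """Expected minutes of downtime per year at the given availability."""
    return (1 - avail) * MINUTES_PER_YEAR


def array_availability(avail: float, sites: int) -> float:
    """Availability of `sites` data centers where any one can serve traffic.

    ASSUMPTION: failures are independent and failover is instant and
    perfect, so the array is down only when every site is down at once.
    """
    return 1 - (1 - avail) ** sites


# Downtime per year for a single data center at 3, 4, and 5 nines.
for nines in (3, 4, 5):
    minutes = downtime_minutes_per_year(availability(nines))
    print(f"{nines} nines: {minutes:.1f} minutes of downtime per year")

# Two cheap 3-nines data centers, under the assumptions above.
pair = array_availability(availability(3), sites=2)
print(f"two 3-nines sites: {pair:.6f} availability")
```

A single 3-nines site is down roughly 8.8 hours a year, against about 53 minutes for 4 nines and about 5 minutes for 5 nines. Under the idealized assumptions, two 3-nines sites together come out near 6 nines on paper, which is the whole appeal of the approach. How close you get depends entirely on how well the failover actually works.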