Some of my fellow analysts have decreed that, since web-scale organizations like Facebook and Google can run their complex and mission-critical applications at a much lower cost than enterprise data centers can crunch equivalent bunches of numbers, we should all run our data centers the way web-scale guys do. I don't think so.
Web-scale organizations like Facebook and Google run a relatively small number of applications, each of which supports a very large number of users. Yes, Google has tens or hundreds of applications, from the search engine itself to YouTube and Google Mars, but Google applications have to be embraced by hundreds of thousands of users, or, like Google Reader, they'll get the ax.
By contrast, the corporate data center is filled with hundreds or thousands of applications, and many of them may have just 50 or 100 users. Applications such as MRO (maintenance, repair, and overhaul) or HR's benefits calculators abound.
When you're building web-scale applications, every hour of developer time writing the application will be amortized across the thousands of servers and millions of users the application is designed to attract and support. As a result, it's more cost effective to make resiliency an application function, rather than relying on the infrastructure to provide resilience.
If it takes 2,000 programmer hours to build code that writes each uploaded photo to three independent servers and simply tries server No. 2 when a read from server No. 1 fails, that's a lot less than the cost of a SAN that provides the same resiliency.
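To make that idea concrete, here is a minimal Python sketch of application-level resiliency: every upload goes to all three servers, and a read simply moves on to the next server when one fails. The class names (InMemoryReplica, ReplicatedPhotoStore, ReplicaUnavailable) and the in-memory "servers" are hypothetical stand-ins of my own, not anything Facebook or Google actually runs, and a production version would need far more (partial-write handling, repair, consistency checks).

```python
class ReplicaUnavailable(Exception):
    """Raised when a replica cannot serve a request (hypothetical error type)."""


class InMemoryReplica:
    """Hypothetical stand-in for one independent storage server."""

    def __init__(self, name):
        self.name = name
        self.blobs = {}
        self.online = True  # flip to False to simulate a failed server

    def put(self, key, data):
        if not self.online:
            raise ReplicaUnavailable(self.name)
        self.blobs[key] = data

    def get(self, key):
        if not self.online or key not in self.blobs:
            raise ReplicaUnavailable(self.name)
        return self.blobs[key]


class ReplicatedPhotoStore:
    """Writes each photo to every replica; reads fall through on failure."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def upload(self, key, data):
        # Simplest possible policy: write to all replicas in order.
        # A failed write raises; this sketch does not attempt repair.
        for replica in self.replicas:
            replica.put(key, data)

    def read(self, key):
        # Try server No. 1, then No. 2, then No. 3.
        for replica in self.replicas:
            try:
                return replica.get(key)
            except ReplicaUnavailable:
                continue
        raise ReplicaUnavailable(f"no replica could serve {key!r}")
```

With three replicas in the store, taking the first one offline after an upload still lets read() return the photo from the second. The point is that the resilience lives in a few dozen lines of application logic rather than in the storage hardware, which only pays off when that logic is amortized across a very large deployment.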
In the corporate world, we buy, rather than write, many of the applications running in our data centers. Those applications, and application platforms like SQL Server or Oracle, are designed to run on a resilient infrastructure. If SQL Server can't read its databases or write its transaction logs, it doesn't try to access the second copy -- it fails. Hopefully, it fails over to an always-on cluster partner.
Even for the applications an organization develops itself, the added complexity, and the additional testing that complexity requires, mean that building resiliency into the application layer pays off only if the application is sharded across a large number of servers to serve a very large number of users. After all, the 20 TB or 50 TB of resilient disk space a new HR application needs would cost a lot less than those 2,000 hours of coding and testing.
It's also important to note that, though web-scale applications are designed very differently from corporate applications -- even huge corporate applications like an airline ticketing and reservation system -- the web-scale organizations use much more conventional architectures to manage their own businesses. My sources have confirmed that Facebook runs its financial, payroll, and HR applications using database servers that run SQL and store data on dedicated storage arrays.
In short, as the old saying goes, "Bet different horses on different courses." I just don't believe the web-scale application model works for many of the applications in the corporate data center.
Now, should we run our data centers more like cloud computing providers such as Rackspace or HP Helion? That's a more interesting story.