Using Disk to Scale

One of the largest concerns when developing an infrastructure for a site as large as Comcast.net is determining smart ways to scale. By smart, I mean requiring the least effort to launch new channels or services. Each new channel or page can draw thousands, if not millions, of page views. You need to plan for it.

When growing Cofax at Knight Ridder, we hit a nasty bump in the road after adding our 17th newspaper to the system. Performance wasn’t what it used to be and there were times when services were unresponsive.

A project was started to resolve the issue and look for ‘the smoking gun’. The thinking was that the database, being as well designed as it was, could not be the issue, even though our classic symptom was a rapidly growing number of db connections right before a crash. So we concentrated on optimizing the application stack.

I disagreed and argued repeatedly that it was our database that needed attention. We first needed to tune queries and indexes and, if required, be willing to pre-calculate data on writes and avoid joins by developing a set of denormalized tables. It was a hard pill for me to swallow since I was the original database designer. It turned out to be harder for everyone else! Consultants were called in. They declared the db design to be just right and said the problem must be in the application.
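To make the idea of pre-calculating on writes and reading from denormalized tables concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is an assumption for illustration: the table names (`section`, `article`, `article_listing`, `section_stats`) and the helper functions are hypothetical and are not the Cofax schema. The write path does the extra work once so that reads need neither a join nor a COUNT.

```python
import sqlite3

# Hypothetical schema for illustration only; not the Cofax schema.
SCHEMA = """
CREATE TABLE section (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE article (
    id         INTEGER PRIMARY KEY,
    section_id INTEGER NOT NULL REFERENCES section(id),
    headline   TEXT NOT NULL
);
-- Denormalized read table: the section name is copied onto every row.
CREATE TABLE article_listing (
    article_id   INTEGER PRIMARY KEY,
    section_name TEXT NOT NULL,
    headline     TEXT NOT NULL
);
CREATE INDEX idx_listing_section ON article_listing (section_name, article_id);
-- Pre-calculated aggregate, maintained on every write.
CREATE TABLE section_stats (
    section_id    INTEGER PRIMARY KEY,
    article_count INTEGER NOT NULL DEFAULT 0
);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_section(conn, name):
    """Insert a section and its zeroed stats row in one transaction."""
    with conn:
        section_id = conn.execute(
            "INSERT INTO section (name) VALUES (?)", (name,)
        ).lastrowid
        conn.execute(
            "INSERT INTO section_stats (section_id) VALUES (?)", (section_id,)
        )
    return section_id


def add_article(conn, section_id, headline):
    """Write the normalized row, then update the denormalized copies."""
    with conn:
        (section_name,) = conn.execute(
            "SELECT name FROM section WHERE id = ?", (section_id,)
        ).fetchone()
        article_id = conn.execute(
            "INSERT INTO article (section_id, headline) VALUES (?, ?)",
            (section_id, headline),
        ).lastrowid
        conn.execute(
            "INSERT INTO article_listing (article_id, section_name, headline) "
            "VALUES (?, ?, ?)",
            (article_id, section_name, headline),
        )
        conn.execute(
            "UPDATE section_stats SET article_count = article_count + 1 "
            "WHERE section_id = ?",
            (section_id,),
        )
    return article_id


def headlines(conn, section_name):
    """Read path: a single indexed table, no join."""
    rows = conn.execute(
        "SELECT headline FROM article_listing "
        "WHERE section_name = ? ORDER BY article_id",
        (section_name,),
    )
    return [row[0] for row in rows]


def article_count(conn, section_id):
    """Read path: the count was calculated at write time."""
    (count,) = conn.execute(
        "SELECT article_count FROM section_stats WHERE section_id = ?",
        (section_id,),
    ).fetchone()
    return count
```

The trade-off is the one the title hints at: the same data is stored more than once, and every write path (including a section rename, which this sketch does not handle) has to keep the copies in step.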

After two months of the team pushing numerous releases that were meant to resolve the issue, to no avail, we came back to my original arguments. The terrific thing was that restructuring the database was painless: we had a terrific service layer between the main web tier and the db that hid its schema. We were able to deliver a release of the database that did not require any code changes on the web tier.
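A service layer that hides the schema might look something like the sketch below. It is only an illustration of the pattern, not the Cofax code: the `HeadlineService` interface, both implementations, and `render_section` are hypothetical names. The point is that the stand-in for the web tier depends only on the interface, so the storage behind it can move from joined tables to a denormalized one without the caller changing.

```python
import sqlite3
from typing import List, Protocol


class HeadlineService(Protocol):
    """The only surface the web tier is allowed to see (hypothetical)."""

    def add(self, section: str, headline: str) -> None: ...

    def headlines(self, section: str) -> List[str]: ...


class NormalizedService:
    """Backed by normalized tables; reads need a join."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.executescript("""
            CREATE TABLE section (id INTEGER PRIMARY KEY,
                                  name TEXT NOT NULL UNIQUE);
            CREATE TABLE article (id INTEGER PRIMARY KEY,
                                  section_id INTEGER NOT NULL
                                      REFERENCES section(id),
                                  headline TEXT NOT NULL);
        """)

    def add(self, section: str, headline: str) -> None:
        with self.db:
            self.db.execute(
                "INSERT OR IGNORE INTO section (name) VALUES (?)", (section,)
            )
            self.db.execute(
                "INSERT INTO article (section_id, headline) "
                "SELECT id, ? FROM section WHERE name = ?",
                (headline, section),
            )

    def headlines(self, section: str) -> List[str]:
        rows = self.db.execute(
            "SELECT a.headline FROM article a "
            "JOIN section s ON s.id = a.section_id "
            "WHERE s.name = ? ORDER BY a.id",
            (section,),
        )
        return [row[0] for row in rows]


class DenormalizedService:
    """Same interface, backed by one flat table; reads need no join."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.executescript("""
            CREATE TABLE article_listing (id INTEGER PRIMARY KEY,
                                          section_name TEXT NOT NULL,
                                          headline TEXT NOT NULL);
            CREATE INDEX idx_listing ON article_listing (section_name, id);
        """)

    def add(self, section: str, headline: str) -> None:
        with self.db:
            self.db.execute(
                "INSERT INTO article_listing (section_name, headline) "
                "VALUES (?, ?)",
                (section, headline),
            )

    def headlines(self, section: str) -> List[str]:
        rows = self.db.execute(
            "SELECT headline FROM article_listing "
            "WHERE section_name = ? ORDER BY id",
            (section,),
        )
        return [row[0] for row in rows]


def render_section(service: HeadlineService, section: str) -> str:
    """Stand-in for the web tier: it never sees SQL or table names."""
    return "\n".join(service.headlines(section))
```

Swapping `NormalizedService` for `DenormalizedService` changes nothing in `render_section`, which is the property that let the database be restructured without touching the web tier.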

There is no silver bullet here. For smaller sites, taking this route adds a great deal of complexity and is most likely not advisable. However, if you have a large site that is thrashing under the demands of growth, take a hard look.

Related, and in support of this:

High Scalability: How I Learned to Stop Worrying and Love Using a Lot of Disk Space to Scale

High Scalability: Scaling Secret #2: Denormalizing Your Way to Speed and Profit

Dare Obasanjo: When Not to Normalize your SQL Database

One thought on “Using Disk to Scale”

  1. Thanks for sharing. What was the performance increase after the denormalization?
