Going Out
Divide and conquer
Many times you start scaling out without knowing it, and this type of scale-out is the opposite of consolidation. You identify the parts of your system that are resource intensive and move them onto machines of their own. It's just that simple... right? For many scenarios it is just that simple. When you are faced with a choice between buying bigger machines and buying more machines, buying more machines is often considered the safer option. If you can logically break apart the functions in your system into discrete applications that just happen to look like one big system, you are in great shape.
.oO(Imagine how much easier it will be to upgrade application 'X' when it lives on a machine of its own.)
Over the past 25+ years, I have seen consolidation followed by separation followed by consolidation... and so on and so on. I don't think this will change in the near future. ( wow.. this whole scale out thing is kind of boring ... ) The industry has made it easier to manage parts of this process by doing what? Virtualization... another chapter from the mainframe days. It is amazing what you can do with virtualization, but it isn't the topic I plan on talking about....
Forms
'But wait, that form of scaling out works well for groups of data and/or applications that can be separated off to their own machine. I have a need to hit my data store with (make up some number) 80 web servers. They are all producing dynamic content based on data in the database, the data cannot easily be broken apart.... what am I supposed to do to scale this out?'
To make sure I scope this correctly, I am not talking about an MSN, Yahoo, or eBay sized site. That type of scaling out is well beyond the scope of these blog postings. I am really focusing on systems that may have a few million users a month, whose content requires 'a lot' of database server resources, and that are currently pushing the database to its breaking point. This is the meat of the scale out problem I am talking about. More specifically, when you're having problems scaling out your system, the cause is usually access to a data store, not web pages with static content. So most of the discussion about scaling out will focus on how to increase the apparent throughput of your data store, not on how to add more web/front end servers.
There are three basic building blocks used for scaling out. They can be applied in different ways to make very complex architectures, but there are still only three. They are:
- distributed
- fanned out
- fanned in
Distributed (Many-to-many)
When you have a data store that is distributed, you can read or write to any logical component. The result of each write is copied across all of the machines participating in the distribution. This is often referred to as peer-to-peer replication. There are also permutations of this that are used for special purposes, like disaster recovery, where writes go to only one system until that system becomes unavailable and are then sent to another system.
This type of architecture is supported by many off-the-shelf products, like SQL Server.
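To make the many-to-many idea a little more concrete, here is a minimal in-memory sketch in Python. It is not how SQL Server or any other product implements peer-to-peer replication; the Peer class and build_cluster helper are names I made up for illustration, writes are copied synchronously, and conflicting writes are ignored entirely (last write wins).

```python
class Peer:
    """One node in a hypothetical peer-to-peer group (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peers = []

    def connect(self, nodes):
        # Every node knows about every other node in the group.
        self.peers = [node for node in nodes if node is not self]

    def write(self, key, value):
        # A write is accepted locally, then copied to all other peers.
        self.data[key] = value
        for peer in self.peers:
            peer.apply_replicated(key, value)

    def apply_replicated(self, key, value):
        # Assumption: no conflict detection, the last write simply wins.
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)


def build_cluster(names):
    nodes = [Peer(name) for name in names]
    for node in nodes:
        node.connect(nodes)
    return nodes


nodes = build_cluster(["a", "b", "c"])
nodes[0].write("user:1", "alice")  # write lands on node a
nodes[2].write("user:2", "bob")    # write lands on node c
```

After these two writes, every node can answer a read for either key, no matter which node took the write. A real system also has to deal with two peers changing the same row at the same time, which this sketch does not attempt.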

Fanned Out (One-to-many)
When you have a data store that is fanned out, you write to one logical 'master view' of your data and copy that to all of your 'read' servers. Your application supports a higher throughput by spreading its reads across many read servers. This type of architecture is also supported by many off-the-shelf products, like SQL Server.
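A rough Python sketch of the one-to-many shape might look like the following. The class names (ReadReplica, FannedOutStore) are hypothetical, the copy to the read servers happens synchronously inside write(), and reads are handed out round-robin. Real replication is usually asynchronous, so read servers can lag behind the master; that is left out here to keep the example short.

```python
import itertools


class ReadReplica:
    """A read-only copy of the data (illustration only)."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)


class FannedOutStore:
    """One logical master for writes, many replicas for reads."""

    def __init__(self, replica_count):
        self.master = {}
        self.replicas = [ReadReplica() for _ in range(replica_count)]
        # Round-robin iterator used to spread reads across replicas.
        self._next_replica = itertools.cycle(self.replicas)

    def write(self, key, value):
        # All writes go to the master view first...
        self.master[key] = value
        # ...and are then copied out to every read server.
        for replica in self.replicas:
            replica.apply(key, value)

    def read(self, key):
        # Reads never touch the master; each one goes to the next replica.
        return next(self._next_replica).read(key)


store = FannedOutStore(replica_count=3)
store.write("page:home", "hello")
results = [store.read("page:home") for _ in range(6)]
```

Six reads here are served by three replicas, two reads each, while the master only ever sees the single write. That is the whole point of fanning out: adding read servers raises read throughput without adding load on the master.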

Fanned In (Many-to-one)
When you have large amounts of write traffic and need a single view of that data, you often use a fanned in architecture. This architecture is often used for data collection. There are few, if any, out-of-the-box solutions for this right now. There are plenty of tools that help you build this yourself, like SQL Server Integration Services (SSIS), but you still need to create custom logic to make this work.
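Since fanned in usually means custom logic, here is one possible shape for it, sketched in Python. This has nothing to do with how SSIS works; Collector and CentralStore are made-up names, and the "merge" is a simple batch pull that tags each row with the collector it came from. Treat it as a sketch under those assumptions.

```python
class Collector:
    """A write-heavy front-end store that buffers rows locally."""

    def __init__(self, name):
        self.name = name
        self.pending = []

    def record(self, event):
        # Writes are cheap: they only touch this collector's local buffer.
        self.pending.append((self.name, event))

    def drain(self):
        # Hand over everything collected so far and start a fresh buffer.
        batch, self.pending = self.pending, []
        return batch


class CentralStore:
    """The single view that all collectors feed into."""

    def __init__(self):
        self.rows = []

    def merge(self, collectors):
        # Pull one batch from each collector into the combined view.
        for collector in collectors:
            self.rows.extend(collector.drain())
        return len(self.rows)


collectors = [Collector("web1"), Collector("web2")]
collectors[0].record("login")
collectors[1].record("click")
collectors[0].record("logout")

central = CentralStore()
total = central.merge(collectors)
```

The collectors take the write traffic, and the central store only sees periodic batches. The custom logic in a real system is mostly about what this sketch skips: scheduling the merges, handling duplicates, and recovering when a batch fails halfway through.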

These three forms are the basic building blocks for making complex systems (even clouds). Each one has its own pros and cons. You can treat any database in these diagrams as a logical database, which may itself be implementing scale-out when physically implemented.
I wanted to present some very general ideas about scaling-up and scaling-out before going on to real issues we have had to address in performance. Now I can get into some more specific details.