Database Repartitioning with MergeContentdbs
Just wanted to do a quick post on database management. Todd Klindt did a great post on moving site collections in a single bound that refers to a very nifty new stsadm command intended for SP1, but included in the October 9 public update called mergecontentdbs with details in KB934525. This command allows you to easily and yes, I mean easily move site collections between databases. Managing your sites between databases isn't something that's new with SP1 or even new with this version. Splitting databases is something that Keith Richie was doing with SPSiteManager and IT was doing with WSSSplit.exe back in the WSS 2.0 and SPS 2003 days.
Why would you ever want to split your databases?
If you do nothing to create more you will have one really large content database. You'll also have one large site collection (since SCSignup is off by default in WSS and the Site Directory creates sites not Site Collections by default). So in this worst case scenario where you end up with a 500GB single site collection you might end up finding the database a bit unwieldly from a management perspective. Sure it's possible to have large databases and having a single site collection is the dream state right? Maybe from a content perspective, but not from the day to day management perspective. For 500GB of data you'd find it easier to manage 10 50GB databases, quicker to restore, quick to attach, quicker to detach, and with a multi threaded backup you may find it 5X faster to backup and restore or more. You might also find that creating and deleting sites is slow or even causes SQL blocking in some cases. If you're looking for a good database max size, start with 50 or 100GB. Even with a 5TB SQL server you still only end up with 50 databases, so it really is a decent number to work with. If you change your mind later you can consolidate your databases with mergecontentdbs.
So let's say you turned on the SSC (Self Service Creation) and enabled site collection creation could even be for a specific group, or you created a managed path and enabled it for your site directory creation. Now you end up with lots of site collections. Again by default these all go into a single content database per web application. Sure you should use quota templates thus giving site collections a default quota as they are provisioned. By default you'll end up with 15000 site collections all in one database before it stops you. By no means do I recommend 15000 site collections per databases. Even if your site collections are small I'd recommend somewhere around 200-500 site collections and 50-100 for medium and some databases are great with 1 site collection for those that are going to grow beyond 10GB. Might as well start off putting them in their own database when you know they are going to grow big. STSADM has a command for you here as well ("stsadm -help createsiteinnewdb" for the syntax). You can create site collections in a new content db.
So you're in a situation where you have 500 small sites and one massive site but you find STSADM backup and restore failing to complete or when you do back it up and restore it, it's corrupt. Yep this does happen. Especially with the big ones. My suggestion is even though it may sound counter intuitive, you should move all the small ones into different databases and leave the large one. You'll find it more reliable and it also gives you a chance to split the rest of the site collections as you go.
What you can't do is split sites between databases with STSADM. Two content databases can't share the same site collection. Sites can be promoted. You can use STSADM -o export (be sure to include security) to backup the site and then import it to a new site collection. That new site collection could be in a new database.
I did a instructor led lab at IT Forum where in the class we worked through figuring out how to target new site collections to a specific database without closing off all the other databases. How? Set one database to a super large number say 30,000. All new site collections will be created in that database with the most "capacity" for site collections. Don't believe me? Try it. Prove me wrong.
With all this thought on managability, guess what else Todd did? He posted steps on upgrading from SQL Express to SQL Server Standard or Enterprise. He's my hero for the week. Not only because of these posts, but because of his great STSADM and SQL talk in Barcelona. He also lent me some money when I lost my wallet. Good guy! Hats off to you Todd.
Terms:
Web Application - IIS Virtual Server, IIS Web Site, Application
Site Collections - Top level sites, Site as seen in central admin UI and often in STSADM, SPSite in OM, parent sites
Sites - Webs, site, sub site, sub web, SPWeb in OM, child sites
Workspace - 1) Type of site collection or site. 2) Template such as document workspace or meeting workspace
SITE DB = Content Database (site database is a SPS 2003 term for the content database)