Something that's tough to find any information on is Site Collection sizing. Let me offer up my thoughts on the topic and share some history.
The largest single site collection I've come across was 600GB all in a single SPS 2003 portal (top level site collection). The largest web app I've come across is 3.5TB which contained 12K site collections in 70 databases across 2 named SQL instances.
When all the data is in one site collection, the data is in one content database. Site collections are commonly located in /sites or some other inclusion path. You can recognize a site collection by the fact that it has its own quota and owners, and when you're on site settings you'll see site collection administration features like usage reports, storage management, audit reports, etc.
History... In the WSS 2.0 release, a site collection could not be deleted if it was larger than 2GB; you actually had to delete enough data to bring it under 2GB. This was corrected in SP1. There are a couple of KBs on the topic, including a link to the hotfix. The WSS FAQ site has some additional background on the blocking that can occur in WSS 2.0 during long-running operations such as deletions and site collection backups. There are no 25 or 50GB limits; these are simply legacy recommendations based on tape backups in IT.
In MS IT on WSS 2.0, the max site collection size at RTM was limited to 2GB. After the hotfix (included in SP1) they increased it to 5GB. This was a number I came up with when trying to determine what could safely be backed up and deleted quickly (within 20 minutes under heavy load). Under certain circumstances we arranged for specific businesses to go beyond our own internal recommendation of 5GB and allowed 10GB, and we even had one that was 20GB.
When I was in the hosting business, we decided we really needed to be able to support even larger site collections. Our comfort level after some testing was to put these large site collections in their own content databases, which removed any need to run site-level backups with stsadm or a site delete operation (removing the content db via the UI and deleting the database, if we ever needed to, would suffice).
In WSS 2.0 I recommend staying with 5GB-10GB as a maximum unless you have site collections in their own databases (I realize this doesn't scale). It took years for a few hundred sites out of a hundred thousand to reach 5GB. With planning, very, very few were allowed to grow larger, which kept the number of content databases to around 60-70 for 3TB. From support I hear 8-10GB is common on the large side for individual site collections.
The great news: in WSS 3.0, long-running site collection operations are far more efficient. My buddy Mike over in IT ran tests with 15GB site collections.
15GB stsadm backup < 1 hour
15GB stsadm restore = 1 hour
15GB stsadm delete site collection = 19 minutes
His comments: Operations are completing successfully, and performance has improved.
Although I'm very encouraged, the 19 minutes is pretty close to my 20-minute comfort level. From a SQL perspective I don't like seeing long-running operations that last more than 20 minutes. In a dedicated database I'm more comfortable with locking. My new 5GB is 15GB for WSS 3.0. Note: *There are no hard-coded limits for maximum site collection size.* For MOSS 2007 (Intranet Portal, Internet Sites, etc.) and WSS 3.0 top-level site collections, I'm comfortable with these large site collections in their own databases scaling into the hundreds of GB and more. This recommendation is more about manageability of the sites and databases. In addition, the stsadm import/export commands offer alternate options for splitting up site collections if necessary, with full fidelity, an improvement over smigrate. I recommend staying under 256 databases maximum per active SQL instance, and 100 content dbs for any particular web app on one SQL instance, for performance and manageability reasons. Connection pooling becomes less efficient as the number of databases grows.
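To make the 20-minute comfort level concrete, here is a rough Python sketch that extrapolates from the single 15GB delete measurement above. The linear scaling is my assumption for illustration only; it is not a measured curve, and real timings depend on load and hardware.

```python
# Figures reported above for a 15GB site collection on WSS 3.0.
REF_SIZE_GB = 15
REF_DELETE_MINUTES = 19
COMFORT_LIMIT_MINUTES = 20  # comfort level for long-running operations


def estimated_delete_minutes(size_gb):
    """Linear extrapolation from one data point (an assumption, not a measurement)."""
    return size_gb * REF_DELETE_MINUTES / REF_SIZE_GB


def within_comfort(size_gb, limit_minutes=COMFORT_LIMIT_MINUTES):
    """True if the estimated delete time stays at or under the comfort limit."""
    return estimated_delete_minutes(size_gb) <= limit_minutes


for size in (5, 10, 15, 20):
    minutes = estimated_delete_minutes(size)
    print(f"{size:>3} GB -> ~{minutes:.1f} min, within comfort: {within_comfort(size)}")
```

Under that assumption, anything much past 15GB lands over the 20-minute mark, which is why the larger site collections go into their own databases.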
When I say dedicated database, I'm suggesting that the "maximum number of sites" on the content database be set to 1. (Sites in this context is referring to site collections...)
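The database-count guidance (under 256 databases per active SQL instance, 100 content dbs per web app on one instance) can be checked with a small Python sketch like the one below. The function name and the input shape are hypothetical, and it only counts the content databases you pass in, so leave headroom for config, search, and other databases on the same instance.

```python
from collections import Counter

MAX_DBS_PER_SQL_INSTANCE = 256     # recommended ceiling per active SQL instance
MAX_CONTENT_DBS_PER_WEB_APP = 100  # per web app on one SQL instance


def check_database_counts(content_dbs):
    """content_dbs: iterable of (sql_instance, web_app) pairs, one per content database.

    Returns a list of warning strings; an empty list means within the guidance.
    """
    content_dbs = list(content_dbs)
    per_instance = Counter(instance for instance, _ in content_dbs)
    per_web_app = Counter(content_dbs)  # keyed by (instance, web_app)

    warnings = []
    for instance, count in sorted(per_instance.items()):
        if count > MAX_DBS_PER_SQL_INSTANCE:
            warnings.append(f"{instance}: {count} databases on one SQL instance")
    for (instance, web_app), count in sorted(per_web_app.items()):
        if count > MAX_CONTENT_DBS_PER_WEB_APP:
            warnings.append(f"{web_app} on {instance}: {count} content databases")
    return warnings


# Hypothetical layout: 70 content databases split across 2 named instances.
layout = [("SQL1", "portal")] * 35 + [("SQL2", "portal")] * 35
print(check_database_counts(layout))
```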
How do I get the site collection into its own db? These steps are very similar for both WSS 2.0/SPS 2003 and WSS 3.0/MOSS 2007. (You may want to turn off self-service provisioning while you do this.)
1) Back up the site collection.
2) In Central Admin, under Content Database Administration, create a new content database, then set its maximum number of sites to a very high number like 50,000, way more than the others.
3) Back up the content database.
4) Delete the site collection.
5) Restore the site collection.
6) Set the maximum number of site collections to 1 (this should be the same as the current number of site collections).
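Steps 1, 4 and 5 are the ones that go through stsadm. As a minimal sketch, this Python helper builds those three command lines using the stsadm backup, deletesite, and restore operations; the site URL and backup path are hypothetical, and you should confirm the options with `stsadm -help` on your own version. Steps 2, 3 and 6 are done in Central Admin and SQL and are not scripted here.

```python
import subprocess


def stsadm_commands(site_url, backup_file):
    """Return the stsadm invocations for steps 1, 4 and 5 as argument lists."""
    return {
        "backup": ["stsadm", "-o", "backup", "-url", site_url, "-filename", backup_file],
        "delete": ["stsadm", "-o", "deletesite", "-url", site_url],
        "restore": ["stsadm", "-o", "restore", "-url", site_url, "-filename", backup_file],
    }


def run_steps(site_url, backup_file, dry_run=True):
    """Print the commands in order (dry run) or execute them, stopping on the first failure."""
    commands = stsadm_commands(site_url, backup_file)
    for step in ("backup", "delete", "restore"):
        args = commands[step]
        if dry_run:
            print(" ".join(args))
        else:
            subprocess.run(args, check=True)


# Hypothetical URL and path; dry run only prints the commands.
run_steps("http://server/sites/big", r"C:\backups\big.bak")
```

The dry run is the default on purpose: the new content database from step 2 has to exist and be the one with the most room before the delete and restore run for real.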
There is some great reading on capacity boundaries on TechNet in the planning performance section. Notice the create site vs. enumerate site chart, then scroll down to the throughput vs. site collections chart, then flat doc library and indexed view vs. view by folder. The key to all these? You don't need tens of thousands of site collections... well, maybe you do if you're talking about My Sites in a large enterprise or digital lockers for a huge campus. Otherwise 10K goes a long way, and not all these site collections need to be in the same database. I don't expect to ever see 50,000 site collections in a single database. You'd be surprised what 250 site collections in a database can grow to. I don't want to get into scaling lists right here, but do know that this content is available and holds the keys to scaling them out: indexed views and/or folders.
Summary:
There are no hard limits post WSS 2.0 SP1/SP2. In WSS 2.0/SPS 2003 I recommend you pick a maximum site collection quota of no larger than 10GB where sites are in shared content databases or are not the top-level site collection.
In WSS 3.0/MOSS 2007 I recommend you pick a maximum site collection quota of no larger than 15GB, excluding the top-level site collection, and move sites to dedicated databases where they need to grow beyond this. SharePoint content databases can scale to hundreds of GB and SharePoint farms can handle TBs of content. These recommendations are based on my experience with MS IT and on conversations with product support, and your experience may vary.
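The summary boils down to a simple rule of thumb, sketched here in Python. The function name and return strings are hypothetical, and the numbers are the recommendations above, not product limits.

```python
# Recommended maximum quota for site collections in shared content databases.
RECOMMENDED_MAX_QUOTA_GB = {
    "WSS 2.0": 10,  # also SPS 2003
    "WSS 3.0": 15,  # also MOSS 2007
}


def placement(version, expected_size_gb, is_top_level=False):
    """Suggest where a site collection should live under the recommendations above."""
    limit = RECOMMENDED_MAX_QUOTA_GB[version]
    if is_top_level or expected_size_gb > limit:
        return "dedicated content database (maximum number of sites = 1)"
    return f"shared content database, quota <= {limit} GB"


print(placement("WSS 3.0", 12))
print(placement("WSS 3.0", 40))
print(placement("WSS 2.0", 3, is_top_level=True))
```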
Additional Resources:
SharePoint Server 2007 Site Directory - Want Site Collections?
What you need to know about SharePoint capacity planning in 2003 & 2007 (Good section about site collections and content databases)
How large for a single SharePoint content database?
TechNet: Plan for software boundaries (Office SharePoint Server)
<Update 1/31/07>
TechNet: Planning for enterprise content storage (A lot of this planning data came from perf testing and from the perf team. Your own results may vary.)
WSS 2.0 IT Documentation: Configuring Site Collection Quotas (this applies to WSS 2.0 only, but quotas and quota templates didn't change much between WSS 2.0 and WSS 3.0; the UI has changed). NOTE: I doubt you'll find the SQL update command shown there anywhere else. It's an extremely rare exception, and I even think product support would recommend you not run the commands. You can update site collections individually to use higher quotas. The rub here is: what if you started out with a template with a really low quota, then later decided you wanted to increase the quotas en masse? Obviously we recommend doing this via the object model (OM), since there isn't a UI to update lots of sites using existing quota templates. Those commands do work if you do everything right. If you're not a SQL guru, stay away from them entirely. You should never do any inserts or updates against the real production database. I was really surprised to see it when I was looking for other info on this topic. It is a contradiction. A good SQL tip: always run the statement as a select first.
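The "select first" tip is general SQL hygiene, so here is a minimal Python sketch of it against a throwaway in-memory SQLite table. This is not a SharePoint database, and the table and column names are made up for illustration. The idea is to run the SELECT with the exact WHERE clause the UPDATE will use, then only commit if the UPDATE touched the same number of rows.

```python
import sqlite3

# Scratch table standing in for "rows I think I want to change" (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_sites (id INTEGER PRIMARY KEY, quota_mb INTEGER)")
conn.executemany("INSERT INTO demo_sites (quota_mb) VALUES (?)", [(100,), (100,), (5000,)])
conn.commit()

where = "quota_mb < ?"
params = (500,)

# 1) Look before you leap: SELECT with the same WHERE clause the UPDATE will use.
to_change = conn.execute(
    f"SELECT id, quota_mb FROM demo_sites WHERE {where}", params
).fetchall()
print("rows that would change:", to_change)

# 2) Run the UPDATE, and commit only if it touched exactly the rows previewed.
cursor = conn.execute(f"UPDATE demo_sites SET quota_mb = 5000 WHERE {where}", params)
if cursor.rowcount == len(to_change):
    conn.commit()
else:
    conn.rollback()
```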
</update>
Is there any way to hard-code a maximum size for a document library?
Thanks for the reference to the WSS FAQ item. I've added your specific comment/correction on that particular part of it and a link to this blog item for good measure.
Mike
Mike, thanks for updating your post. I've cleaned this up appropriately.
j, the easiest way to limit a library would be to put the library in its own site collection and limit it with a quota. Otherwise, you could build your own event-based limit. There is no built-in limit, but having an event that checks the numbers and limits the number of items would be possible.
Joel,
I'm not sure I understand why you would restrict a WSS 3.0/MOSS 2007 site to 15GB, but leave the top-level site collection unrestricted and be happy to let that grow to hundreds of GB. Doesn't this cause problems? Is the real reason for the 15GB limit elsewhere just to keep small site collections small and manageable? If the top-level site can grow to hundreds of GB, wouldn't that make the prospect of backing it up and recovering it rather worrying? I'm just trying to figure out what recommendations to make for an organisation of 2000 people, with file-based content of around 500GB at present. I'd like to be able to use site columns and content types across the whole organisation, but this looks impossible with the 15GB limit, and fine if I just put everything in a single large site collection. Any thoughts?
Gareth... here's why...
Using the stsadm tools works well for smaller sites. For massive sites (15GB+) it's better not to have to rely on stsadm. The command line tools scale well up to the tens of GB, but they get dog slow the larger you go. You'll also start to see blocking, deadlocks, etc. You need to accomplish these long-running processes in a reasonable amount of time.
Why do it at the content database level? SQL tools were designed for large multi GB/TB databases. You don't see the locking when backing up at that level. The SQL tools were designed for that type of consistency and online backup.
The better you understand the impact of long-running processes on your farm, and how they affect server performance as well as end-user performance, the more you'll see why super large site collection operations with stsadm aren't as clean as normal database operations.
All this still doesn't change the fact that you CAN have massive site collections all in the same database. The question ends up being: at what point do you want to split for optimization and isolation? It's a performance consideration. It's not obvious or well understood, and it's difficult to communicate, since customers have such different requirements.
We are currently in the process of implementing a MOSS site here in Vienna for a client that has TBs of data on properties they own throughout Europe (pics, contracts, accounting, etc.). We have decided that we would like to build separate content DBs for each country, mostly due to backup and restore issues. The problem is that when we have multiple site collections, there is no easy way for the end user to view libraries from one site collection in another, other than a data view, which is very limiting.
Any suggestions?