SharePoint Server and the Deathly Hallows

I just finished the latest HP book (Harry Potter and the Deathly Hallows, not Hewlett-Packard).  I also enjoyed the Spider Pig and Harry Plopper bit from The Simpsons Movie.  You can learn a lot about SharePoint from the movies.

You're wondering why I'm discussing movies and books?  This week I'm on vacation.  I just finished Harry Potter and the Deathly Hallows, and I won't give anything away as to whether it's Harry or Voldemort.  I will give you a list of things I'd consider the Hallows: things that are very powerful, but that require thought and planning to wield.

The Hallows, as those of you who've read the books know, contain a ton of power waiting to be unleashed.  Only those who understand and appreciate their power can properly wield it.  There are 3 hallows, all of which relate to proper capacity planning...

1. Large Lists/Records Repositories - The million + item list is powerful, but requires knowledge.  I recommend starting with the scalable lists paper and capacity boundaries article.  Don't underestimate performance considerations; both adding and removing items in really large lists takes thought.  Planning to use folders and indexed columns is a must.

Common mistakes - a programmatic interface adds millions of items with little difference between them.  These are then difficult to work with in the UI, and difficult to consume.  Other mistakes include dumping large file shares into lists without rhyme or reason, with no planning for folder structure, metadata, etc.  Also, now that lists can be email enabled, pointing chatty DLs (distribution lists) at lists without proper planning.

Lists truly are the basic container of structured and unstructured data (tables of information).  Sure, they have folders, but it's the list that contains the properties/columns/metadata, content types, and granular security.  Very powerful things.
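
To make the folder planning concrete, here is a minimal Python sketch of spreading a bulk load across folders so no single folder grows without bound.  The per-folder ceiling of 2,000 items, the two-level layout, and the folder names are all assumptions for illustration, not figures from this post; take your real ceiling from the capacity boundaries article.

```python
from collections import defaultdict

# Assumption: a per-folder ceiling picked by the planner (not a number from this post).
MAX_ITEMS_PER_FOLDER = 2000
# Assumption: group 100 buckets under each top-level folder to keep the root small.
BUCKETS_PER_TOP_FOLDER = 100


def folder_for(item_index, max_per_folder=MAX_ITEMS_PER_FOLDER):
    """Return a hypothetical folder path for the Nth item (0-based) loaded into a list."""
    bucket = item_index // max_per_folder
    top = bucket // BUCKETS_PER_TOP_FOLDER
    return f"batch-{top:03d}/bucket-{bucket:05d}"


def plan_folders(total_items, max_per_folder=MAX_ITEMS_PER_FOLDER):
    """Count how many items would land in each folder for a bulk load."""
    counts = defaultdict(int)
    for i in range(total_items):
        counts[folder_for(i, max_per_folder)] += 1
    return dict(counts)


if __name__ == "__main__":
    plan = plan_folders(1_000_000)
    print(len(plan), "folders, largest holds", max(plan.values()), "items")
```

A programmatic loader could call something like folder_for before each add, so the structure exists by design rather than being retrofitted after the list is already huge.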

2. Massive Databases - The multi-hundred GB to TB database is a powerful thing to wield.  They are easy to create, but take planning to properly back up, consistently restore, and especially to keep performing well (from a backup and blocking perspective).  (I do continue to recommend 50-100GB databases, even when hosting TBs of databases and using Log Shipping or Mirroring.  Snapshots and DPM vNext may change my thoughts on the backup side of this.)  MS IT has had great success with the latest builds of DPM (Data Protection Manager) and is considering turning off their SQL-style backups with SQL LiteSpeed and potentially increasing their databases into the 200-300GB range.  More on this later as results become available.

Common mistake - no backup plan, so backups run during live production hours, impacting service performance.  Tape backups span multiple tapes that are not validated, creating complex restores and likely failures.  In addition, with large production databases and a lack of planning, deleting massive site collections can cause blocking.
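
The arithmetic behind a database size target is simple enough to sketch.  This is a rough Python illustration only: the 100GB default comes from the top of the 50-100GB range recommended above, while the throughput figure in the usage lines is a made-up placeholder you would replace with a number measured in your own environment.

```python
import math


def content_databases_needed(total_gb, target_db_gb=100.0):
    """How many databases are needed to hold total_gb without exceeding the target size."""
    if total_gb < 0 or target_db_gb <= 0:
        raise ValueError("sizes must be non-negative and the target must be positive")
    return math.ceil(total_gb / target_db_gb)


def estimated_backup_hours(db_gb, throughput_gb_per_hour):
    """Naive backup window estimate: size divided by sustained throughput."""
    if throughput_gb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return db_gb / throughput_gb_per_hour


if __name__ == "__main__":
    total = 2048  # 2TB of content, expressed in GB
    print(content_databases_needed(total), "databases at a 100GB target")
    # Assumption: 50 GB/hour is a placeholder, not a measured or recommended figure.
    print(estimated_backup_hours(100, 50), "hours per 100GB database")
```

The point of running numbers like these up front is to know whether each backup fits in your off-hours window before the database grows past it.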

3. Massive Site Collection - Whether this is your portal, your wiki, or a site collection with tons of sites underneath it, this one is controversial: it is easy to create 100GB site collections, but managing the storage properly again takes planning.  Start with the link at the beginning and then follow up with Bill's post, which, although it gives no credit, looks similar to mine and provides further clarification.

Common mistake - The most common case here is a single site collection, likely an intranet portal.  I don't see publishing portals hitting this so much.  A large internet site is 10GB; that isn't what I'm talking about.  I'm talking about the 100GB site collection.  This can still be done well, but it takes planning.  Planning at this level means utilizing more than a single list (see the first hallow), and since it's all in one site collection, it means hitting number 2 as well.
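
Because a site collection lives in a single content database, one oversized site collection forces an oversized database.  The following Python sketch shows one way to reason about that: a simple first-fit-decreasing placement of site collections into databases under a size cap, flagging any site collection that cannot fit at all.  The site names, sizes, and the 100GB cap are hypothetical, and this is a planning aid, not how SharePoint itself assigns site collections.

```python
from typing import Dict, List, Tuple


def pack_site_collections(
    sizes_gb: Dict[str, float], db_cap_gb: float = 100.0
) -> Tuple[List[dict], List[str]]:
    """Place site collections into databases (largest first) without exceeding db_cap_gb.

    Returns (databases, oversized): each database is {"used": GB, "sites": [names]},
    and oversized lists site collections that are bigger than the cap on their own.
    """
    databases: List[dict] = []
    oversized: List[str] = []
    # Largest first; ties broken by name so the plan is deterministic.
    for name, size in sorted(sizes_gb.items(), key=lambda kv: (-kv[1], kv[0])):
        if size > db_cap_gb:
            oversized.append(name)  # needs splitting up or a deliberate exception
            continue
        for db in databases:
            if db["used"] + size <= db_cap_gb:
                db["sites"].append(name)
                db["used"] += size
                break
        else:
            # No existing database has room, so start a new one.
            databases.append({"used": size, "sites": [name]})
    return databases, oversized


if __name__ == "__main__":
    # Hypothetical sizes in GB.
    sizes = {"portal": 120, "hr": 20, "it": 60, "proj-a": 30, "proj-b": 45}
    dbs, too_big = pack_site_collections(sizes)
    for i, db in enumerate(dbs, start=1):
        print(f"db{i}: {db['sites']} ({db['used']} GB)")
    print("oversized:", too_big)
```

In this example the 120GB portal cannot be placed under a 100GB cap no matter how the rest is arranged, which is exactly the situation where the third hallow drags the second one along with it.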

The typical collaboration environment, and most small to medium-sized organizations, will avoid all 3 hallows.  The most likely to hit them are large enterprises looking to use SharePoint for records repositories, and those that didn't read the capacity boundaries article and may be migrating from another environment thinking, hey, one place is better than many.  Environments that try to combine collaboration and portals can end up putting everything in one place, causing a struggle for storage and planning management.  It's for this reason I recommend that enterprises, or environments that push these recommended boundaries, divide their portal from their collaboration environment and use lots of site collections as the units for team collaboration and project workspaces.

You'd ask... "From a search perspective does it really matter?"  You can split the data into site collections, and search in SharePoint Server can still return results across them, so yes, do take advantage of that for better planning using management tools like quotas, ownership, and features.  From a security perspective, splitting it up is not as often the forcing function; planning must be.  Internal collab vs. external open collab might be.

You'll note that the number of site collections is not a hallow; SharePoint scales very well here and it is easy to handle.  Another non-hallow is the number of lists in a site: although you can't put a quota on it or enforce a limit, it really isn't common to hit issues here.  Watch for these Hallows and wield them wisely.

Published Thursday, August 09, 2007 5:44 AM by joelo

Comments

Thursday, August 16, 2007 7:37 AM by Blog del CIIN

# WSS 3.0 & MOSS: Compilation of interesting links (V)

After a few weeks of vacation (short ones :PPP), here we are again at the CIIN, back in the trenches

Tuesday, August 21, 2007 5:24 PM by Darrel

# re: SharePoint Server and the Deathly Hallows

I'm a bit confused as to how one properly plans all of this up front.

Site collections, it seems, are really about separating logical groups of content on your sites for ease of permissions management and back-up restore.

So, HR might have a site collection and IT might have one.

But who's to say HR might only have 20GB of files ever while IT might rocket to hundreds and hundreds of GB of data?

How does one both plan for logical separation of content AND accommodate these size limits? Should we give every department 10 site collections to play with as needed? When one fills up, go to the next? That doesn't make any sense to me from an IA standpoint, but seems to be necessary if there are these 100GB limits all over the place.

Wednesday, August 22, 2007 2:43 AM by joelo

# re: SharePoint Server and the Deathly Hallows

Darrel,

My recommendation is to separate your division portals in your top down deployment from your team and group collaboration if you're planning to store hundreds of GB of files.

If you're referring to a knowledge management system/knowledge repository, you'll likely have someone involved in the management of the storage and how the data is separated.  As the repository grows you should consider archiving, workflows, and mechanisms to keep the relevant content available.  100GB is a lot of data.  If you do need 1TB in a single site collection, it is possible to accomplish this, but it does take some serious planning.

What I note is people getting into problems where they aren't prepared for TBs of data.  They simply dumped all the data in one location without planning information architecture (Sites & Site Collections).

Wednesday, August 22, 2007 10:31 AM by Darrel

# re: SharePoint Server and the Deathly Hallows

Thanks for the advice, Joel! I appreciate it!

Wednesday, October 22, 2008 7:07 AM by Alex blog about Microsoft

# Top SharePoint Storage Resources by Joel Oleson

Thanks to Joel we have a resource list for SharePoint storage. A part of the blog article contains whitepapers
