Is the File Server Dead?
I have seen a number of threads suggesting that file servers are dead, and others from people who say they are confused. Let me take this head on...
Collaborative File Shares are on their way out... to SharePoint doc libraries
The file servers that end users are used to, where you copy a Word doc and send a link to an associate or friend, are on their way out. The days may be numbered for the U: or M: drive, or whatever your company uses for sharing your collaborative data, and for the S: or N: drive, or whatever it uses for team file sharing. Team shares currently sit on long UNC paths (\\mystorage\users\joelo\docs\ and \\myshared\marketing\collateral\) for sharing Office or collaborative files... there are more efficient ways of sharing those files that make them easier to find, consume, and use with contextual collaboration.
Before the party breaks out and the monolithic servers look to move all their data into the SharePoint platform, let me try to hold these virtual horses. File Storage is *NOT* dead.
File Storage is not Dead
There are many extremely useful scenarios for intelligent file storage. Let me first separate file storage from file sharing. End user sharing of collaborative files is a great scenario for the SharePoint platform, whether that happens to be Windows SharePoint Services or Office SharePoint Server. Let me break down some of the top features of document libraries and document centers vs. an intelligent Windows Server 2003 file server.
Table: Simplified Comparison of File Server file features and SharePoint Server file features
| Windows 2003 R2 File Share | SharePoint Server Document Center |
|---|---|
| ACL based File Security and Effective permissions (AD only) | Authorization based Item Security with user picker, supporting AD, LDAP, and .NET Pluggable providers; Opt in email based Request Access |
| Windows Auditing | Security and Policy based Auditing, expiration and pivot reports |
| Shadow Copy User Restore (not configured by default) | User Restore with Recycle bin; 2nd Stage Site Collection Recycle bin (default) |
| Distributed File System Replication (not recommended with two way editing); WAN Throttling; RDC (Remote Differential Compression) | One Way Content Publishing paths and jobs including quick deploy; # Threads throttling and scheduling; Multi Farm Shared Services (not over WAN) |
| | Email enabled (requires configuration) |
| | Check in Check out w/ Forced Checkout |
| Snapshotted versions (not change based versions) | Version History/ Major/Minor Versions |
| File level Rights Management | File and Doc Library Rights Management Integration Policies |
| Sorting, (Grouping in Vista), Workflow Engine (requires customization) | Filtering, Grouping, Workflow (out of box), Content Types |
| File Server Resource Manager for Quotas or 3rd party | Site Collection Quotas, Built in Usage Reporting, Storage Manager |
| NTFS Compression, EFS and My Documents Redirection (client dependencies) | Database Encryption with Third Party, Backup Compression with Third Party |
| Non Transactional. No Rollback without Shadow Copies | SQL Server Transaction Logs |
After that glowing file level analysis, there are some other things to look at. SharePoint technologies carry overhead from transaction level logging and from storage in relational databases. As a result, file storage will in most cases be cheaper than storage in SharePoint, or in SQL databases in general. When evaluating the service for your end users, try to understand the value to the business.
There are some scenarios that are clearly better served by Windows Server file services. This is why you will likely still have file servers in your company. For the previous version of the product I did a blog post called "What Not to do on SharePoint" which was a list for a customer I was working with. I think these scenarios now cover that list fairly well with a few updates relating to the new platform.
File Server Scenarios:
Product Distribution (Product packages like Office) - By default Windows file servers have a great mechanism for transferring large packages. SharePoint lists work well with files under 50 MB and can be used up to 2 GB, with configuration. When you roll out Office 2007 to your end users, for example, it will most likely not be stored in SharePoint doc libraries, but the page or list for communication and the link to install it from might be.
SMS distribution point (desktop patches and hot fixes) - Hot fixes, patches, and application distribution, such as add/remove program distribution points, are much better served directly from a DFS (Distributed File System) file share. With the DFSR (replication) technologies optimized for your WAN, packages can be distributed and optimally updated across the wire. This same DFSR technology does not work well in a multi master scenario where multiple users are working on the same files, due to its lack of scheduling or remote locking abilities.
NT Backups, Backup Servers and Desktop Backups (backups) - Many corporations that use "My Documents folder redirection" with group policies may wonder if a backup of their desktops or redirection to a SharePoint site is a good scenario. This is an untested scenario. Creating mapped drives to the web folder location of your my site may work for some users, but it is not recommended to create policies around it for your corporation. With disk based backups, file storage is going to continue to be a commodity inside a corporation. Corporations that desire to migrate this scenario from file servers can choose to force users to keep their master documents in their “my site”, with offline copies in Outlook 2007 or Groove 2007. If you can get users to store their important or business critical personal files on their SharePoint my site, the arduous desktop backup task may not be as necessary (depending on the scenario/corporation).
Database Storage - (.mdb, .ldf, .ndf, .pst, .ost) - SharePoint lists are incredible ways of capturing data or displaying information from your information workers. These flat structures can now be extended with lookup columns, simulating a simplified relational storage. At this stage, SharePoint lists should not be considered a place to store relational databases, but via the Business Data Catalog, data can be displayed, indexed, or leveraged for column validation or as lookups. Databases such as SQL databases themselves are not good candidates for storage in a document library. Files that require locking or that have transaction logs are more appropriate for storage on the file system. If your data needs triggers or stored procedures you may look at workflows and events as mechanisms for this, but it is not supported to create triggers or stored procedures inside the SharePoint databases. Access 2007 databases have a number of ways that they can be published to SharePoint lists, consume lists, and display reports. In these new Access 2007 scenarios you will need to determine the best scenario for the storage. Access 2003 databases should not be stored in SharePoint document libraries where multiple users need to edit the Access database simultaneously.
Large Audio/Video and Streaming Media and other large read only archive media such as DVD and CD storage (.iso, .wmv, .ram, .vhd) - Media can easily be linked to from a SharePoint site, but often the storage is simply more costly. Unless the audio is contextual to the collaboration or workspace, or is under 50 MB, you may consider leveraging Windows Media Server for streaming the content, which will give your end users a better experience and be more manageable for those trying to distribute the media. VPCs could fall into this scenario very easily. If you have a 5 GB VPC you are trying to share with a group, a temporary location on a file server is going to be cheaper and more efficient. Inside your company you may decide to increase the default upload file size from 50 MB to 100 MB, or if you collaborate on large files you could increase this up to a maximum of 2 GB. Most companies have found that 50 MB or 100 MB has encouraged users to use the platform appropriately.
Developer Source Control - Although versions, version history, and check out are common terms for developers working on projects, the SharePoint platform is not the best fit for source control. You will likely want a source control application like Visual Source Safe, Source Depot, or a third party product. The storage may be a database or sit on the file system. Solutions can be packaged, stored, and distributed if they are small and common within a team. Visual Studio Team System has its own source control tools, but it also uses WSS for collaboration on specs, tasks, and document storage.
Batch, Command Scripts, Executables (.exe, .vbs, .cmd, .bat) - Most scripts and executables are blocked by default in SharePoint. This is for your own protection. You may find that most of these file types are blocked in email by default as well. This is to slow down the distribution of virus prone files. It is recommended to install an antivirus product such as Forefront for SharePoint for WSS 3.0 and MOSS 2007 or Antigen for WSS 2.0 and SPS 2003. With the "blocked file types" feature you can decide what you do with these files. Depending on how they are executed, via the web UI or via web folders, many executables with other dependencies may not work. For this reason, you may want to package any small applications or simply block them from being added to the SharePoint lists.
Application Server... Client Application Storage Linked Files and File Dependencies - (.lnk, .lck) A lot of this scenario has to do with client applications and how they interact with the file type. The SMB and HTTP protocols are very different and behave differently. They both have their pros and cons. For example, some AutoCAD drawings have dependencies on other files and expect a path over SMB, not an HTTP or relative path; these will break and not render properly. You may notice when loading a Visio file with a template that some template dependencies may not load (this is not very common). As well, Excel files that link cells between spreadsheets may expect a specific file path that is not there, and the link will not load properly. In the Excel example there are now new Excel 2007 publishing scenarios that enable calculations, spreadsheets, and charts to render on the server. If you have business critical data spread across multiple spreadsheets, you may consider moving it into a data warehouse, creating queries, using Analysis Services, and surfacing the results in Excel Services on SharePoint Server. Another example of dependencies is files that require an .lck lock file when in use. Users may experience problems editing these files, or the files may simply not lock while being edited. For users of files that require a lock, I recommend checking the file out, downloading it, and then uploading it and checking it in. Most of these scenarios will end up as FAQs rather than blocked file types.
Archives and Dumps (.arj, .rar, .zip, .dmp, .bak) - Since file storage is cheap and plentiful, and your SharePoint platform storage is the premium, you can make some decisions about what type of storage is leveraged for database dumps. Obviously copies of production databases can easily and cost efficiently be dumped to disk. Disk based backups are becoming more and more common for backing up your application servers. Being able to do a point in time backup and recovery very quickly from disk is important for keeping short business continuance SLAs. Tape vs. disk is the next question. Offline vs. near line vs. online is pushing near line to non existence, and with business critical data the offline is getting pushed further and further out and eliminated in some cases by remote failover. Your SLAs will drive the disaster recovery, business continuance… distance, failover and offline storage strategy.
Exception: For most archives it is cheapest for online storage to be on file storage, but for resources that need to be quickly searched and indexed a SharePoint server will provide more accessibility for a cost. For example... your legal team wants to create Exchange journaling for mails that have a legal retention placed on them. With Exchange 2007 and managed folders the emails can be archived to a list which can be used to search for emails with this requirement. As well, with records management scenarios the SharePoint lists themselves may need to be archived, but still be accessible. These online archives can be kept on cheaper disks with less frequent backup requirements, or even a 97% SLA, since they are used only very infrequently.
Small zips as well can be an efficient way to store a small package on SharePoint, and many end users know how to deal with them using the built in compression utilities in recent desktop OSs.
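To make the scenarios above concrete, here is a minimal sketch of how the rules of thumb could be turned into a per-file triage. The extension groups and the 50 MB / 2 GB figures come from the scenarios above; the function name, constant names, and the exact grouping are my own assumptions for illustration, not a SharePoint API. Media and archives are left to the size check, since small ones can reasonably live in a doc library.

```python
from pathlib import PurePath

MB = 1024 * 1024
DEFAULT_UPLOAD_LIMIT = 50 * MB       # default upload size mentioned in the post
MAX_UPLOAD_LIMIT = 2 * 1024 * MB     # configurable ceiling mentioned in the post

# Hypothetical grouping: extensions copied from the scenario headings above.
FILE_SERVER_EXTENSIONS = {
    "database": {".mdb", ".ldf", ".ndf", ".pst", ".ost"},
    "script or executable": {".exe", ".vbs", ".cmd", ".bat"},
    "linked or lock file": {".lnk", ".lck"},
    "dump": {".dmp", ".bak"},
}


def suggest_storage(filename, size_bytes, upload_limit=DEFAULT_UPLOAD_LIMIT):
    """Return (location, reason) for one file, following the rules of thumb above."""
    if not 0 < upload_limit <= MAX_UPLOAD_LIMIT:
        raise ValueError("upload_limit must be between 1 byte and 2 GB")
    ext = PurePath(filename).suffix.lower()
    # Extension-based scenarios win regardless of size.
    for reason, extensions in FILE_SERVER_EXTENSIONS.items():
        if ext in extensions:
            return ("file server", reason)
    # Everything else (including media and zips) is decided by size.
    if size_bytes > upload_limit:
        return ("file server", "larger than upload limit")
    return ("SharePoint", "collaborative file within upload limit")


if __name__ == "__main__":
    for name, size in [("plan.docx", 2 * MB), ("mail.pst", 10 * MB), ("demo.wmv", 80 * MB)]:
        print(name, suggest_storage(name, size))
```

Raising `upload_limit` to 100 MB, as some companies do, moves the 80 MB video in the example over to the SharePoint side.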
Disk Space and Cost Considerations
I have mentioned cost a few times; when comparing the systems, consider RAID 5 volumes for your SharePoint data drives and RAID 0+1 for transaction logs. Although SharePoint server storage is SQL Server, the disk I/O for the content databases is extremely low compared to Exchange drives, for example, with the exception of the search database. Many enterprises are now using RAID 5 for their content databases. The differences in backup methods between SQL backups and file server backups have cost implications as well, along with the associated software costs.
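As a rough illustration of why the RAID choice matters when comparing costs, here is a small sketch of usable capacity and cost per usable GB. The capacity rules are the standard ones (RAID 5 gives up one disk to parity, RAID 0+1 gives up half the disks to mirroring); the disk count, size, and price in the example are made-up numbers, and the function names are hypothetical.

```python
def usable_disks_raid5(disk_count):
    """RAID 5 spends the equivalent of one disk on parity."""
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return disk_count - 1


def usable_disks_raid01(disk_count):
    """RAID 0+1 mirrors a stripe set, so half the disks hold copies."""
    if disk_count < 4 or disk_count % 2:
        raise ValueError("RAID 0+1 needs an even number of disks, at least 4")
    return disk_count // 2


def cost_per_usable_gb(disk_count, disk_size_gb, disk_price, usable_disks):
    """Total spend on disks divided by the capacity you can actually use."""
    return (disk_count * disk_price) / (usable_disks * disk_size_gb)


if __name__ == "__main__":
    disks, size_gb, price = 6, 300, 200.0   # hypothetical figures
    print("RAID 5  :", cost_per_usable_gb(disks, size_gb, price, usable_disks_raid5(disks)))
    print("RAID 0+1:", cost_per_usable_gb(disks, size_gb, price, usable_disks_raid01(disks)))
```

This only covers raw disk cost; it says nothing about I/O behavior, backup software, or SQL licensing, which also belong in the comparison.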
Summary
Collaborative file shares can be replaced with SharePoint deployments. Product distribution and database storage will continue to persist as valid file server scenarios. End users will need training to understand where to save their files. For most file sharing scenarios and the most common file sizes, SharePoint lists will be the Microsoft recommended way of sending files inside the corporation, and with collaborative SharePoint extranet site deployments, it’s the way to share with partners. Most non technical end user teams, such as HR, Sales, and Marketing, can say goodbye to using file shares for file sharing. Some groups and divisions like IT SMS/Product Distribution, Data Warehousing (SQL), Media, and Development won't be saying goodbye to file servers in Windows 2003 and in code name “Longhorn”, with key scenarios leveraging cheap NTFS file storage.
Analyzing your current file servers by server or share or folder may allow you to group them by purpose. Here are some examples of common classifications: Collaborative File Sharing, Historical Archive, Media Server, Dump/Desktop Backup, Source Control Servers/Databases, Personal Storage, Product Distribution, and Application Servers.
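A first pass at that grouping can be scripted. The sketch below walks a share and totals bytes per classification using file extensions. The mapping from extension to classification is purely my assumption for illustration (real shares need owner input, not just extensions), and `bytes_by_purpose` is a hypothetical helper name.

```python
import os
from collections import Counter

# Hypothetical mapping; adjust to what your own shares actually contain.
PURPOSE_BY_EXTENSION = {
    ".doc": "Collaborative File Sharing",
    ".xls": "Collaborative File Sharing",
    ".ppt": "Collaborative File Sharing",
    ".wmv": "Media Server",
    ".iso": "Media Server",
    ".bak": "Dump/Desktop Backup",
    ".dmp": "Dump/Desktop Backup",
    ".msi": "Product Distribution",
    ".exe": "Product Distribution",
}


def bytes_by_purpose(share_root, purpose_by_extension=PURPOSE_BY_EXTENSION):
    """Walk share_root and total file sizes per classification."""
    totals = Counter()
    for folder, _subfolders, filenames in os.walk(share_root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            purpose = purpose_by_extension.get(ext, "Unclassified")
            try:
                totals[purpose] += os.path.getsize(os.path.join(folder, name))
            except OSError:
                continue  # skip files that vanish or cannot be read
    return totals


if __name__ == "__main__":
    # "." is a placeholder; point this at a mapped drive or UNC path.
    for purpose, size in bytes_by_purpose(".").most_common():
        print(f"{size:>15,d}  {purpose}")
```

A large "Unclassified" total is a sign that the share needs a conversation with its owners before anything is migrated.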
Next Step... Migration
1) Get the highest level of approval you need before you start. Keep that signed document safe. You'll need it.
2) Identify owners of the content, preferably at a higher level than individual owners or team owners. On the servers you don't migrate, try to roll up ownership to your divisions. As you look at this... take it a server at a time and break down the shares. Once you've identified your collaborative team shares and personal shares you can then work on a plan to migrate. Don't expect a 100% transfer, and I highly recommend not having the IT Pro do the migration by him or herself. The teams will know their data best, and only the active collaboration should move over, with the exception of historical HR/legal/compliance retention or corporate requirements... Whether it is 2% or 4%, you'll be surprised how little file sharing type data actually *needs* to be migrated. Co-existence for a short period of time is recommended and may be necessary.
3) Training will be the key enabler to making a good transition.
4) I highly recommend leveraging a communication plan with a very well communicated timeline. You can't over communicate; expect someone to complain that they didn't hear. Use multiple means of communication, not just email.
5) The read only option on the file shares is an excellent feature to leverage.
6) Turning off a server is another good way of ensuring people aren't using the shares. You may want to wait up to 3 months or more before decommissioning or repurposing the server.
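If you want a number to bring to the teams before that conversation, a rough sketch like the one below can estimate how much of a share is "active". It treats a file as active if it was modified within a cutoff window; the 180 day default is my assumption, not a recommendation from the product, and last-modified time is only a proxy for real collaboration. `active_fraction` is a hypothetical helper name.

```python
import os
import time

SECONDS_PER_DAY = 86400


def active_fraction(share_root, active_days=180, now=None):
    """Return the fraction of bytes under share_root modified within active_days."""
    now = time.time() if now is None else now
    cutoff = now - active_days * SECONDS_PER_DAY
    active_bytes = 0
    total_bytes = 0
    for folder, _subfolders, filenames in os.walk(share_root):
        for name in filenames:
            try:
                info = os.stat(os.path.join(folder, name))
            except OSError:
                continue  # skip files that vanish or cannot be read
            total_bytes += info.st_size
            if info.st_mtime >= cutoff:
                active_bytes += info.st_size
    return active_bytes / total_bytes if total_bytes else 0.0


if __name__ == "__main__":
    # "." is a placeholder; point this at a mapped drive or UNC path.
    print(f"{active_fraction('.'):.1%} of bytes modified in the last 180 days")
```

The teams still make the final call on what moves; this only helps set expectations about how small the active set is likely to be.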
There are many partners and ISVs as well as our own Microsoft Consulting Services in this space that can provide training, migration tools and resources for 2003 and 2007, analysis tools, PM resources, etc…
You can run multiple file server migrations in parallel once you understand the process and once adoption, buyoff, and training are all aligned. Having these rogue systems under control will make you feel better, but moving data from one unmanaged deployment to another doesn't help your business. Ensure you are migrating to a SharePoint platform that has the right level of governance, control, and management to make it a positive experience and a positive impact to the business. Please refer to the management and governance links. As some comments suggest, meta data capture and leveraging the document information panel can make a night and day difference to the search and browse experience for some content.
<UPDATE 1/4/06> Since I posted this I have noticed others have posted related content around the same time or even while I was writing this. A couple of comments... I didn't mention co-existence. Yes, you can index file shares with SharePoint (steps on 2007 with screenshots from SSA). That's a great idea for something you intend to simply stop using (read-only) while you create fresh sites, for indexing referenced archives, for large files, or for whatever reason you decide. The other recommendation, from a mail I got, was a web part that exposes the file system like a document library and even handles security on setup. I've looked at it and it wasn't far enough along for my taste. It's an interesting idea, but the security management is tough. I've also seen people create page viewer web parts to UNC paths, but I think a link to the UNC share path of the file or folder is a better and more reliable end user experience. I like adding the link content type to my doc libraries to avoid duplication when the document lives somewhere else. This would work well if the large or blocked file was on a file server. You should check out the PowerShell SharePoint project on CodePlex; it may change the way you interface with the command line.
<update>
Windows File Services Resources:
Many ask the next question about Public Folders... are they dead? Visit my recent post on that topic.