SharePoint made its name with collaboration tools that integrate very well with other Microsoft products and are among the foremost tools for user adoption. With SharePoint 2007, Web Content Management capabilities were integrated from CMS, and it was also the start of more Document Management and Records Management functionality. SharePoint 2010 now scales to the demands of large enterprises and brings the value of social computing to them. After all, a company’s most important asset is its employees, and SharePoint is all about connecting people.
Historically, companies often had separate products for collaboration, documents, records, and web content. These tools may have been integrated with one another, but the story was rarely great, especially at upgrade time. ECM was made to improve the findability of documents – and it started with metadata and classification.
Complete, enterprise-wide success in ECM has, however, been limited. Simply put, a person typically enters little metadata (if any) when creating a document – often just the default values. The other typical classification challenge was asking users which file plan number applied. While I’m greatly oversimplifying, even that isn’t done properly across a complete enterprise. Typical successes are with departments or small teams that handle records from the get-go, not with documents that you iterate over until they become final.
It’s also not intuitive for all users to work on documents on their computer or file shares, and then manually bring them to a separate store meant for “final documents only”. In some situations, customers use these stores for “published/official documentation”. This makes the problem much more obvious, as it’s typical for some types of documents to have multiple published versions (e.g., you update a document for a new project phase) – and users forget to update the published store.
Even when considering only SharePoint 2010, the first challenge is understanding these workloads and how closely they are related. This series of posts will attempt to consolidate information into better guidance for these multi-workload scenarios: how they work with SharePoint, and how they work when integrating with other ECM products.
While we had some core elements of ECM in SharePoint 2007, we leaped ahead with SharePoint 2010. Features like the Managed Metadata Service, with its Term Stores and Content Type Hubs, make it easy and efficient to control and share taxonomies; the Content Organizer and multi-stage retention capabilities are efficient; and in-place records are a new and welcome component for business collaboration spaces. All of this relies on a serious .NET and Workflow Foundation platform, and can draw on social components such as the Activity Feeds if necessary.
If you hadn’t thought of SharePoint in the past for ECM, and more so if you struggled with implementing ECM with other products, now’s the time to take a serious look at SharePoint 2010. Microsoft Consulting Services can definitely help you make the best use of your platform, enhance adoption, and reduce costs.
Let’s go ahead with some common definitions.
The Association for Information and Image Management [AIIM] currently defines ECM as:
“Enterprise Content Management (ECM) is the strategies, methods and tools used to capture, manage, store, preserve, and deliver content and documents related to organizational processes. ECM tools and strategies allow the management of an organization's unstructured information, wherever that information exists.”
According to Wikipedia, this statement was last revised in early 2010. I’d say they are rightfully revising these statements in order to include new ways of sharing information. Previously, ECM was defined as the technologies rather than the strategies.
In the field, part of the challenge is that companies try to purchase an ECM product and hope they’ll simply install it and content will be managed. As stated, ECM is not a product; it’s a strategy, which may require a few products to fulfill depending on the choices made. That strategy must include the people factor, from both an adoption and a usability perspective. The old approach of “These 12 metadata fields are all required and everyone will fill them in according to their file plan” didn’t work. It’s time to look at a strategy that captures this information from the document’s context with minimal user input.
Let me give you an example of contextualizing metadata for a document: you create a project workspace for a single specific project. This project will likely define most of the information that you would otherwise have asked for on each document. A strategy that asks for this information a single time per project and automatically tags all related documents from their first version is much more efficient. The same goes if you need to augment it with information coming from the user’s context.
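To make the idea concrete, here is a minimal Python sketch of contextual tagging. It is not SharePoint code: the `Workspace` and `Document` classes, the `create_document` helper, and every field name and value are hypothetical, chosen only to illustrate metadata being entered once at the project level and inherited by each document from its first version.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    """A project workspace whose metadata is entered once, at creation."""
    name: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Document:
    title: str
    metadata: dict = field(default_factory=dict)


def create_document(workspace, title, user_context=None, **explicit):
    """Create a document that is tagged from its context.

    Precedence, lowest to highest: workspace defaults, the user's
    context, then anything the author typed explicitly.
    """
    tags = dict(workspace.metadata)   # copy, so the workspace is untouched
    tags.update(user_context or {})
    tags.update(explicit)
    return Document(title=title, metadata=tags)


# Hypothetical project: the metadata is asked for a single time.
project = Workspace(
    name="Data Center Move",
    metadata={"project_code": "PRJ-0042", "client": "Contoso", "file_plan": "4.2.1"},
)

# The author only supplies a title and a document type.
doc = create_document(
    project,
    "Kickoff minutes",
    user_context={"department": "IT Operations"},
    doc_type="Meeting Minutes",
)
print(doc.metadata)
```

The point of the precedence order is that users can still override a value when they need to, but the default path requires no typing at all.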
Another important aspect is that ECM covers all types of content, not just documents. Content can take the form of imaging, physical records, notes, emails, voicemails, meetings, blogs/wikis/other web content, IMs, etc. I’ve seen many implementations where companies only included documents due to product limitations. While that’s a good start, make sure that it fulfills your ECM and eDiscovery requirements.
The AIIM currently defines Document Management as:
“Document management, often referred to as Document Management Systems (DMS), is the use of a computer system and software to store, manage and track electronic documents and electronic images of paper based information captured through the use of a document scanner.”
Typical features of Document Management are:
Historically, DM systems were built as a hierarchical, tree-based structure of folders and documents. In my experience, these systems were mostly implemented within specific departments, such as Legal and HR, that had a process in place for documents. They also typically didn’t go for collaboration or social. There will likely always be some types of documents that won’t require social or collaboration, but as we find new ways to share and participate, the number of such documents will shrink quickly.
What’s important here is that there should be multiple DM repositories in any enterprise. Some will be collaborative, some won’t. Your Solution Architecture will determine the repositories you will have such as collaborative, financial statements, audio/video archives, and web content (these are merely examples).
The AIIM currently defines Collaboration as:
“Collaboration is a working practice whereby individuals work together to a common purpose to achieve business benefit.”
I think this statement is vague on purpose. Collaboration assists users in working together on content and deliverables. The final deliverable is typically either web content or a document, but the collaboration features also capture the other information shared during the process. The key thing here is that Collaboration is scenario focused. Rolling out SharePoint with blank Team Sites typically doesn’t respond to a scenario. Look for examples like these:
As you can see from these samples, Document Management is really a part of collaboration. After all, checkout/check-in, version control, metadata, and workflows are all forms of collaboration – BUT so are others. This is important when you consider integrating other ECM products, as the way they do this integration will impact your collaboration and social capabilities – or it will affect the user experience, and thus adoption and employee retention.
Social can be viewed as an extension of collaboration functionality where we move towards openness and freedom for users to author content. Quoting the AIIM again, social software should have:
…and…
Personally, I’m buying this, as I can see that content is more up to date when it’s freeform. Historically, with documents, we had to rely on the few individuals who wrote them, and they tended to be too busy. While it’s open, there are still controls in place before an update is approved, but this allows people who are experts in a specific section of a document to contribute there. It also allows people who identify an error (such as something that isn’t true anymore) to correct or complement the information. When working with documents only, you either don’t have the ability to change this, or you have to go through the author, who may not be there anymore or who may be too busy to update the documentation.
SharePoint Social capabilities are ultimately the My Sites coupled with the Activity Feeds and Tags. Think of the Activity Feeds as a mix of Facebook wall and Twitter – with a business purpose. The core framework is very solid. You can look at NewsGator’s Social Sites for SharePoint 2010 to see what you can do with the out of the box framework but with a much improved GUI. They merge the activity feed with community sites and allow for asking questions to experts – and so much more.
In content management, Social can also take the form of your company’s information on networks such as Facebook and Twitter – you probably need to be there, but that content also needs to be prepared and recorded. Your ECM strategy will eventually have to plan for this and define a way to be both legally compliant and fluid enough that you can respond fast on these external networks.
The AIIM currently defines Records Management as:
“Enables an enterprise to assign a specific life cycle to individual pieces of corporate information from creation, receipt, maintenance, and use to the ultimate disposition of records. A record is not necessarily the same as a document. All documents are potential records, but not vice versa. A record is essential for the business; documents are containers of "working information." Records are documents with evidentiary value.”
Historically, Document and Records Management have been done together, since the hierarchical, tree-based structure of folders and documents was a convenient way to apply records classification directly. Products were built that way, and you can easily get a two-in-one. However, user adoption and proper metadata entry have been a big challenge for these solutions.
What I particularly like about the definition is that “Records are documents with evidentiary value”. I’ve often been confronted with customers who believe that all documents are records. When you engage in the discussion and show examples of documents that are unlikely to become records (e.g., draft documents, the document you create for an office event like Christmas, etc.), they’ll change to all ‘enterprise documents’, with or without a clear indication of how or when a document becomes “enterprise”. Things can go in circles for quite some time.
In the end, a record starts with a human intervention. You define it as a final document by setting a property/metadata, by right-clicking on it, etc. You might be able to automate to some level, such as automatically declaring the last published versions of all documents in a project workspace as records when you close the project. The action of closing the project is likely manual as well, but it could be automated through the project end date. Last, you may have repositories that only contain records, in which case the manual action is the user navigating there to drop the document (which was likely created and collaborated on elsewhere) – unless it’s an automated system that puts the document there.
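The project-close automation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than how SharePoint implements it: `Version`, `ProjectDocument`, `last_published`, and `close_project` are hypothetical names, and the rule modeled is simply "once the project end date has passed, the last published version of each document is declared a record".

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Version:
    label: str        # e.g. "1.0" for a published version, "1.1" for a draft
    published: bool


@dataclass
class ProjectDocument:
    title: str
    versions: List[Version] = field(default_factory=list)
    record_version: Optional[str] = None   # label of the version declared a record


def last_published(doc):
    """Return the most recent published version, or None if there is none."""
    for version in reversed(doc.versions):
        if version.published:
            return version
    return None


def close_project(documents, end_date, today):
    """Declare records once the project end date has passed.

    Returns the titles of the documents that were declared as records.
    Documents that were never published are left alone.
    """
    if today < end_date:
        return []
    declared = []
    for doc in documents:
        version = last_published(doc)
        if version is not None and doc.record_version is None:
            doc.record_version = version.label
            declared.append(doc.title)
    return declared


docs = [
    ProjectDocument("Project Charter", [Version("1.0", True), Version("1.1", False)]),
    ProjectDocument("Scratch notes", [Version("0.1", False)]),
]
declared = close_project(docs, end_date=date(2010, 6, 30), today=date(2010, 7, 1))
print(declared)
```

Note that the draft "1.1" of the charter is ignored: only the last published version carries evidentiary value in this model, and a document that never reached a published state does not become a record at all.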
On a more practical level, you would define the document templates for a project workspace and get elements such as:
By planning ahead, you can design your workspaces so that this is automated for users. When they go into a meeting workspace and create a new document, it’s automatically a Meeting Minutes document and will inherit metadata from the site and the meeting (after all, you are creating the meeting for specific reasons, which will likely translate to some metadata for the meeting minutes). All of the official document types are readily available in the Project Documentation library. Site Owners may add other document libraries for their collaboration requirements, and these will fall under the “Unspecified document types catch all”.
The nice thing about this quick sample is that all social capabilities are enabled when interacting with information and documents at the project level. This is where it has value from a people perspective. When the official documentation becomes a record, it stays in the workspace and shouldn’t be moved right away – this helps on the user side, as they are still working on the project and want a single place for all of the project’s documentation. The project’s collaboration workspace will eventually close out. It’ll likely become read-only when the project ends, and may simply be deleted a few years after the project’s end. All official documents will have been moved to the archive before the site’s deletion.
Records Management isn’t an IT project. It sounds silly, but I’ve seen it treated as one. While IT facilitates the technology piece, and may have some sort of architect role close to information management, Records Management needs to be business driven, together with the legal staff. To make it possible, you’ll probably have a general strategy and then break it down into multiple small projects to define the policies. The key factor is smaller projects – trying to do a whole large enterprise in a single shot doesn’t work. While IT will still drive the general technology platform, each business unit must drive its own information requirements (possibly with the help of architects coming from the IT department).
Records Management has always been about policies, but one of its big requirements in the past was findability, or rather the ability to retrieve published documentation. Search engines have evolved in such a way that findability is no longer as much of a requirement for Records Management, but eDiscovery is now a major driver for RM, and that is our next subject.
Citing the AIIM again, eDiscovery is defined as:
“Discovery is the term used for the initial phase of litigation where the parties in a dispute are required to provide each other relevant information and records, along with all other evidence related to the case.”
As such, eDiscovery is for all electronic information, not just documents. As I mentioned in the ECM section, it can take many forms such as imaging, physical records, notes, emails, voicemails, meetings, blogs/wikis/other web content, IMs, etc. This is another reason why a document-only strategy will not lead to success.
At this level, SharePoint 2010 is designed to support compliance but may or may not be your main tool for eDiscovery. This will depend on your email/IM archival strategy (amongst other things). You do have the ability to store the archives of these systems closely with SharePoint so that you can lead eDiscovery with SharePoint, but you should define your strategy first. How does your legal counsel work in these events? Should they hold the records directly, and/or should they copy what they found into a specialized team site for the litigation? How do they identify records? How do they audit and report back the actions taken on records?
This article provides starting information on Records & eDiscovery for SharePoint: http://msdn.microsoft.com/en-us/library/ee557329.aspx.
When SharePoint 2007 came out, it had merged the ‘old’ Microsoft Content Management Server (MCMS) into the platform, where it became the Publishing features. Typical SharePoint environments in the past mostly used either Publishing or Collaboration features, and often in separate environments even for Intranets. SharePoint 2010 democratized the Publishing features by making them much more convenient to use – its approach is close to that of Wikis. Let me break down what the Publishing features contain:
If you go back to my “Document Management explained” section, you will notice that the first few items are exactly the same but refer to Web content rather than documents. That’s because the only difference is the format presented to the user, and that’s all it should be for the most part.
Since publishing features are just like documents but in web format, plan ahead! Ask yourself how your information should be consumed! If all the users are going to consume it on a web portal, don’t go creating documents just to transform them – it’s more complex and doesn’t add value. With SharePoint 2010, you can declare this web content as records just as you can documents.
And looking at social features, they all have a web content piece. The new thing is that they are also integrated with our devices such as smartphones. I can definitely see a future for an enterprise Twitter-like capability that could be used from your business productivity suite (Office), the web, your IM (Lync), and SharePoint. Publishing merged with social features definitely still has a future!
Web Content can also be in other formats such as blogs, wikis, or even a simple note board – at some level, they are all essentially a textbox that allows you to enter content. This can lead to some interesting discussions with large enterprises that haven’t explored Web 2.0. I’ve seen customers that added blogs in a hurry because “it’s popular and we have to be social”, but weren’t accepting comments!
In short, you should have a strategy for how blogs & wikis are used and presented, not for how to block them. They are a format, not a productivity enemy, and in fact, they are a productivity helper. In past years, some more conservative companies were adamant about blocking blogs & wikis because employees shouldn’t write about personal stuff. The interesting part is that employees are intelligent enough to write their personal views on company policies or products in their company blog environment; if they want to share their weekend, they’ll go on Facebook, not the company blog. The two primary reasons are that posts are identifiable and that people want to write so that they can either be recognized or make things better.
SharePoint 2007’s features were superb and came out at a great time. Most document management systems were used either for specific departments or as the archive to which employees had to manually copy their files once they deemed them final. This left a vortex of files stored en masse on file shares, which offer limited (any?) features. These files are unmanaged and have been piling up without retention forever.
With SharePoint’s ability to easily create workspaces as needed, it was a slam dunk. Integration with the Office client, built in without a separate deployment, was a big plus. And making all of this available through a web interface made it convenient. Traditional content management vendors were sure to point out where SharePoint was missing features, or how theirs were better, but the reality is that SharePoint is more than a good-enough all-around platform.
SharePoint’s integration is top-notch, and it is a key piece of the Microsoft infrastructure. Active Directory? Check. Exchange? Check. Forefront Security? Check. Configuration Manager? Check. Monitoring? Check. Data Protection? Check. Lync? Check. Office? Check. Forefront Identity Manager? Check. Beyond that, these integrations make AD, Exchange, Office, and Lync more relevant! All companies struggle with maintaining employee information (the kind the employee should update) – SharePoint does it and writes back to the AD store. Need to manage distribution lists for communities? Plug in Exchange and Forefront Identity Manager and you’ve got it on all fronts. You collaborate, and it goes to your activity feed, which can be viewed through Lync or Outlook!
This is definitely a better-together story. SharePoint connects people, not just information. That’s where it won in user adoption, and that’s where the competition has been lacking. SharePoint is also an all-around platform, not just a niche one. Can you find a better product than SharePoint for any particular aspect? Probably, and you can go ahead and debate what is still missing anyway. The reality is that products shouldn’t be thought of in silos anymore. Integration costs are staggering, and integrations rarely deliver what you want.
The last ECM magic quadrant from Gartner (http://www.gartner.com/technology/media-products/reprints/microsoft/vol14/article8/article8.html) shows how much Microsoft has evolved in this sector:
Ahh, the famous 100 GB limit or 2,000-item list limit. Newsflash: these were never hard-coded limits or break points. They were recommended software boundaries, or supported and tested limits, and they should be considered in context. First off, the 100 GB limit: it’s a recommendation per content database for 2007, due to recovery times. With 2010, the current limits are 100 GB per site collection, 200 GB per content database, and 1 TB for mostly-read scenarios (e.g., a Records Center). If you need a single site with more data, consider using RBS and plan your disaster recovery accordingly.
Second, the 2,000 limit. That’s now 5,000 by default. This isn’t the limit for a list or library; it’s per view! This is simply because you will start having performance issues at some point if you keep retrieving all your data. You wouldn’t see these issues with a traditional system, simply because either you didn’t have enough data, or you were able to plan ahead and make sure you weren’t retrieving too many items. With SharePoint, end users have more flexibility, and it comes with responsibilities. Unfortunately, some companies didn’t plan ahead and let their users do too much. This can be easily mitigated through rapid training and governance.
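The difference between a per-list limit and a per-view limit is easy to show with a toy model in Python. This is a deliberately simplified sketch, not SharePoint’s behavior: `query_view`, `ThresholdExceeded`, and the sample library are hypothetical, and the model only counts the items a view would return, ignoring how the real product executes the underlying query.

```python
LIST_VIEW_THRESHOLD = 5000  # default per-view figure cited above


class ThresholdExceeded(Exception):
    """Raised when a single view would return too many items."""


def query_view(items, predicate=None, threshold=LIST_VIEW_THRESHOLD):
    """Return the items matching the view filter, refusing oversized results."""
    matches = [item for item in items if predicate is None or predicate(item)]
    if len(matches) > threshold:
        raise ThresholdExceeded(
            f"view would return {len(matches)} items (threshold {threshold})"
        )
    return matches


# A library holding 20,000 documents: well past 5,000, and perfectly fine.
library = [{"id": i, "year": 2005 + i % 6} for i in range(20000)]

# An unfiltered "show me everything" view is what gets refused.
try:
    query_view(library)
except ThresholdExceeded as err:
    print("Unfiltered view refused:", err)

# A view filtered on a piece of metadata stays under the threshold.
recent = query_view(library, lambda doc: doc["year"] == 2010)
print(len(recent), "items returned by the filtered view")
```

The library itself is never the problem here; the view design is. That is why training and governance around views and metadata go a long way.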
The actual physical limits are more difficult to discern because there are many facets to them. For example, you won’t design your repository the same way for a video archive, financial statements, or project documents – they don’t have the same size, and thus a different limit will be in play. For file size, you can consider RBS, multiple data files in SQL, or even the Content Organizer to distribute the load across multiple site collections (it could even be a tenant). I’m confident that SharePoint can play in the hundreds of terabytes of content – you just need a lot more planning for that size than for a WCM portal!
Like any system, SharePoint should be planned. Governance should be done and re-evaluated on a regular basis as new requirements are identified. Active monitoring should be done to assess the environment and identify upcoming pain points before they happen. Companies have to find the right mix between flexibility and control by providing capabilities to the end users, but also sufficient boundaries and training so that it won’t negatively affect other users.
SharePoint can definitely handle larger ECM implementations – but it requires proper planning and adequate resources, just as any other software providing ECM capabilities would. The challenge has been that SharePoint is easy for anyone to use, often with little governance and monitoring; as a result, problems are identified late and are more difficult to resolve. Also, the Product Group has been very transparent, with articles such as the Software Boundaries and Enterprise Content Storage Planning that describe the supported limits. ALL products have limits somewhere; Microsoft simply chose to write them down in order to make it easier to design your solution. I’m confident that the Product Group will always look at ways to raise these supported limits.
If you are missing a small piece for a key organizational requirement, there is likely a 3rd party that adds it to SharePoint, thanks to its rich platform. Competitors will argue that they have this built in. To be honest, the reality of content management systems is that they are a patchwork of merged companies. A lot of these acquisitions may have been brought under the same branding umbrella, but they aren’t fully integrated yet either.
For example, look at OpenText, one of the strong ECM players: they had Livelink, which is now rebranded as OpenText ECM Suite, but the suite also includes Hummingbird, Captaris, RedDot, and Artesia. These were either competitors or add-on software that were merged into their ECM suite. When you want different segments of their ECM suite, the cost goes up with it too. The challenge comes with making sense of the licensing and patching – it’s not easy.
That’s what I find interesting, because 3rd party companies adding to SharePoint will adhere to best practices from the get-go, and their products will look just as much a part of the product. They have to do this to stay relevant as the SharePoint platform evolves. When looking at augmenting SharePoint features, the buy vs. build debate is important. There are some very good partners for things like workflows, social, governance reporting, metadata-driven permissions, and many more. They all share one thing in common: they augment SharePoint through what its underlying platform offers – that is definitely more integrated than a product acquisition sharing the same name. It also shows how rich that underlying platform (.NET and Workflow Foundation) is.
And last, it also allows these expert partners to update as frequently as necessary. Take the Social partner NewsGator, for example: they can make 10-12 releases between each SharePoint release. That matters to them, as the Social space changes rapidly (Twitter had 400k tweets per quarter 3.5 years ago; it had 65 million per day in June 2010. Back when planning for SharePoint 2010 started, Twitter wasn’t much on the radar). SharePoint did create an awesome Activity Feed framework though, and NewsGator filled the GUI gap and will continue evolving it. The same goes for other things like RBS or Workflows.
I hope you found this article practical for defining the general concepts of ECM & SharePoint. SharePoint 2010 is a great product and will be even better in the ECM space with future updates and releases. It combines ease of use, adoption, and strong ECM capabilities – and it also brings a solid Microsoft (Office, Active Directory, Windows, etc.) and partner ecosystem.
The upcoming parts will discuss how to plan these multi-workload environments, how we integrate with other ECM products, and what additional value the Enterprise SharePoint offering gives you with no additional software cost.