Editor’s note: Due to the Independence Day holiday in the US, we'd like to pause and celebrate one of our most popular guest posts from the past quarter. The following is a re-post of a guest post by SharePoint MVP Paul Olenick.
Overview of the new eDiscovery features available in SharePoint 2013
As you have no doubt heard, or seen for yourself by now, SharePoint 2013 represents massive leaps forward in many areas of the platform. One such area that doesn’t get a whole lot of attention is eDiscovery. This article is meant to serve as an introduction and overview of SharePoint 2013's new eDiscovery capabilities.
If you work in the legal or eDiscovery space, I don't have to tell you how critical it is to have powerful, accurate and efficient eDiscovery tools, but for those of you who don't know what eDiscovery is, I'll provide a brief overview.
eDiscovery, or electronic discovery, is the process of discovering (finding) electronically stored information that is relevant to legal matters such as litigation, audits and investigations. Though it is called eDiscovery, the process typically entails more than just the discovery. The main stages of the process are roughly:
1. Discovery – Find the relevant content
2. Preservation – Place content on legal hold to prevent data destruction
3. Collection – Collect and send relevant content to be processed
4. Processing – Prepare files to be loaded into a document review platform
5. Review – Attorneys determine which content will be provided to opposition
6. Production – Provide relevant content to opposition
The SharePoint 2013 eDiscovery functionality focusses on the first three stages.
It should also be pointed out that eDiscovery is an extraordinarily expensive process, the most expensive aspect being the fees associated with attorneys reviewing the electronic content. The better our tools for identifying relevant content (and weeding out anything non-essential), the less content is presented to the lawyers, which reduces cost. So those first three steps that SharePoint is involved in are crucial to get right and can represent massive savings.
Now those of you who have worked with SharePoint for a long time may know that SharePoint already included eDiscovery functionality.
While there were some basic eDiscovery-related features in SharePoint 2007 (such as the ability to place records on hold) a more cohesive eDiscovery story didn’t begin to emerge until the release of SharePoint 2010. With SharePoint 2010, we now had a top-tier search engine (especially for those organizations that implemented FAST Search Server 2010 for SharePoint) to help discover content. Additionally, SharePoint 2010 introduced the concept of placing and managing site-level holds, a mechanism for automatically copying eDiscovery search results to a separate repository for review and an API to develop custom solutions against these features. For more information on how SharePoint 2010 supports eDiscovery, see the following article on TechNet. http://technet.microsoft.com/en-us/library/ff453933(office.14).aspx#How
There were, however, some major limitations in SharePoint 2010’s eDiscovery solution. For example, the features mostly only applied to SharePoint content. And if a hold was placed on a site, it prevented users from continuing to work with the content. This was especially problematic when conducting internal investigations as it would alert those being investigated to the fact that they were under scrutiny. As such, those utilizing these tools in SharePoint have been eager to see what improvements were made in SharePoint 2013 eDiscovery.
The SharePoint Server 2013 eDiscovery feature set can be broken into the following components and functional areas.
· eDiscovery site template
· In-place holds (or In-place Preservation)
· eDiscovery Export
· eDiscovery APIs
The eDiscovery site template provides a central location to create and manage eDiscovery cases. A case is a SharePoint site template, created as a sub web within the eDiscovery site collection, which supports the process of discovering content across the enterprise, placing legal holds on content, filtering content and exporting it for delivery. The case site template also includes lists and libraries for collaborating on cases and keeping supporting content organized and centrally located.
An in-place hold is a mechanism for placing content (SharePoint 2013 documents, list items, pages, and Exchange Server 2013 mailboxes) on legal hold while allowing users to continue working with the content and without them being made aware of the hold. If a user edits or deletes content that has been placed on in-place hold, the content is automatically moved to a special location thus preserving the state of the content as it was at the time the hold was placed. This design decision, to only replicate data when a change has been made, limits the amount of storage needed to preserve content in its original state.
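The copy-on-change behavior described above can be illustrated with a small Python model. This is only a sketch of the idea, not SharePoint's implementation or API: `HeldSite`, `edit` and `delete` are hypothetical names invented for this example, and documents are reduced to plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class HeldSite:
    """Toy model of a site under in-place hold (hypothetical, not the SharePoint API)."""

    documents: dict                                 # live content: name -> text
    preserved: dict = field(default_factory=dict)   # copies kept by the hold

    def __post_init__(self):
        # Only content that existed when the hold was placed is protected.
        self._held_names = set(self.documents)

    def _preserve(self, name):
        # Copy the original state the first time a held item changes.
        if name in self._held_names and name not in self.preserved:
            self.preserved[name] = self.documents[name]

    def edit(self, name, new_text):
        self._preserve(name)
        self.documents[name] = new_text

    def delete(self, name):
        self._preserve(name)
        self.documents.pop(name, None)


site = HeldSite({
    "contract.docx": "original terms",
    "notes.txt": "draft",
    "logo.png": "image",
})
site.edit("contract.docx", "revised terms")
site.delete("notes.txt")

# Only the two changed items were copied; the untouched one costs no extra storage.
print(sorted(site.preserved))
```

Users keep working against `documents` as usual, while `preserved` grows only when something held actually changes, which is where the storage saving comes from.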
In-place holds can be placed either at the site or mailbox level, or alternatively, you can use query-based preservation. With query-based preservation, you can define eDiscovery search queries and only content that matches your query will be preserved.
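To make the difference between the two scopes concrete, here is a minimal Python sketch. It assumes a simple case-insensitive substring match as a stand-in for a real eDiscovery search query, and the function names and sample items are made up for illustration; none of this is the SharePoint API.

```python
def site_level_hold(items):
    """Site-level hold: every item in the site is preserved."""
    return set(items)


def query_based_hold(items, term):
    """Query-based preservation: keep only items whose text contains the term."""
    needle = term.lower()
    return {name for name, text in items.items() if needle in text.lower()}


# Hypothetical site content: item name -> text
items = {
    "deal-memo.docx": "Pricing for the Jamison account",
    "lunch-menu.docx": "Tuesday: soup and salad",
    "status.msg": "Jamison kickoff moved to Friday",
}

print(sorted(site_level_hold(items)))
print(sorted(query_based_hold(items, "jamison")))
```

The site-level hold preserves all three items, while the query-based hold preserves only the two that mention the search term.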
SharePoint 2013 enables eDiscovery users to export the results of eDiscovery search queries so that they can then be sent for review. The export feature is capable of exporting documents (including versions for SharePoint content), list items and pages as well as Exchange objects. The export tool also generates reports about the content, logs describing the export and an XML manifest which describes the exported content (including its metadata) in a format that complies with the Electronic Discovery Reference Model (EDRM).
There are a number of APIs available in SharePoint 2013 that enable customers to develop custom solutions that leverage eDiscovery functionality. I won’t go into any level of detail around programmability in this article, but suffice to say there is a model in place to create custom eDiscovery solutions. For more about SharePoint 2013 eDiscovery programming models visit the following link. http://msdn.microsoft.com/en-us/library/jj163267.aspx#SP15_eDiscoveryInSP_eDiscoveryProgrammingModel
To better illustrate how these tools and features can be used, I’m going to walk you through a typical case lifecycle from the standpoint of an eDiscovery user.
The high-level steps involved are to create a case, place legal holds, refine and filter content, export content and eventually release any holds and close the case.
For this walkthrough, I’m a member of the Litigation Support team at a company called (you guessed it) Contoso. The attorneys let me know that one of our former clients called Jamison is suing us and Contoso must present all relevant data we have to the opposition.
My first task is to create a new site for the Jamison case so I log into our SharePoint 2013 eDiscovery site. I log in using a special user ID that I only use for eDiscovery purposes. This is because in order to discover content across the enterprise, the user doing the searching must have access to everything. For obvious reasons, it is not a good idea to give a normal user account access to everything, so instead I have a separate account that I use just for eDiscovery.
When I first log into the site I see the eDiscovery Center template. This is where I go to manage existing and create new cases. On the default home page, Microsoft includes instructions on how to take advantage of the template.
After clicking “Create New Case”, I’m presented with a “New SharePoint Site” page where I can enter the name, description, URL and permissions for my new case site.
When the site has been created I’m presented with the new case site home page. The site is comprised of three sections.
1. The top section is used for finding and placing legal holds on content.
2. The bottom portion is used to refine and filter on the content until it is ready to be exported.
3. The left side of the page provides access to supporting lists and libraries for the case.
I’ll start by clicking “new item” in the eDiscovery Sets section to create an eDiscovery set. An eDiscovery set is comprised of a data source (a site, mailbox or other location), optionally a filter/query and the option of a legal hold. I add the URL of the Jamison project site in the sources area, provide a date range for the filter, select “Enable In-Place Hold” and click “save.”
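For readers who think in data structures, the parts of an eDiscovery set can be sketched as a small Python record. Everything here is an assumption made for illustration: the class, its field names, the validation rules, the URL and the dates are invented and do not come from the SharePoint object model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class EDiscoverySet:
    """Hypothetical stand-in for the fields on the eDiscovery set form."""

    name: str
    sources: list                       # site URLs, mailboxes or other locations
    query: Optional[str] = None         # optional filter terms
    start_date: Optional[date] = None   # optional date range for the filter
    end_date: Optional[date] = None
    in_place_hold: bool = False         # the "Enable In-Place Hold" choice

    def __post_init__(self):
        # Assumed sanity checks, not documented SharePoint behavior.
        if not self.sources:
            raise ValueError("an eDiscovery set needs at least one source")
        if self.start_date and self.end_date and self.start_date > self.end_date:
            raise ValueError("start_date must not be after end_date")


jamison_set = EDiscoverySet(
    name="Jamison project site",
    sources=["https://sharepoint.contoso.example/sites/jamison"],  # made-up URL
    start_date=date(2012, 1, 1),    # made-up date range
    end_date=date(2012, 12, 31),
    in_place_hold=True,
)
```

The set in the walkthrough has one source, a date-range filter, no query terms and the hold enabled.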
On the case home page, the In-Place Hold Status will indicate “Processing” for a time and eventually indicate “On Hold”.
When an in-place hold is set on a site, a special document library called the Preservation Hold Library is added to the site being preserved. After the hold is placed, if a user edits or deletes content in the site, a copy will be placed in the Preservation Hold Library. The hold also prevents anyone from deleting the site itself.
Now that the content is safely on legal hold, I can begin the process of filtering it down to just the content that we are legally required to provide. Remember, the more content that is sent to be processed and reviewed, the more our eDiscovery is going to cost, so it’s important that we’re able to filter the content effectively. With that in mind, I navigate back to my case home page and click “new item” under Search and Export.
In the New Query Item page, I provide a name for my query and I have the opportunity to add search terms and filters. The Contoso lawyers and those of the opposition have agreed that only items regarding a particular deal number (809E5C95) are relevant and have agreed that the deal number will be the only query term. So I add my query term, click search and preview the items that are returned. I can mouse over the preview items to get more details and can also use the refiners on the left hand side to filter the content down more, but in this case we have exactly what we need already.
Next I click “Export” and am presented with a number of options related to the export. Most notable is that I am able to include all versions of SharePoint content in my export.
Finally, I am given the opportunity to download the actual content from my query or just reports on the contents. In this case I click “Download Results”. The download manager loads and allows me to choose a location for the export.
The download folder includes a number of files including an export summary, a manifest (which includes all items including their metadata in a standard format), reports and logs as well as the actual content.
Where there were multiple versions of a single item, the filenames of the older versions are appended to indicate the version.
Once the case is over, I go back to the case site, click the cog and select “Case Closure”. Closing the case will remove any remaining legal holds associated with the case and prevent anyone from adding additional holds to the case.
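The effect of closing a case can be summarized in a few lines of Python. This is a toy model under stated assumptions: the `Case` class, its methods and the source names are hypothetical and only mirror the two behaviors described above (remaining holds are released, and no new holds can be added).

```python
class Case:
    """Toy model of an eDiscovery case (hypothetical, not the SharePoint API)."""

    def __init__(self, name):
        self.name = name
        self.holds = set()    # sources currently on hold for this case
        self.closed = False

    def add_hold(self, source):
        if self.closed:
            raise RuntimeError(f"case {self.name!r} is closed; no new holds allowed")
        self.holds.add(source)

    def close(self):
        # Closing releases every remaining hold and locks the case.
        released = sorted(self.holds)
        self.holds.clear()
        self.closed = True
        return released


case = Case("Jamison")
case.add_hold("jamison-project-site")
case.add_hold("jamison-mailbox")
released = case.close()
print(released)
```

After `close()`, `case.holds` is empty and any further `add_hold` call raises an error.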
That’s a very basic walkthrough of how an organization may utilize SharePoint 2013 eDiscovery and you can see it accomplishes what it is designed to do. But it’s not all good news. As with any commercial software release, there are going to be some gaps.
While in general I’m impressed with the eDiscovery story in SharePoint 2013, there are a few gaps to be aware of before investing in the technology.
First, in-place holds are only for SharePoint 2013 and Exchange 2013 content. But most customers are not on these new platforms yet, so how do you use SharePoint 2013 eDiscovery with content that resides in SharePoint 2010? The answer is that out of the box, you can do everything related to eDiscovery except place holds on that content. So we can search 2010 content from 2013, we can filter it down, export it with all of its versions and generate reports. We just can’t place holds.
Second, when a hold is placed on a site and a user edits a document that is being preserved, the original version of the document (in its state at the time the hold was placed) gets copied into the Preservation Hold Library. However, if subsequent edits are made to the same document, those additional states of the document are not captured. These types of “continuous” or “rolling” holds are necessary for some customers, so it’s important for them to understand this limitation.
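The gap is easiest to see side by side. The Python sketch below compares the behavior described in this article with what a rolling hold would capture. Both functions are hypothetical illustrations and do not reflect SharePoint code; `history` is assumed to be the list of states a document has gone through, starting with its state when the hold was placed and ending with its current state.

```python
def first_state_hold(history):
    """Behavior described above: only the state at hold time is preserved,
    and only once the document has actually been changed."""
    return history[:1] if len(history) > 1 else []


def rolling_hold(history):
    """Hypothetical rolling hold: every superseded state is preserved."""
    return history[:-1]


history = ["v1 (at hold time)", "v2", "v3 (current)"]

print(first_state_hold(history))
print(rolling_hold(history))
```

With two edits after the hold, the first function keeps only the hold-time state, while the rolling variant also keeps the intermediate `v2` state that is otherwise lost.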
Lastly, there is no way out of the box to search past versions of SharePoint content. This makes sense as it would be wildly confusing to see past versions of documents showing up in your normal search results, but would be incredibly useful (even necessary for some customers) for eDiscovery purposes.
Again, there is an API exposed for developing custom SharePoint 2013 eDiscovery solutions, so the platform can certainly be extended to fill these gaps. I have it on good authority that there are partners already looking to provide solutions to these limitations.
As for the version search, this would likely be solved with a custom search connector, and here too SharePoint provides a rich framework for building custom connectors.
Also, not really a gap or limitation, but something to be aware of is that the eDiscovery Download Manager requires .NET 4.5 on your client system.
So, to recap, SharePoint 2013 provides vast improvements in the eDiscovery story which includes a new eDiscovery site template, the ability to place in-place holds on SharePoint 2013 and Exchange 2013 content, an export feature to download reports and content (including versions for SharePoint content) and an API to develop custom eDiscovery solutions. And it all leverages SharePoint 2013 search which is truly a great enterprise search engine combining the best of SharePoint Search and FAST Search Server 2010 for SharePoint.
This really does represent a lot of investment and effort on Microsoft’s part and it shows. I would encourage anyone interested or involved with eDiscovery to evaluate the features. Just keep in mind the gaps mentioned above so that you’re going into it with eyes open and know, depending on your scenario, the overall solution may require some customization.
About the author
Paul Olenick (SharePoint MVP, MSFT V-TSP, MCT) is a Principal Consultant for Arcovis where he leads SharePoint and Enterprise Search engagements for large organizations across multiple verticals including legal, life sciences, financial, utilities, retail, non-profit, and more. Paul has been dedicated exclusively to SharePoint since 2006 and FAST Search Server 2010 for SharePoint since its beta release in 2009. He has helped dozens of clients solve business problems by leveraging SharePoint and Enterprise Search and shares his experiences with the greater community by speaking at events, contributing to books and blogging at http://olenicksharepoint.com. Follow him on Twitter.
About MVP Mondays
The MVP Monday Series is created by Melissa Travers. In this series we work to provide readers with a guest post from an MVP every Monday. Melissa is a Community Program Manager, formerly known as MVP Lead, for Messaging and Collaboration (Exchange, Lync, Office 365 and SharePoint) and Microsoft Dynamics in the US. She began her career at Microsoft as an Exchange Support Engineer and has been working with the technical community in some capacity for almost a decade. In her spare time she enjoys going to the gym, shopping for handbags, watching period and fantasy dramas, and spending time with her children and miniature Dachshund. Melissa lives in North Carolina and works out of the Microsoft Charlotte office.