Hello everyone,

This is the first part of my mini-series on how to create a custom file classification module that can be run by the File Server Resource Manager (FSRM), which ships with Windows Server 2008 R2. (If you missed the other parts, you can return to the index by clicking here.)

 

Introduction

What is the File Server Resource Manager?

Quoted from http://technet.microsoft.com/en-us/library/dd758765(WS.10).aspx

Most applications manage files based on the directory they are contained in. This leads to complicated file layouts that require much attention from administrators and lead to frustration for users.

To reduce the cost and risk associated with this type of data management, the File Classification Infrastructure uses a platform that allows administrators to classify files and apply policies based on that classification. The storage layout is unaffected by data management requirements and the organization can adapt more easily to a changing business and regulatory environment.

Files can be classified in a variety of ways - today, classification is commonly performed manually. The File Classification Infrastructure in Windows Server 2008 R2 allows organizations to convert these manual processes into automated policies. Administrators can specify file management policies, based on a file’s classification, and automatically apply corporate requirements for managing data, based on business value. They can easily modify the policies and use tools that support classification to manage their files.

 

To get a better overview of how FSRM works, I recommend having a look at the following PowerPoint presentation: http://ecn.channel9.msdn.com/o9/pdc09/ppt/SVR02.pptx

Here is the information that matters for us:

  • There are two ways to classify documents:
    1. Get/Set Property API (this is what the customer used)
    2. File Classification Infrastructure Extensions

    Windows Server 2008 R2 already comes with two FCI extensions:

    1. Folder Classifier
      1. assigns properties according to file location
    2. Content Classifier
      1. assigns properties according to string / regex matching in the file's content

    Both of them can assign static values only. This is a problem if, for example, you want to use the name of the file that you are classifying as the value of a property. Luckily, you can implement your own extension.
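To make the limitation concrete, here is a minimal Python sketch. It is not FSRM code: the function names, paths, and property values are all hypothetical. The first two functions mimic, conceptually, what the built-in classifiers do (return a fixed value when a folder or a regex matches), while the third derives the value from the file itself, which is the kind of thing only a custom module can do.

```python
import re
from pathlib import PureWindowsPath


def folder_rule(path, folder, static_value):
    """Folder-style rule: return the fixed value if the file lives under `folder`."""
    # PureWindowsPath compares case-insensitively, like Windows paths do.
    if PureWindowsPath(folder) in PureWindowsPath(path).parents:
        return static_value
    return None


def content_rule(text, pattern, static_value):
    """Content-style rule: return the fixed value if the regex matches the text."""
    return static_value if re.search(pattern, text) else None


def file_name_rule(path):
    """Custom-module-style rule: the value is computed from the file itself."""
    return PureWindowsPath(path).stem
```

In the first two cases the value is decided when the rule is written; in the third it is only known once the file is being classified.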

    To do that, we first have to implement the module itself. Classification modules are COM servers, so I created a .NET class library in C# that is exposed through COM interop. (Windows Server 2008 R2 comes with .NET Framework 3.5, but if you have installed .NET Framework 4.0 you can use that as well.)

    Once done, we need to register the library as a COM server and additionally tell the File Server Resource Manager about its existence.

     

    The Classification Flow

    To create a custom classification module you need to implement the IFsrmClassifierModuleImplementation interface. (See here for a description of the interface's members: http://msdn.microsoft.com/en-us/library/dd878721(VS.85).aspx)

    During the classification process, the interface's methods are called in the following order:

    1. UseRulesAndDefinitions - Called at the start of a classification session. (Later calls refer to rules only by their GUIDs. Should you need information about the rules at a later point, this is a good moment to cache it.)
    2. For each file:
      1. OnBeginFile - This is where you are handed the property bag with information about the file that is currently being classified. (It is the source of the dynamic information that you can use to set the properties' values.) It is called once per file.
      2. For each rule:
        1. DoesPropertyValueApply - This will only be called when IFsrmClassifierModuleDefinition.NeedsExplicitValue is false. In this case you can only decide whether the static value that was set when the rule was created applies to this file or not.
        2. GetPropertyValueToApply - This will only be called when IFsrmClassifierModuleDefinition.NeedsExplicitValue is true. In this case you can return any value you would like to assign to the property.
      3. OnEndFile - This will be called once per file and allows you to execute clean-up code, should it be necessary.
    3. OnUnload - Called when the classification has finished and the module is unloaded.
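The call order above can be sketched as a small Python simulation. This is only an illustration under stated assumptions, not the real COM interface: `RecordingModule`, `run_session`, the dictionary-based property bag, and the `suffix` rule field are hypothetical, and the `module_supplies_value` flag merely stands in for the branch that FSRM derives from NeedsExplicitValue.

```python
class RecordingModule:
    """Hypothetical stand-in for a classification module that logs each call."""

    def __init__(self):
        self.calls = []           # names of the methods invoked, in order
        self.rules = {}           # rule details cached at session start
        self.current_file = None  # property bag of the file being classified

    def use_rules_and_definitions(self, rules):
        # Later calls only receive a rule ID, so cache the details now.
        self.rules = dict(rules)
        self.calls.append("UseRulesAndDefinitions")

    def on_begin_file(self, property_bag):
        # Keep the per-file information for the rule callbacks below.
        self.current_file = property_bag
        self.calls.append("OnBeginFile")

    def does_property_value_apply(self, rule_id):
        # The rule carries a static value; the module only answers yes or no.
        self.calls.append("DoesPropertyValueApply")
        return self.current_file["name"].endswith(self.rules[rule_id]["suffix"])

    def get_property_value_to_apply(self, rule_id):
        # The module computes the value itself, here from the file name.
        self.calls.append("GetPropertyValueToApply")
        return self.current_file["name"]

    def on_end_file(self):
        # Per-file clean-up.
        self.current_file = None
        self.calls.append("OnEndFile")

    def on_unload(self):
        self.calls.append("OnUnload")


def run_session(module, rules, files, module_supplies_value):
    """Drive the module in the order listed above and collect assigned values."""
    results = {}
    module.use_rules_and_definitions(rules)
    for property_bag in files:
        module.on_begin_file(property_bag)
        for rule_id, rule in rules.items():
            if module_supplies_value:
                value = module.get_property_value_to_apply(rule_id)
            elif module.does_property_value_apply(rule_id):
                value = rule["value"]
            else:
                continue  # the static value does not apply to this file
            results[(property_bag["name"], rule["property"])] = value
        module.on_end_file()
    module.on_unload()
    return results
```

Running `run_session` with a `RecordingModule` and then inspecting its `calls` list shows the nesting: one session-level call at each end, one begin/end pair per file, and exactly one of the two value methods per rule in between.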

     

    In the next part I will show you a simple implementation of this interface.

    Cheers,

    Helge Mahrt