Ever since I started working with filters, I've seen numerous questions along the lines of "What does proper validation of an IFilter mean? What tests should we run, and how do we execute them?" Hence, it's only appropriate that we publish a document detailing our rigorous test procedure so that everyone targeting components at MS Search products can benefit from it.
Disclaimer: The following list presents only a subset of the testing methodologies we apply at MS Search and is by no means meant to be a quick recipe for weeding out ALL security vulnerabilities in your filter. The list is meant to provide an overview of the issues one should think about while implementing and testing filters.
----------------------------------------------------------------------------------------------------------------
A. Architectural Considerations : - COMPLIANCE REQUIRED
Filter does not require client installation.
Filter is free from dependencies during runtime.
Filter is free from dependencies during compile time.
Filter DLL is monolithic.

1. The filter DLL does not require the client to be installed on the indexing machine.
2. The filter DLL does not make references to other binaries during compile time.
3. The filter DLL is monolithic and self-contained, without any other external dependencies.
For an overview of the problems caused by non-monolithic DLLs, please see: http://blogs.msdn.com/ifilter/archive/2006/11/20/breaking-the-monolithic-filter-dll.aspx
B. Threading Model: - COMPLIANCE REQUIRED
Filter supports "Both" threading model.
Filter supports "Free" threading model.

The filter's threading model must be marked as either "Both" or "Free" under:
HKEY_CLASSES_ROOT\CLSID\{GUID}\InprocServer32
HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID\{GUID}\InprocServer32
We recommend using the "Both" threading model. An object that is marked with a threading model of "Both" takes on the threading model of the thread that created it. Marking the threading model as "Both" requires the filter to be thread-safe.
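The registry check above can be scripted. The following is a minimal sketch, not an official tool: it assumes the threading model is stored in the "ThreadingModel" value of the InprocServer32 key and that the comparison can be done case-insensitively. The function names are hypothetical, and the registry read only works on Windows.

```python
ALLOWED_MODELS = {"both", "free"}


def is_allowed_threading_model(value):
    """Return True if the registry value is "Both" or "Free" (case-insensitive, by assumption)."""
    return isinstance(value, str) and value.strip().lower() in ALLOWED_MODELS


def read_threading_model(clsid):
    """Read ThreadingModel from HKEY_CLASSES_ROOT\\CLSID\\<clsid>\\InprocServer32 (Windows only)."""
    import winreg  # standard library module, available only on Windows

    path = "CLSID\\" + clsid + "\\InprocServer32"
    with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, path) as key:
        value, _value_type = winreg.QueryValueEx(key, "ThreadingModel")
    return value


# Usage on the indexing machine (replace the placeholder with your filter's CLSID):
#   model = read_threading_model("{GUID}")
#   print(model, is_allowed_threading_model(model))
```

Keeping the validation separate from the registry read means the rule itself can be exercised on any machine.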
C. OS Versions Supported:
D. Backwards compatibility with SPS2003 :
Filter works with SPS 2003.

1. Register the filter DLL with SPS 2003.
2. Create a content source with your documents, then crawl and query.
E. Loading Mechanisms : - COMPLIANCE REQUIRED
Filter supports IPersistStream
Filter supports IPersistStorage
Filter supports IPersistFile

The filter needs to support all three loading mechanisms for backward and forward compatibility reasons. We recommend trying to load via IPersistStream and falling back to IPersistStorage or IPersistFile only if IPersistStream is not supported. The IFilterExplorer can be used to check which loading mechanisms are supported: http://www.citeknet.com/Products/IFilters/IFilterExplorer/tabid/62/Default.aspx
F. Dedicated support for 64 bit platforms :
Dependency Walker satisfied for 64 bit filter DLL.

For 64 bit platforms, there should be no dependency on 32 bit binaries, i.e., no WOWing applications. Run Depends.exe (Dependency Walker) to check that all dependencies are satisfied and prevent runtime errors. Known Issue: A dependency on MSJAVA.dll shows up in red in Dependency Walker. You can safely ignore this.
G. Code Coverage:
We recommend at least 70% code coverage. This can be easily profiled using VS 2005 Team System.
H. IFiltTst - Consistency, Legitimacy and Illegitimacy tests:
Consistency test with pass rate > 95%
Legitimacy test with pass rate > 99%
Illegitimacy test with pass rate > 90%

IFilttst can be used to run the following tests:
Consistency Test: The chunks emitted by the filter should be consistent between two runs.
Legitimacy Test: Validates that the filter initializes with a proper configuration and that GetText() and GetValue() function as expected.
Illegitimacy Test: Validates that the filter is well behaved by exercising inappropriate configurations during initialization, and by calling GetText() on value-type chunks and GetValue() on text-type chunks.
Details of using IFilttst can be found here: http://msdn2.microsoft.com/en-us/library/ms692580.aspx
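The pass-rate targets for the three IFilttst tests can be checked mechanically once you have pass and total counts for each one. The sketch below is illustrative only: how the counts are extracted from the IFilttst logs is left to you, and the names `THRESHOLDS`, `pass_rate` and `failing_tests` are hypothetical.

```python
# Target pass rates in percent; a result must be strictly greater than its threshold.
THRESHOLDS = {"consistency": 95.0, "legitimacy": 99.0, "illegitimacy": 90.0}


def pass_rate(passed, total):
    """Return the pass rate as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * passed / total


def failing_tests(results):
    """results maps a test name to (passed, total); return the names that miss their target."""
    return sorted(
        name
        for name, (passed, total) in results.items()
        if pass_rate(passed, total) <= THRESHOLDS[name]
    )


# Hypothetical counts, for illustration only.
example = {"consistency": (96, 100), "legitimacy": (99, 100), "illegitimacy": (95, 100)}
print(failing_tests(example))  # legitimacy is exactly 99%, which is not above the target
```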
I. Security tests with Fuzzing :
NOTE: The fuzzer is an internal tool. A list of external fuzzers is provided here: http://www.infosecinstitute.com/blog/2005/12/fuzzers-ultimate-list.html
Again, use these at your own risk:)
J. Performance Scaling:
80% scaling achieved with 2 processors
80% scaling achieved with 3 processors
80% scaling achieved with 4 processors

Optimum usage of processors in a server environment is crucial for performance. The goal is to achieve 80% performance scaling with the addition of each new processor. Here's the test outline:
1. On a quad-processor machine, use ifilttst.exe with one thread to filter a large corpus of documents and note the time taken, T1.
2. Now use ifilttst.exe with two threads to filter the same corpus. The time taken should be about 0.556 * T1 (that is, T1 / 1.8).
3. With the addition of each subsequent thread, the target time TN for N threads is given by the formula: TN = T1 * 1/[(1.8)^(log2 N)].
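As a worked example of the scaling formula, the target time for N threads is the single-thread time divided by 1.8 raised to the power log2(N). The sketch below simply evaluates that formula. The function names and the 100-second baseline are illustrative assumptions, not measurements.

```python
import math


def target_time(single_thread_time, threads, speedup_per_doubling=1.8):
    """Target filtering time with `threads` threads: T1 / 1.8 ** log2(N)."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return single_thread_time / (speedup_per_doubling ** math.log2(threads))


def meets_target(single_thread_time, measured_time, threads):
    """True if the measured time is no worse than the target for this thread count."""
    return measured_time <= target_time(single_thread_time, threads)


# Illustrative baseline: the corpus takes 100 seconds with one thread.
for n in (1, 2, 3, 4):
    print(n, round(target_time(100.0, n), 1))
# With this baseline the two-thread target is about 55.6 s and the four-thread target about 30.9 s.
```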
K. AppVerifier Tests :
L. Globalization:
Arabic
Chinese
Czech
English
French
German
Hindi
Japanese
Polish
Spanish
Thai

If the document format allows marking the language / locale of its contents (e.g. MS Word), filtering of documents marked with the above language tags must be verified. This is important because the filter emits locale information based on the language of the document, which MSSearch uses to invoke the correct WordBreaker and Stemmer for the document.
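One way to verify the emitted locale is to compare the primary language ID of the LCID reported for each chunk (the locale field of STAT_CHUNK) against the language the test document was tagged with. The sketch below assumes you have already captured the LCIDs from your own test harness. The table holds the standard Windows primary language IDs for the languages listed above, and the helper names are hypothetical.

```python
# Windows primary language IDs (the low 10 bits of an LCID).
PRIMARY_LANG_IDS = {
    "Arabic": 0x01,
    "Chinese": 0x04,
    "Czech": 0x05,
    "English": 0x09,
    "French": 0x0C,
    "German": 0x07,
    "Hindi": 0x39,
    "Japanese": 0x11,
    "Polish": 0x15,
    "Spanish": 0x0A,
    "Thai": 0x1E,
}


def primary_language_id(lcid):
    """Extract the primary language ID from an LCID (same masking as PRIMARYLANGID)."""
    return lcid & 0x3FF


def locale_matches(language, emitted_lcid):
    """True if the LCID emitted by the filter belongs to the expected language."""
    return primary_language_id(emitted_lcid) == PRIMARY_LANG_IDS[language]


# Example: a document tagged as Japanese should produce chunks with LCID 0x0411.
print(locale_matches("Japanese", 0x0411))
```

Comparing only the primary language ID keeps the check independent of the regional variant (for example 0x0409 versus 0x0809 for English).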
M. Registry and File I/O:
No unnecessary file I/O.
No temp files created.
No independent registry I/O by filter.

1. Use Filemon.exe with the Filemon filter set to the name of your DLL and verify that no file system I/O is initiated by the IFilter other than on the documents it is indexing. Take special note if the filter is creating temp files.
2. Use Regmon.exe to verify that no registry read/write operations are performed.
www.sysinternals.com has both 32 and 64 bit versions of Filemon and Regmon.
N. Prefix/Prefast for Vista :
In the Office team, OACR checks for this if we build with the Windows Prefast requirements. In other environments, however, we need to use the Visual Studio build configuration manager to enable Prefast error checking.
More info (MS employees):
PREFIX: internal website
PREFAST: wrapped in OACR
WWW Resources:
http://msdn2.microsoft.com/en-us/library/ms933794.aspx
O. Calls to undocumented windows API :
No calls to undocumented Windows APIs.

Run APIScan to ensure that the filter does not make any calls to undocumented Windows APIs.
Note: This requirement is solely for MS and MS partners, to avoid situations like the Secret API fiasco.
P. SAL annotation :
No SAL warnings - Logs provided

SAL annotation is an excellent way to weed out potential security flaws in the code. More info at: http://msdn2.microsoft.com/en-us/library/ms235402(VS.80).aspx
Q. UI Popups :
No UI pop-ups in filter.

Use Filtdump to filter the document and ensure there are no UI pop-ups.
R. International Sufficiency:
We've seen a lot of issues in the past where Unicode / DBCS characters were not handled correctly by IFilters and Protocol Handlers. The problem is a bit more serious in Protocol Handlers, as the address of the content source might be encoded in a DBCS charset, in which case data retrieval fails.
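To illustrate the kind of failure described here, the sketch below encodes a content-source address containing Japanese characters in a DBCS code page (cp932, i.e. Shift-JIS) and then decodes the bytes with the wrong single-byte code page. The address and the helper names are hypothetical. The point is only that the bytes survive a round trip through the same code page but not through a mismatched one.

```python
def roundtrip(text, codec):
    """Encode and decode with the same code page; this should return the original text."""
    return text.encode(codec).decode(codec)


def misdecode(text, written_as, read_as):
    """Encode with one code page and decode with another, as a careless component might."""
    return text.encode(written_as).decode(read_as, errors="replace")


# Hypothetical content-source address; \u691c\u7d22 is the Japanese word for "search".
address = "file://server/share/\u691c\u7d22/report.doc"

print(roundtrip(address, "cp932") == address)             # same code page: text preserved
print(misdecode(address, "cp932", "latin-1") == address)  # mismatched code page: text corrupted
```

A useful test corpus therefore includes documents and content-source addresses with DBCS characters in both file names and contents.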
S. Security Code Review:
This is the final line of defense against introducing security bugs in your code. DO NOT skimp on this!!! :)