In my two previous posts I described a potential performance hit caused by XSLT-to-MSIL compilation and JIT-compilation when you load and run some XSLT stylesheet with the XslCompiledTransform engine for the first time. Since the .NET Framework 2.0 did not allow you to save compiled stylesheets, you had to pay the compilation price on each application run.
XSLT Compiler Utility
The good news is we are providing the XSLT Compiler command-line utility xsltc.exe (announced here) that can be used to compile multiple stylesheets into one assembly. The changes to the System.Xml assembly required for this utility to work are shipped with .NET Framework 2.0 Service Pack 1, and the utility itself is shipped with Windows SDK 6.0, which absorbs .NET Framework SDK. Both these components will be installed by Visual Studio 2008. Below is the usage screen of xsltc.exe:
C:\>xsltc.exe /?
Microsoft (R) XSLT Compiler version 3.5
[Microsoft (R) .NET Framework version 2.0.50727]
Copyright (C) Microsoft Corporation. All rights reserved.
xsltc [options] [/class:<name>] <source file> [[/class:<name>] <source file>...]
XSLT Compiler Options
- OUTPUT FILES -
/out:<file> Specify name of binary output file (default: name of the first file)
/platform:<string> Limit which platforms this code can run on: x86, Itanium, x64, or anycpu,
which is the default
- CODE GENERATION -
/class:<name> Specify name of the class for compiled stylesheet (short form: /c)
/debug[+|-] Emit debugging information
/settings:<list> Specify security settings in the format (dtd|document|script)[+|-],...
Dtd enables DTDs in stylesheets, document enables document() function,
script enables <msxsl:script> element
- MISCELLANEOUS -
@<file> Insert command-line settings from a text file
/help Display this usage message (short form: /?)
/nologo Suppress compiler copyright message
The most useful options are /class and /out. If you do not specify a class name for a stylesheet, it defaults to the name of the file containing that stylesheet, without the extension. The /debug option disables practically all optimizations (beware of performance degradation!) and creates a PDB file for the output assembly, which allows you to debug stylesheets with a debugger. For security reasons, DTDs in stylesheets, the document XSLT function, and msxsl:script elements are disabled by default; you have to explicitly enable them using the /settings option if required. Each stylesheet is compiled into an abstract class, which can be loaded later by a new XslCompiledTransform.Load overload:
public void Load(Type compiledStylesheet);
Compiling stylesheets into an assembly both simplifies the deployment (you don't have to deploy multiple stylesheet files) and eliminates XSLT-to-MSIL compilation time. Moreover, you may also eliminate JIT-compilation time by installing the resulting assembly in the native image cache.
How to Use It
Let us take, for example, a couple of the DocBook stylesheets, which had the worst JIT-compilation time in my previous experiment, and compile them:
C:\docbook-xsl-1.72.0>xsltc /settings:dtd+,document+ /class:DocBookToHtml html\docbook.xsl /class:DocBookToFO fo\docbook.xsl
If you run the ILDASM tool on the resulting docbook.dll assembly, you will see two classes, DocBookToFO and DocBookToHtml, generated for the stylesheets specified on the command line, along with two helper $ArrayType$... classes used internally to initialize XSLT engine runtime tables:

Figure: Assembly with compiled DocBook stylesheets
To use compiled stylesheets from your favorite .NET language, you need to add a reference to docbook.dll to your project, and pass the desired class to the XslCompiledTransform.Load method. After that you may call Transform methods on the loaded XslCompiledTransform object the usual way:
XslCompiledTransform stylesheet = new XslCompiledTransform();
stylesheet.Load(typeof(DocBookToHtml));
stylesheet.Transform("input.xml", "output.html");
To improve startup time you may choose to "pre-JIT" the assembly, installing a native image for it in the native image cache. However, before that you probably want to change the preferred base address of the assembly to avoid rebasing (I recommend reading Improving Application Startup Time and NGen Revs Up Your Performance with Powerful New Features articles). The xsltc.exe utility does not support the /baseaddress option, but you may use either rebase.exe or editbin.exe tool, both of which come with Visual Studio®:
C:\docbook-xsl-1.72.0>editbin.exe /rebase:base=0x60000000 docbook.dll /nologo
C:\docbook-xsl-1.72.0>ngen install docbook.dll /nologo
Installing assembly C:\docbook-xsl-1.72.0\docbook.dll
Compiling 1 assembly:
Compiling assembly C:\docbook-xsl-1.72.0\docbook.dll ...
docbook, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null
You may ask why we decided to compile stylesheets to abstract classes instead of implementing some common interface similar to IXmlTransform from the Mvp.Xml project. There were two main reasons. First, System.Xml is a "red" assembly, and changes in the red bits have been greatly limited in Orcas, so we tried to keep public API changes as small as possible. Second, implementing XSLT 2.0 in the next release of the .NET Framework will probably require us to change the interface anyway.
Script Assemblies
If the stylesheet contains msxsl:script elements, their content is compiled to one or more separate assemblies using the CodeDOM technology. Since the CodeDOM does not allow having code snippets in different languages in a single assembly, one script assembly per script language is created. Suppose, for example, that the stylesheet MyTransform.xsl contains C# and Visual Basic .NET script blocks. When you compile it, three assemblies will be created: MyTransform.dll, containing compiled XSLT code, MyTransform.Script.cs.dll, containing compiled C# script blocks, and MyTransform.Script.vb.dll, containing compiled Visual Basic .NET script blocks. You may merge script assemblies with the XSLT assembly using the ILMerge utility:
C:\MyTransform>ILMerge /out:MyTransform.dll MyTransform.dll MyTransform.Script.cs.dll MyTransform.Script.vb.dll
Limitations
Currently xsltc.exe does not allow you to embed XML files as resources. Why might you need that? Suppose that the stylesheet C:\MyTransform\MyTransform.xsl contains the relative document references document('') and document('config.xml'). If you compile it and deploy it to another machine, it will try to read the C:\MyTransform\MyTransform.xsl and C:\MyTransform\config.xml files respectively, which will result in an error unless you deploy MyTransform.xsl and config.xml in the same folder as on the build machine. You may think that relative document references should be resolved relative to the location of the compiled XSLT assembly, or that all documents referenced with relative URIs should be embedded in the assembly, but there are always cases when you need a different behavior. Fortunately, this problem may be resolved by modifying xsltc.exe to use a custom XmlResolver; I may write on this later.
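A different, runtime-side workaround can be sketched without modifying xsltc.exe: pass a custom XmlResolver as the documentResolver argument of Transform, so that relative document() references are looked up in a deployment folder instead of the build-machine folder. This is not the xsltc.exe change mentioned above. The class name DeploymentDirectoryResolver is hypothetical, and the sketch assumes the XSLT runtime routes document() lookups through XmlResolver.ResolveUri with the relative URI as written in the stylesheet. It deliberately leaves document('') to the default behavior, so it does not solve that case.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical resolver: relative references are resolved against a chosen
// deployment directory instead of the base URI baked in at compile time.
public sealed class DeploymentDirectoryResolver : XmlUrlResolver
{
    private readonly Uri deploymentBase;

    public DeploymentDirectoryResolver(string directory)
    {
        // A trailing separator makes Uri treat the path as a directory.
        string full = Path.GetFullPath(directory)
            .TrimEnd(Path.DirectorySeparatorChar) + Path.DirectorySeparatorChar;
        deploymentBase = new Uri(full);
    }

    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        // Empty references (such as document('')) keep the default behavior.
        if (string.IsNullOrEmpty(relativeUri))
            return base.ResolveUri(baseUri, relativeUri);

        // Absolute references pass through unchanged.
        Uri absolute;
        if (Uri.TryCreate(relativeUri, UriKind.Absolute, out absolute))
            return absolute;

        // Relative references ignore baseUri and use the deployment folder.
        return new Uri(deploymentBase, relativeUri);
    }
}
```

To try it, an instance would be passed as the last argument of the Transform(XmlReader, XsltArgumentList, XmlWriter, XmlResolver) overload, for example with AppDomain.CurrentDomain.BaseDirectory as the directory.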
Another limitation is that while XslCompiledTransform compiles a stylesheet to a set of unloadable DynamicMethods, an assembly generated by xsltc.exe cannot be unloaded until you shut down all AppDomains that used it (an infamous CLR limitation). This should not be a problem if you have a small set of fixed stylesheets, but it becomes a real issue in server scenarios where thousands of stylesheets are generated dynamically based on user settings and customizations. We are actively investigating possible solutions for server scenarios that do not require complicated AppDomain manipulations.
Under the Hood
Under the hood, xsltc.exe is a wrapper around the new XslCompiledTransform.CompileToType static method. You don't need to know about it unless you are developing your own version of the XSLT compiler. We expect that very few people will ever need to call this low-level method directly, as most will use xsltc.exe and optionally do some post-processing with other command-line utilities. However, for the sake of completeness, here is its brief description. (WARNING: The signature of the CompileToType method in beta releases of .NET Framework 2.0 SP1 may differ from the one given below.)
// Compiles an XSLT stylesheet to a System.Type
public static CompilerErrorCollection CompileToType(
XmlReader stylesheet,
XsltSettings settings,
XmlResolver stylesheetResolver,
bool debug,
TypeBuilder typeBuilder,
string scriptAssemblyPath);
Parameters...
stylesheet
The XmlReader positioned on the beginning of the stylesheet.
settings
The XsltSettings to apply to the stylesheet. If this is null, the XsltSettings.Default settings are applied.
stylesheetResolver
The XmlResolver used to resolve any stylesheet modules referenced in xsl:import and xsl:include elements. If this is null, external resources are not resolved.
debug
true to compile in debug mode; otherwise false. Setting this to true enables debugging the stylesheet with a debugger.
typeBuilder
The TypeBuilder to use for the stylesheet compilation.
scriptAssemblyPath
The base path for the assemblies generated for msxsl:script elements. If only one script assembly is generated, this parameter specifies the path for that assembly. In case of multiple script assemblies, a distinctive suffix will be appended to the file name to ensure uniqueness of assembly names.
Return Value
A CompilerErrorCollection object containing compiler errors and warnings that indicate the results of the compilation.
Note that the first three parameters are the same as in the XslCompiledTransform.Load method. The xsltc.exe utility creates an AssemblyBuilder and a ModuleBuilder, then for each stylesheet specified on the command line creates a TypeBuilder, and compiles the stylesheet into it using the CompileToType method. Compiler errors and warnings returned from the CompileToType method are output to the console. If all stylesheets have been compiled successfully, the dynamic assembly is saved to disk. If you are new to Reflection.Emit, you may find this dynamic assembly sample code useful.
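The sequence just described can be sketched roughly as follows. This is not the source of xsltc.exe, only a minimal illustration that targets the .NET Framework (CompileToType and AssemblyBuilder.Save are not available on newer .NET runtimes). The MiniXsltc name, the type attributes, and the script assembly file name are assumptions made for the example, and the IsCreated check is there because I am not certain whether CompileToType completes the type itself.

```csharp
using System;
using System.CodeDom.Compiler;
using System.IO;
using System.Reflection;
using System.Reflection.Emit;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper: compiles one stylesheet into <assemblyName>.dll.
public static class MiniXsltc
{
    // Returns true when the stylesheet compiled without errors and was saved.
    public static bool Compile(string stylesheetPath, string className,
                               string assemblyName, string outputDirectory)
    {
        string fileName = assemblyName + ".dll";

        // A save-only dynamic assembly with a single module.
        AssemblyBuilder assembly = AppDomain.CurrentDomain.DefineDynamicAssembly(
            new AssemblyName(assemblyName), AssemblyBuilderAccess.Save, outputDirectory);
        ModuleBuilder module = assembly.DefineDynamicModule(fileName, fileName, false);

        // One type per stylesheet; these attributes are an assumption.
        TypeBuilder type = module.DefineType(className,
            TypeAttributes.Public | TypeAttributes.Abstract |
            TypeAttributes.Sealed | TypeAttributes.BeforeFieldInit);

        CompilerErrorCollection errors;
        using (XmlReader reader = XmlReader.Create(stylesheetPath))
        {
            errors = XslCompiledTransform.CompileToType(
                reader, XsltSettings.Default, new XmlUrlResolver(),
                false, type,
                Path.Combine(outputDirectory, assemblyName + ".script.dll"));
        }

        // Report errors and warnings the way a command-line tool would.
        foreach (CompilerError error in errors)
            Console.Error.WriteLine(error);
        if (errors.HasErrors)
            return false;

        // Complete the type if the compiler has not already done so.
        if (!type.IsCreated())
            type.CreateType();

        assembly.Save(fileName);
        return true;
    }
}
```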
Conclusion
The xsltc.exe utility allows you to precompile XSLT stylesheets so that your application will not incur the performance penalty of XSLT-to-MSIL and JIT-compilation on the first stylesheet execution. It also makes deployment of complex XSLT solutions, consisting of dozens of files, less cumbersome and protects your source XSLT code. Multiple stylesheets may be compiled into a single assembly, and the resulting assembly may be merged with the main DLL or EXE file of your application using the ILMerge utility.
Update:
Transformation times for Saxon processors have been remeasured and updated based on the feedback received from Dimitre Novatchev and Michael Kay. I also slightly altered the text below to reflect the change in Saxon command-line arguments.
Interestingly enough, the first live.com hit for the "XslCompiledTransform Performance" query at the moment is this post by Jeff Prosise, where he says he was disappointed that XslCompiledTransform ran just 3 times faster than XslTransform on a "fairly simple style sheet". He is concerned that XslCompiledTransform is not fast enough compared to the good old MSXML 4.0. Well, as we will see very soon, XslCompiledTransform may easily outperform MSXML 4.0 by several times!
Here I compare the transformation speed of different widely used XSLT processors for several arbitrarily chosen stylesheets. I deliberately do not consider many other important aspects, such as working set, start-up time, compilation time, scalability issues, etc., focusing on pure transformation time only. I tried to make all processors compete on equal terms; however, I may have missed some important details, especially for Saxon, which I know very little about. So this post should in no way be considered a thorough comparison of XSLT processors; you are encouraged to run your own scenarios with different processors and pick the one that best fits your needs.
Let's first briefly describe today's contestants:
- MSXML 3.0. The native XSLT processor implemented in MSXML 3.0 is still used by default in the Internet Explorer 6.0 and 7.0. It compiled a stylesheet to a tree of "actions", each of which knew how to "execute" itself. So it worked as a pretty simple XSLT interpreter.
- MSXML 4.0. The XSLT processor in MSXML 4.0 was completely reworked. It implemented a number of optimization techniques and compiled a stylesheet to some sort of P-code, which resulted in significantly faster transformation speed. This processor is more conformant and reliable than its MSXML 3.0 predecessor. Later versions of MSXML (5.0 and 6.0) carried the same XSLT processor as MSXML 4.0, so there is not much sense in considering them separately.
- XslTransform. The first managed XSLT processor, XslTransform, was a port of MSXML 3.0 code. Unfortunately, in addition to bugs and performance issues ported from MSXML 3.0, some new ones were introduced during the porting process. XslTransform was good enough for many applications; however, it was clear that radical improvements were impossible without a radical reworking of the code, like the one that happened between MSXML 3.0 and 4.0.
- XslCompiledTransform. The .NET Framework 2.0 introduces a new managed XSLT processor, XslCompiledTransform, which is going to replace the obsolete XslTransform class. XslCompiledTransform operates as a true compiler, translating a stylesheet into a set of dynamic MSIL methods, which use the highly optimized XSLT runtime library. While compiled stylesheets run amazingly fast, the set-up costs incurred (XSLT-to-MSIL compilation time plus JIT-compilation time) are considerably higher than for other XSLT processors, which may hinder its adoption in some applications.
- Saxon 6.5.5. Saxon is an open-source Java implementation of XSLT 1.0, developed by Michael Kay, a great XSLT enthusiast and the editor of the XSLT 2.0 specification. Version 6.5.5 was the last XSLT 1.0 processor release. As you can conclude from Michael Kay's "XSLT and XPath Optimization" article, Saxon uses basically the same approach as the XSLT processor in MSXML 3.0 together with some optimization techniques.
- Saxon 8.7.3. The latest version of Saxon, Saxon 8.7.3, implements the recent candidate recommendations for XSLT 2.0, XQuery 1.0 and XPath 2.0, and provides better integration with the .NET platform. Though it is an XSLT 2.0 processor, it is interesting to know how fast it can execute XSLT 1.0 stylesheets.
To run tests with MSXML I used the Msxsl.exe command-line utility. I had to tweak its code a little, because the -t option for measuring load and transformation times failed to work on CPUs faster than 2 GHz. The utility was developed around 09/2000, and apparently some Microsoft developers did not realize how fast processors would become in 6 years! More precisely, this part of the Timer class constructor retrieves the frequency of the high-resolution performance counter and rejects any value above INT_MAX = 2,147,483,647:
if (!::QueryPerformanceFrequency((LARGE_INTEGER *)&_freq) || _freq > INT_MAX)
{
// Counter not available
_freq = 0;
}
Below are the command-line arguments I used with Msxsl and Saxon. The number after -u specifies the version of MSXML to use, and -o nul redirects output to the NUL device, so that file input/output operations affect our measurements as little as possible. The undocumented -9 option forces Saxon to repeat the transform 9 times in a row, so that we obtain the transformation time for a "warm" process. Unfortunately, the Msxsl utility does not provide a similar option, so for now MSXML 3.0/4.0 will be at a slight disadvantage. Both Saxon processors were run under Java™ 2 Runtime Environment version 1.4.2.
C:\XsltPerf>msxsl.exe -t -o nul -u 3.0 Kasparov-Karpov.xml chess.xsl
C:\XsltPerf>msxsl.exe -t -o nul -u 4.0 Kasparov-Karpov.xml chess.xsl
C:\XsltPerf>java -jar saxon6.5.5\saxon.jar -t -o nul -9 Kasparov-Karpov.xml chess.xsl
C:\XsltPerf>java -jar saxon8.7.3\saxon8.jar -t -o nul -9 Kasparov-Karpov.xml chess.xsl
Finally, for XslTransform and XslCompiledTransform I used the XsltPerf utility, presented in my previous post. The System.Data.SqlXml assembly was NGen'd, though I doubt it could considerably affect performance in the "warm" case. As a separate step, I verified that all processors produce the correct output.
For the first test, let's try the Queens stylesheet I used in the previous post. To save you from rereading it, I recall here that this XSLTMark benchmark stylesheet, developed by Oren Ben-Kiki, finds all the possible solutions to the problem of placing N queens on an N×N chess board without any queen attacking another. XSLTMark uses N = 6, and the issue I immediately encountered was that one run of this scenario executed too fast for the measurements to be reliable. So I tweaked its input file, which originally looked like <BoardSize>6</BoardSize>, to make the stylesheet solve the same problem 20 times:
<Root>
<BoardSize>6</BoardSize>
... 18 identical lines skipped here ...
<BoardSize>6</BoardSize>
</Root>
Below are results for my Intel® Xeon® 3GHz box. Since XslCompiledTransform performance is affected by JIT-compilation on first use, as I described in my previous post, I give execution times of the first Transform call for this processor in parentheses. For example, for this stylesheet the first Transform takes about 53 ms, and subsequent ones take about 34 ms.
| Processor | Queens |
|---|---|
| XslTransform | 1480 ms |
| MSXML 3.0 SP5 | 1380 ms |
| Saxon 8.7.3J | 850 ms |
| Saxon 6.5.5 | 550 ms |
| MSXML 4.0 SP2 | 148 ms |
| XslCompiledTransform | 34 ms (53 ms) |
As you can see, MSXML 4.0 and XslCompiledTransform are much faster than other processors on this test; moreover, the latter is about 4 times faster than the former. I would like to note that the Queens stylesheet is rather artificial: it is an implementation of the backtracking algorithm in a language mainly oriented toward XML transformations. While it cannot be considered a real-world scenario, XslCompiledTransform performs really well even in that area. And if, in the past, performance issues might have forced you to implement similar helper functions in a general-purpose programming language, like C# or JScript, and call them using embedded scripts or extension objects, now there is a greater chance you can implement those functions in XSLT itself and still have good performance.
For the following tests we take a couple of Sarvega XSLT Benchmark stylesheets, which represent real-world XSLT transforms. The Chess-FO stylesheet, developed by Anton Dovgyallo from the Russian Academy of Sciences, reads the sequence of moves in a chess game and produces a set of chess board diagrams, representing every intermediate position as a graphical image in the XSL-FO format:

Figure: Kasparov–Karpov, 1990 World Championship Game
Again, MSXML 4.0 and XslCompiledTransform are several times faster than other processors. And if the first transformation for XslCompiledTransform takes 2 times longer than for MSXML 4.0 due to JIT-compilation, subsequent ones are 4 times faster.
| Processor | Chess-FO |
|---|---|
| MSXML 3.0 SP5 | 470 ms |
| XslTransform | 380 ms |
| Saxon 8.7.3J | 300 ms |
| Saxon 6.5.5 | 290 ms |
| MSXML 4.0 SP2 | 52 ms |
| XslCompiledTransform | 13 ms (101 ms) |
The DocBook-XHTML stylesheet, developed by Norman Walsh, transforms documents written in the DocBook format to XHTML. The input document used in Sarvega XSLT Benchmark is rather small—under 100 KB—and produces dozens of messages during its transformation. I had to redirect those messages to a file to minimize influence of xsl:message instructions on transformation time.
DocBook-XHTML is a huge stylesheet with thousands of templates, global parameters and variables, and you can see how badly JIT-compilation affects the first stylesheet run in case of XslCompiledTransform: 1970 ms versus 60 ms for subsequent runs. It would be really nice to have the ability to pre-compile and "pre-JIT" stylesheets, so you would not pay this price again and again on each application run, but currently the .NET Framework 2.0 does not provide means for that.
| Processor | DocBook-XHTML |
|---|---|
| MSXML 3.0 SP5 | ... 2800 ms |
| XslTransform | 460 ms |
| Saxon 6.5.5 | 280 ms |
| Saxon 8.7.3J | 240 ms |
| MSXML 4.0 SP2 | 140 ms |
| XslCompiledTransform | 60 ms (1970 ms) |
One can draw a couple of conclusions from the results above:
- XSLT compilers steal a march on XSLT interpreters. While MSXML 4.0 is not a true compiler, the P-code it generates is close to the machine code, which allows it to surpass by far pure XSLT interpreters.
- XslCompiledTransform is a true XSLT compiler and may transform several times faster than MSXML 4.0. However, since the .NET Framework 2.0 currently does not allow you to save compiled stylesheets, you have to pay the compilation price on each application run.
Now it does not seem a coincidence that the latest release of the Java platform, J2SE 5.0, replaced the Xalan interpreting processor with the XSLTC compiling processor as the default XSLT engine, or that Michael Kay, the creator of Saxon, is experimenting in the same direction. However, developing a compiler from an interpreter is a highly nontrivial task. As you remember, Microsoft had to discard the old interpreter code base and start from scratch twice, and those efforts led to the swift and reliable MSXML 4.0 and XslCompiledTransform XSLT processors.
This post discusses:
- Why XslCompiledTransform may be slower than XslTransform
- How to reduce start-up time if you use one of the managed XSLT processors
- Why it is important to cache loaded XslCompiledTransform instances
The .NET Framework 2.0 provides a new System.Xml.Xsl.XslCompiledTransform XSLT processor class, which is intended to replace the obsoleted XslTransform class. One of the major differences between the two is that while the latter is an XSLT interpreter, the former is a real XSLT compiler, allowing significantly faster execution times. Does it mean XslCompiledTransform is always faster? Surprisingly, the answer is not that simple.
Let's write a simple test application that measures Load and Transform times for both XslTransform and XslCompiledTransform processors. Here is the most interesting part of the code, and the full source code is available in the attached file.
private void TestXslCompiledTransform() {
    XslCompiledTransform xslt = null;
    for (int i = 0; i < numberOfIterations; i++) {
        Stopwatch stopwatch = Stopwatch.StartNew();
        xslt = new XslCompiledTransform();
        xslt.Load(xslFile);
        stopwatch.Stop();
        Console.WriteLine("Load time: {0} ms", FormatTime(stopwatch));
    }
    Console.WriteLine("------------------------");
    XPathDocument doc = new XPathDocument(xmlFile);
    for (int i = 0; i < numberOfIterations; i++) {
        Stopwatch stopwatch = Stopwatch.StartNew();
        xslt.Transform(doc, (XsltArgumentList)null, XmlWriter.Create(TextWriter.Null, xslt.OutputSettings));
        stopwatch.Stop();
        Console.WriteLine("Transform time: {0} ms", FormatTime(stopwatch));
    }
}
Note that both Load and Transform are executed multiple times in a loop, and their times are measured separately. Also, we pre-load the input document and send the results of the transformation to TextWriter.Null, so that file input/output operations are not taken into account. (I accidentally forgot to pre-load the stylesheet in memory; however, that did not make a noticeable difference in the results obtained below, because I ran XsltPerf with the same stylesheet multiple times, and the stylesheet file was sitting in the disk cache after the first run.) If you are new to the XslCompiledTransform class and wondering what xslt.OutputSettings is doing in this snippet, you may find the answer in Erik Saltwell's post "What the heck is OutputSettings".
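The FormatTime helper called in the snippet lives in the attached source and is not shown in the post. A hypothetical stand-in, assuming it simply prints elapsed milliseconds with four significant digits (which matches the shape of readings such as 90.69 or 2.847), might look like this. The TimeFormatting and FormatMilliseconds names are made up for the example.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;

// Hypothetical stand-in for the FormatTime helper from the attachment.
public static class TimeFormatting
{
    // Formats a duration in milliseconds with about four significant digits.
    public static string FormatMilliseconds(double milliseconds)
    {
        if (milliseconds <= 0)
            return "0.000";

        // Number of digits before the decimal point (at least one).
        int integerDigits = Math.Max(1, (int)Math.Floor(Math.Log10(milliseconds)) + 1);
        int decimals = Math.Max(0, 4 - integerDigits);
        return milliseconds.ToString("F" + decimals, CultureInfo.InvariantCulture);
    }

    // Stopwatch.Elapsed already accounts for the counter frequency,
    // so no manual tick arithmetic is needed here.
    public static string FormatTime(Stopwatch stopwatch)
    {
        return FormatMilliseconds(stopwatch.Elapsed.TotalMilliseconds);
    }
}
```

For instance, FormatMilliseconds(148.0) yields "148.0" and FormatMilliseconds(2.847) yields "2.847".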
The application allows you to specify filenames of the input document and the XSLT stylesheet, the number of iterations, and which XSLT processor to use:
C:\XsltPerf>XsltPerf.exe /?
XSLT Load & Transform Performance Test Utility
for Microsoft (R) Windows (R) 2005 Framework version 2.0.50727
XsltPerf [/xt | /xct] [/i:<n>] <xml-file> <xsl-file>
Options:
/xt Use XslTransform
/xct Use XslCompiledTransform (default)
/i:<n> Iterate n times (default is 5)
For testing purposes, we take one of XSLTMark benchmark stylesheets, namely queens.xsl, which finds all the possible solutions to the problem of placing N queens on an N×N chess board without any queen attacking another. XSLTMark uses N = 6, and so will we. Let's run the XsltPerf utility several times before taking readings to ensure the measurements take place on a "warm" machine:
C:\XsltPerf>XsltPerf.exe /xt queens.xml queens.xsl
The results for my Intel® Xeon® 3GHz box show that the very first Load in a process is very slow for XslTransform and incredibly slow in the case of XslCompiledTransform. The very first Transform is also significantly slower than subsequent ones. If you sum up the time for the first Load and the time for the first Transform, you'll get 239 ms for XslTransform versus 928 ms for XslCompiledTransform. In this particular scenario the new XslCompiledTransform class is almost 4 times slower!
| Iteration | XslTransform Load (ms) | XslTransform Transform (ms) | XslCompiledTransform Load (ms) | XslCompiledTransform Transform (ms) |
|---|---|---|---|---|
| 1 | 90.69 | 148.0 | 882.2 | 45.31 |
| 2 | 2.847 | 76.57 | 4.582 | 2.027 |
| 3 | 1.841 | 73.82 | 3.060 | 2.027 |
| 4 | 2.681 | 74.49 | 3.119 | 1.962 |
| 5 | 1.891 | 74.22 | 3.072 | 1.977 |
So what the hell happens on the first call? In the .NET Framework 2.0, the implementations of XslTransform and XslCompiledTransform were moved to a helper System.Data.SqlXml assembly in order to reduce the size of System.Xml. And apparently that helper assembly is not NGen'd by default, so its methods are JIT-compiled on first use:
C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727>ngen display System.Data.SqlXml
Microsoft (R) CLR Native Image Generator - Version 2.0.50727.42
Copyright (C) Microsoft Corporation 1998-2002. All rights reserved.
Error: The specified assembly is not installed.
(If you don't know what NGen or JIT is, I highly recommend reading the "NGen Revs Up Your Performance with Powerful New Features" article.) JIT-compilation affects the first XslCompiledTransform.Load call much more significantly than XslTransform.Load, because the compiler uses substantially more complex code than the interpreter.
Considering that other key .NET Framework assemblies are NGen'd, the absence of System.Data.SqlXml from the native image cache may be a simple oversight on the part of the .NET Framework installer. Let's generate a native image for the System.Data.SqlXml assembly:
C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727>ngen install "System.Data.SqlXml, Version=2.0.0.0,
Culture=neutral, PublicKeyToken=b77a5c561934e089" /nologo
Installing assembly System.Data.SqlXml, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
Compiling 1 assembly:
Compiling assembly System.Data.SqlXml, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089 ...
System.Data.SqlXml, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
and run our test application again. I have copied the previous results to make it easier for you to compare:
| System.Data.SqlXml | Iteration | XslTransform Load (ms) | XslTransform Transform (ms) | XslCompiledTransform Load (ms) | XslCompiledTransform Transform (ms) |
|---|---|---|---|---|---|
| not NGen'd | 1 | 90.69 | 148.0 | 882.2 | 45.31 |
| not NGen'd | 2 | 2.847 | 76.57 | 4.582 | 2.027 |
| not NGen'd | 3 | 1.841 | 73.82 | 3.060 | 2.027 |
| not NGen'd | 4 | 2.681 | 74.49 | 3.119 | 1.962 |
| not NGen'd | 5 | 1.891 | 74.22 | 3.072 | 1.977 |
| NGen'd | 1 | 19.27 | 77.22 | 58.57 | 15.23 |
| NGen'd | 2 | 2.894 | 75.93 | 4.862 | 1.918 |
| NGen'd | 3 | 1.895 | 73.56 | 3.238 | 1.921 |
| NGen'd | 4 | 2.753 | 74.49 | 3.280 | 1.949 |
| NGen'd | 5 | 1.911 | 73.78 | 3.272 | 1.926 |
As you can see, the first Load call became much faster, though it still consumes some extra time to load the helper assembly into process memory and initialize all needed classes. Now XslCompiledTransform.Load is "only" 3 times slower than XslTransform.Load; however, this is expected — compiling, in general, is more expensive than the interpreter preparation work. It is the price you have to pay for faster execution: XslCompiledTransform.Transform, except the first call, is about 40 times faster!
If you look at new Transform times, you may note that the difference between the first and subsequent calls to XslTransform.Transform, thanks to NGen'ing, almost disappeared. But why is the first call to XslCompiledTransform.Transform still 8 times slower?! Here we need to recall that XslCompiledTransform compiles the stylesheet to MSIL methods. All those generated methods are subject to JIT-compiling on first use. While JIT-compilation is relatively expensive, its cost is usually amortized over several executions of compiled methods. For example, in our scenario the first XslCompiledTransform.Transform call is still faster than XslTransform.Transform one. However, in case of very simple stylesheets like <xsl:template match="/"><foo/></xsl:template> the first XslCompiledTransform.Transform call may perform several times worse.
I'd like to emphasize the principal difference between the "first Load" issue (relates to both XslTransform and XslCompiledTransform) and the "first Transform" issue (relates to XslCompiledTransform only). In the former case, the first Load per process is affected. In the latter one, the first Transform per loaded stylesheet is affected. If you load a different stylesheet or even reload the same stylesheet into the same XslCompiledTransform instance, a new bunch of MSIL methods will be generated and JIT-compiled on their first use.
(Footnote to "per process" above: or per AppDomain in the case of multi-AppDomain applications. For more information on sharing native images across AppDomains, you may read this and this.)
Let's draw some conclusions from this little experiment:
- If you are using one of the .NET Framework 2.0 XSLT processors (XslTransform or XslCompiledTransform), NGen'ing the System.Data.SqlXml assembly may improve the application start-up time. This does not apply to the .NET Framework 1.1, which has the XslTransform implementation NGen'd by default.
- XslCompiledTransform.Load is slower than XslTransform.Load, though it's unlikely to become a concern.
- XslCompiledTransform.Transform performance may degrade by several times due to JIT-compilation. JIT-compilation happens on first use of a stylesheet template, and in many cases it is the very first XslCompiledTransform.Transform call for the given stylesheet loaded into an XslCompiledTransform instance that is affected the most.
- While XslCompiledTransform is the best choice for the "one Load, many Transforms" scenario, it may be slow for the "one Load, one Transform" scenario, especially for very simple XML/XSLT files and when stylesheet templates are executed only a few times. In those cases XslTransform may be faster.
- It is important, especially for server applications, to cache an XslCompiledTransform instance if the same stylesheet is likely to be executed again.
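The last point can be illustrated with a minimal cache sketch. The StylesheetCache name and its design (a dictionary keyed by full path, guarded by a lock) are assumptions for the example, not something from XsltPerf. It relies on the documented guarantee that a loaded XslCompiledTransform may be used for Transform calls from multiple threads, and it makes no attempt to notice that a stylesheet file has changed on disk.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Xsl;

// Hypothetical process-wide cache of loaded stylesheets, keyed by full path.
public static class StylesheetCache
{
    private static readonly object Gate = new object();
    private static readonly Dictionary<string, XslCompiledTransform> Cache =
        new Dictionary<string, XslCompiledTransform>(StringComparer.Ordinal);

    // Returns the cached instance, loading and compiling on first request.
    public static XslCompiledTransform Get(string stylesheetPath)
    {
        string key = Path.GetFullPath(stylesheetPath);
        lock (Gate)
        {
            XslCompiledTransform xslt;
            if (!Cache.TryGetValue(key, out xslt))
            {
                // Pay the Load (and later the first-Transform JIT) cost once.
                xslt = new XslCompiledTransform();
                xslt.Load(key);
                Cache.Add(key, xslt);
            }
            return xslt;
        }
    }
}
```

Holding the lock during Load keeps the sketch simple at the cost of serializing first-time loads; a server with many distinct stylesheets would probably want finer-grained locking and an eviction policy.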
Attachment(s): XsltPerf.zip