• CarlosAg Blog

    IIS 7.0 Admin Pack: Request Filtering


    My last post talked about the Technical Preview release of the IIS 7.0 Admin Pack and how it includes 7 new features that will help you manage your IIS 7.0 server.

    Today I was going to start writing about each feature in more detail, and Bill Staples just posted something (How to (un)block directories with IIS7 web.config) that almost seems planned as an introduction to one of the features in the Admin Pack, namely the Request Filtering UI.

    IIS 7.0 includes a feature called Request Filtering that provides additional capabilities to secure your web server. For example, it lets you reject requests that are double escaped, filter requests that use certain HTTP verbs, or even block requests to specific "folders". I will not go into the details of this functionality; if you want to learn more about it you can see the Request Filtering articles.

    In his blog, Bill mentions how you can easily configure Request Filtering using any text editor, such as Notepad, by editing web.config manually. That was required because we did not ship UI for it in IIS Manager due to time constraints and other things. But now, as part of the Admin Pack, we are releasing UI for managing the Request Filtering settings.

    Following what Bill just showed in his blog, this is how you would do the same thing using the new UI instead.

    1) Install IIS Admin Pack (Technical Preview)

    2) Launch IIS Manager

    3) Drill down using the Tree View to the site or application you want to change the settings for.

    4) Enter the new feature called Request Filtering inside the IIS category

    5) Select the Hidden Segments and choose "Add Hidden Segment" from the Task List on the right

    6) Add the item

    As you would expect, the outcome is exactly what Bill explained in his blog: just an entry within your web.config, something like:

    <add segment="log" />

    So as you can see, the Request Filtering UI will help you discover some of the nice security settings that IIS 7.0 has. The following images show some of the additional settings you can configure, such as verbs, headers, URL sequences, URL length, query string size, etc.
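    For reference, the UI is simply writing the standard requestFiltering section into web.config. A hand-edited equivalent covering a few of the settings mentioned above might look like this (the specific verbs and limits here are only illustrative values, not recommendations):

    ```xml
    <system.webServer>
      <security>
        <requestFiltering>
          <!-- Block requests to any URL containing the "log" segment -->
          <hiddenSegments>
            <add segment="log" />
          </hiddenSegments>
          <!-- Reject a specific HTTP verb -->
          <verbs>
            <add verb="TRACE" allowed="false" />
          </verbs>
          <!-- Cap URL and query string lengths -->
          <requestLimits maxUrl="260" maxQueryString="2048" />
        </requestFiltering>
      </security>
    </system.webServer>
    ```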


    Redirects, 301, 302 and IIS SEO Toolkit


    In the URL Rewrite forum somebody posted the question "are redirects bad for search engine optimization?". The answer is: not necessarily. Redirects are an important tool for Web sites, and used in the right context they are actually a required tool. But first, a bit of background.

    What is a Redirect?

    A redirect, in simple terms, is a way for the server to indicate to a client (typically a browser) that a resource has moved, by means of an HTTP status code and an HTTP Location header. There are different types of redirects, but the most common ones are:

    • 301 - Moved Permanently. This type of redirect signals that the resource has permanently moved, and that any further attempts to access it should be directed to the location specified in the header.
    • 302 - Found (commonly just called a Redirect). This type of redirect signals that the resource is temporarily located in a different location, but any further attempts to access the resource should still go to the original location.

    Below is an example of a redirect response sent from the server:

    HTTP/1.1 302 Found
    Connection: Keep-Alive
    Content-Length: 161
    Content-Type: text/html; charset=utf-8
    Date: Wed, 10 Jun 2009 17:04:09 GMT
    Location: /sqlserver/2008/en/us/default.aspx
    Server: Microsoft-IIS/7.0
    X-Powered-By: ASP.NET
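    If you want to see which kind of redirect one of your own URLs returns, you can issue the request yourself and tell the client not to follow redirects automatically. A minimal sketch (the URL below is just a placeholder for a page on your site):

    ```csharp
    using System;
    using System.Net;

    class CheckRedirect {
        static void Main() {
            // Hypothetical URL; substitute a page on your own site
            var request = (HttpWebRequest)WebRequest.Create("http://www.example.com/some-page");
            request.AllowAutoRedirect = false;  // keep the 301/302 instead of following it
            using (var response = (HttpWebResponse)request.GetResponse()) {
                int status = (int)response.StatusCode;           // e.g. 301 or 302
                string location = response.Headers["Location"];  // redirect target, if any
                Console.WriteLine("{0} -> {1}", status, location);
            }
        }
    }
    ```

    With AllowAutoRedirect left at its default of true, the client silently follows the redirect and you only ever see the final response.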


    So what do redirects mean for SEO?

    One of the most important factors in SEO is the concept of organic linking: in simple words, your page gets extra points for every link that external Web sites have pointing to it. Now imagine the search engine bot is crawling an external Web site and finds a link pointing to your page (say some-page), and when it tries to visit that page it runs into a redirect to another location (say somepage). Now the search engine has to decide whether it should add the original "some-page" to its index, whether it should "add the extra points" to the new location or to the original location, or whether it should just ignore it entirely. Well, the answer is not that simple, but a simplification of it could be:

    • If you return a 301 (Permanent Redirect) you are telling the search engine that the resource moved to a new location permanently, so that all further traffic should be directed to that location. This clearly means that the search engine should ignore the original location (some-page) and index the new location (somepage), that it should add all the "extra points" to the new location, and that any further references to the original location should now be treated as if they pointed to the new one.
    • If you return a 302 (Temporary Redirect) the answer can depend on the search engine, but it is likely to index the original location and ignore the new location altogether (unless it is directly linked in other places), since the redirect is only temporary and the server could at any given point stop redirecting and start serving the content from the original location. This of course makes it very ambiguous how to deal with the "extra points"; they will likely be added to the original location and not the new destination.


    Enter IIS SEO Toolkit

    The IIS Search Engine Optimization Toolkit has a couple of rules that look for different patterns related to redirects. The Beta version includes the following:

    1. The redirection did not include a location header. Believe it or not, there are a couple of applications out there that do not generate a Location header, which completely breaks the model of redirection. So if your application is one of them, the tool will let you know.
    2. The redirection response results in another redirection. In this case it detected that your page (A) links to another page (B), which caused a redirection to another page (C), which resulted in yet another redirection to page (D). It is trying to let you know that the number of redirects could significantly impact the SEO "bonus points", since the organic linking could be broken by all this jumping around, and that you should consider just linking from (A) to (D), or whatever the actual final destination is supposed to be.
    3. The page contains unnecessary redirects. In this case it detected that your page (A) links to another page (B) in your Web site that resulted in a redirect to another page (C) within your Web site. Note that this is an informational rule, since there are valid scenarios where you want this behavior (such as tracking page impressions, or login pages), but in many cases you do not need it. Since we detect that you own all three pages, we are suggesting you look at whether it wouldn't be better to just change the markup in (A) to point directly to (C) and avoid the (B) redirection entirely.
    4. The page uses a refresh definition instead of using redirection. Finally, related to redirection, IIS SEO will flag when it detects that the refresh meta-tag is being used as a means of causing a redirection. This practice is not recommended, since the tag does not include any semantics for search engines on how to process the content, and in many cases it is actually considered a tactic to confuse search engines, but I won't go there.

    So what does it look like? In the image below I ran Site Analysis against a Web site and it found a few of these violations (2 and 3).


    Notice that when you double-click a violation it will show you the details and give you direct access to the related URLs, so that you can look at the content and all the relevant information about them to make a decision. From that menu you can also look at which other pages link to the pages involved, as well as launch them in the browser if needed.


    Similarly, for all the other violations it tries to explain the reason each is being flagged, as well as recommended actions to follow for each of them.

    The IIS Search Engine Optimization Toolkit can also help you find all the different types of redirects, and the locations where they are being used, in a very easy way: just select Content->Status Code Summary in the Dashboard view and you will see all the different HTTP status codes received from your Web site. Notice in the image below how you can see the number of redirects (in this case 18 temporary redirects and 2 permanent redirects). You can also see how much content they accounted for, in this case about 2.5 KB. (Note that I've seen Web sites generate a large amount of useless content in redirect traffic; talk about wasted bandwidth.) You can double-click any of those rows to see the details of the URLs that returned that status code, and from there you can see who links to them, etc.


    So what should I do?

    1. Know your Web site. Run Site Analysis against your Web site and see all the different redirects that are happening.
    2. Try to minimize redirections. If possible, with the knowledge gained in step 1, look for places where you can update your content to reduce the number of redirects.
    3. Use the right redirect. Understand the intent of the redirection you are trying to do and make sure you are using the right semantics (is it permanent or temporary?). Whenever possible, prefer permanent redirects (301).
    4. Use URL Rewrite to easily configure them. URL Rewrite allows you to configure a set of rules, using both regular expressions and wildcards, that live along with your application (no administrative privileges required) and let you set the right redirection status code. A must for SEO. More on this in a future blog.
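    For instance, when a page has been renamed, a URL Rewrite rule along these lines (the rule and page names here are made up for illustration) answers requests for the old URL with a 301 pointing at the new one:

    ```xml
    <rewrite>
      <rules>
        <rule name="Old page moved" stopProcessing="true">
          <match url="^old-page\.aspx$" />
          <!-- redirectType="Permanent" emits a 301; "Found" would emit a 302 -->
          <action type="Redirect" url="new-page.aspx" redirectType="Permanent" />
        </rule>
      </rules>
    </rewrite>
    ```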


    So going back to the original question: "are redirects bad for Search Engine Optimization?". Not necessarily; they are an important tool used by Web applications for many reasons, such as:

    • Canonicalization. Ensuring that users access your site consistently either with or without the www. prefix; use permanent redirects.
    • Page impressions and analytics. Using temporary redirects ensures that the original link is preserved and counters work as expected.
    • Content reorganization. Whether you are changing your host name due to a brand change or just renaming a page, make sure to use permanent redirects to keep your page rankings.
    • etc

    Just make sure you don't abuse them (redirects to redirects, unnecessary redirects, infinite loops), and that you use the right semantics.


    Calling Web Services from Silverlight using IIS 7.0 and ARR


    During this PDC I attended Ian's presentation about WPF and Silverlight, where he demonstrated the high degree of compatibility that can be achieved between a WPF desktop application and a Silverlight application. One difference he demonstrated involves consuming Web Services: since Silverlight applications execute in a sandboxed environment, they are not allowed to call arbitrary Web Services or issue HTTP requests to any server other than the originating server, or a server that exposes a cross-domain manifest stating that it allows clients from that domain.

    Then he showed how you can work around this architectural difference by writing your own Web Service or HTTP end-point that gets the request from the client and, using code on the server, calls the real Web Service. This way the client sees only the originating server, which allows the call to succeed, while the server can freely call the real Web Service. Funny enough, while searching for a quote service I ran into an article by Dino Esposito in MSDN Magazine where he explains the same issue and also presents a "compatibility layer", which again is just code (more than 40 lines of it) acting as a proxy to call a Web Service (except he uses the JSON serializer to return the values).

    The obvious disadvantage is that you have to write code that only forwards the request and returns the response, acting essentially as a proxy. Of course this can be very simple, but if the Web Service you are calling has any degree of complexity, with custom types being sent around, or if you actually need to consume several of its methods, it quickly becomes a big maintenance nightmare: keeping them in sync when they change, doing error handling properly, and dealing with differences when reporting network issues, SOAP exceptions, HTTP exceptions, etc.
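    To make that maintenance cost concrete, a hand-rolled forwarding end-point ends up looking something like the sketch below (an ASP.NET IHttpHandler; the remote URL is a placeholder, and a real version still needs to grow for every method, type, and failure mode you add):

    ```csharp
    using System.IO;
    using System.Net;
    using System.Web;

    // Sketch of the manual alternative to ARR: one more of these is needed
    // for every service end-point you want to reach from Silverlight.
    public class StockQuoteProxy : IHttpHandler {
        public bool IsReusable { get { return true; } }

        public void ProcessRequest(HttpContext context) {
            // Placeholder for the real remote service URL
            var request = (HttpWebRequest)WebRequest.Create("http://remote-server/stockquote.asmx");
            request.Method = context.Request.HttpMethod;
            request.ContentType = context.Request.ContentType;

            if (context.Request.HttpMethod == "POST") {
                using (Stream body = request.GetRequestStream()) {
                    Copy(context.Request.InputStream, body);  // forward the SOAP body
                }
            }

            using (var response = (HttpWebResponse)request.GetResponse())
            using (Stream data = response.GetResponseStream()) {
                context.Response.ContentType = response.ContentType;
                Copy(data, context.Response.OutputStream);    // relay the answer back
            }
            // ...plus error handling for network failures, SOAP faults, time-outs, etc.
        }

        private static void Copy(Stream from, Stream to) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = from.Read(buffer, 0, buffer.Length)) > 0) {
                to.Write(buffer, 0, read);
            }
        }
    }
    ```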

    So after looking at this, I immediately thought about ARR (Application Request Routing), a new extension for IIS 7.0 that you can download for free from IIS.NET for Windows Server 2008, and that, among many other things, is capable of doing this kind of routing without writing a single line of code.

    This blog tries to show how easy this is to implement using ARR. Here are the steps to try it (below you can find the software required); note that if you are only interested in what is really new, just go to the 'Enter ARR' section below to see the configuration that fixes the Web Service call.

    1. Create a new Silverlight Project (linked to an IIS Web Site)
      1. Launch Visual Web Developer from the Start Menu
      2. File->Open Web Site->Local IIS->Default Web Site. Click Open
      3. File->Add->New Project->Visual C#->Silverlight->Silverlight Application
      4. Name: SampleClient, Location: c:\Demo. Click OK
      5. On the "Add Silverlight Application" dialog choose the "Link this Silverlight control into an existing Web site", and choose the Web site in the combo box.
      6. This will add a SampleClientTestPage.html to your Web site which we will run to test the application.
    2. Find a Web Service to consume
      1. In my case I searched the Web for a Stock Quote Service and found one.
    3. Back at our Silverlight project, add a Service Reference to the WSDL
      1. Select the SampleClient project in the Solution Explorer window
      2. Project->Add Service Reference and type in the Address and click Go
      3. Specify a friendly Namespace, in this case StockQuoteService
      4. Click OK
    4. Add a simple UI to call the Service
      1. In the Page.xaml editor type the following code inside the <UserControl></UserControl> tags:
      2.     <Grid x:Name="LayoutRoot" Background="Azure">
                 <Grid.RowDefinitions>
                     <RowDefinition Height="30" />
                     <RowDefinition Height="*" />
                 </Grid.RowDefinitions>
                 <Grid.ColumnDefinitions>
                     <ColumnDefinition Width="50" />
                     <ColumnDefinition Width="*" />
                     <ColumnDefinition Width="50" />
                 </Grid.ColumnDefinitions>
                 <TextBlock Grid.Column="0" Grid.Row="0" Text="Symbol:" />
                 <TextBox Grid.Column="1" Grid.Row="0" x:Name="_symbolTextBox" />
                 <Button Grid.Column="2" Grid.Row="0" Content="Go!" Click="Button_Click" />
                 <ListBox Grid.Column="0" Grid.Row="1" x:Name="_resultsListBox"
                          Grid.ColumnSpan="3" ItemsSource="{Binding}">
                     <ListBox.ItemTemplate>
                         <DataTemplate>
                             <StackPanel Orientation="Horizontal">
                                 <TextBlock Text="{Binding Path=Name}" FontWeight="Bold" Foreground="DarkBlue" />
                                 <TextBlock Text=" = " />
                                 <TextBlock Text="{Binding Path=Value}" />
                             </StackPanel>
                         </DataTemplate>
                     </ListBox.ItemTemplate>
                 </ListBox>
             </Grid>
      3. Right click the Button_Click text above and select the "Navigate to Event Handler" context menu.
      4. Enter the following code to call the Web Service
      5.     private void Button_Click(object sender, RoutedEventArgs e) {
                 var service = new StockQuoteService.StockQuoteSoapClient();
                 service.GetQuoteCompleted += service_GetQuoteCompleted;
                 // Kick off the asynchronous call; the generated Silverlight proxy only exposes async methods
                 service.GetQuoteAsync(_symbolTextBox.Text);
             }
      6. Now, since we are going to use XLINQ to parse the result of the Web Service which is an XML then we need to add the reference to System.Xml.Linq by using the Project->Add Reference->System.Xml.Linq.
      7. Finally, add the following function to handle the result of the Web Service
      8.     void service_GetQuoteCompleted(object sender, StockQuoteService.GetQuoteCompletedEventArgs e) {
                 var el = System.Xml.Linq.XElement.Parse(e.Result);
                 _resultsListBox.DataContext = el.Element("Stock").Elements();
             }
    5. Compile the application. Build->Build Solution.
    6. At this point we are ready to test our application, to run it just navigate to http://localhost/SampleClientTestPage.html or simply select the SampleClientTestPage.html in the Solution Explorer and click View In Browser.
    7. Enter a Stock Symbol (say MSFT) and press Go!, Verify that it breaks. You will see a small "Error in page" with a Warning icon in the status bar. If you click that and select show details you will get a dialog with the following message:
    8. Message: Unhandled Error in Silverlight 2 Application An exception occurred during the operation, making the result invalid. 

    Enter Application Request Routing and IIS 7.0

    1. OK, so now we are running into the cross-domain issue, and unfortunately we don't have a cross-domain manifest on the remote server. Here is where ARR can help us call the service without writing more code.
    2. Modify the Web Service configuration to call a local Web Service instead
      1. Back in Visual Web Developer, open the file ServiceReferences.ClientConfig
      2. Modify the address attribute to be address="http://localhost/stockquote.asmx" instead; it should look like:
      3.     <client>
                 <endpoint address="http://localhost/stockquote.asmx"
                     binding="basicHttpBinding" bindingConfiguration="StockQuoteSoap"
                     contract="StockQuoteService.StockQuoteSoap" name="StockQuoteSoap" />
             </client>
    3. This will cause the client to call the Web Service in the same originating server, now we can configure ARR/URL Rewrite rule to route the Web Service requests to the original end-point
      1. Add a new Web.config to the http://localhost project (Add new item->Web.config)
      2. Add the following content:
      3. <?xml version="1.0" encoding="UTF-8"?>
         <configuration>
           <system.webServer>
             <rewrite>
               <rules>
                 <rule name="Stock Quote Forward" stopProcessing="true">
                   <match url="^stockquote.asmx$" />
                   <action type="Rewrite" url="" />
                 </rule>
               </rules>
             </rewrite>
           </system.webServer>
         </configuration>
    4. This rule basically uses a regular expression to match requests for StockQuote.asmx and forwards them to the real Web Service.
    5. Compile everything by running Build->Rebuild Solution
    6. Back in your browser, refresh the page to pick up the new build, enter MSFT in the symbol box and press Go!
    7. And voila! Everything works.


    One of the features offered by ARR is proxy functionality that forwards requests to another server. This functionality is useful for clients that cannot make calls directly to the real data source, which includes Silverlight, Flash and AJAX applications. As shown in this blog, with just a few lines of XML configuration you can enable clients to call services in other domains without having to write hundreds of lines of code for each method. It also means that I get the original data, and that if the WSDL were to change I would not need to update any wrappers. Additionally, if you use REST-based services, you could rely on output caching on your server and increase the performance of your applications significantly (again, with no code changes).

    Software used

    Here is the software I installed to build this sample (amazing that all of it is completely free):

    1. Install Visual Web Developer 2008 Express
    2. Install Silverlight Tools for Visual Studio 2008 SP 1
    3. Install Application Request Routing for IIS 7.

    Host your own Web Server in your application using IIS 7.0 Hostable Web Core


    IIS 7.0 includes a very cool, not-so-well-known feature called Hostable WebCore (HWC). This feature basically allows you to host the entire IIS functionality within your own process. This gives you the power to customize entirely the functionality that you want "your Web Server" to expose, as well as to control its lifetime without impacting any other application running on the machine. It also provides a very nice model for automating tests that need to run inside IIS in a more controlled environment.

    This feature is implemented in a DLL called hwebcore.dll, that exports two simple methods:

    1. WebCoreActivate. This method allows you to start the server. It receives three arguments, of which the most important is applicationHostConfigPath, which points to your very own copy of ApplicationHost.config where you can customize the list of modules, the list of handlers, and any other settings that you want your "in-process IIS" to use. Just as with ApplicationHost.config, you can also specify the root web.config that you want your server to use, giving you the ability to run a completely isolated copy of IIS.
    2. WebCoreShutdown. This method stops the server from listening.
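    From managed code, the two exports can be reached with P/Invoke declarations along these lines (a sketch; the parameter names are descriptive, and the third argument of WebCoreActivate is an instance name used to identify this hosted server):

    ```csharp
    using System;
    using System.Runtime.InteropServices;

    internal static class HostableWebCore {
        // Starts the server using the given ApplicationHost.config and root web.config
        [DllImport("hwebcore.dll")]
        internal static extern int WebCoreActivate(
            [MarshalAs(UnmanagedType.LPWStr)] string appHostConfigPath,
            [MarshalAs(UnmanagedType.LPWStr)] string rootWebConfigPath,
            [MarshalAs(UnmanagedType.LPWStr)] string instanceName);

        // Stops the server; pass 1 to shut down immediately
        [DllImport("hwebcore.dll")]
        internal static extern int WebCoreShutdown(uint immediate);
    }
    ```

    Both return an HRESULT, so a non-zero value indicates the activation or shutdown failed.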

    The real trick for this feature is to know exactly what you want to support and "craft" the IIS Server configuration needed for different workloads and scenarios, for example:

    1. Static Files Web Server - Supporting only static file downloads, good for HTML scenarios and other simple sites.
    2. Dynamic Web Sites
      1. ASPX Pages
      2. WCF
      3. Custom set of Http Modules and Handlers
    3. All of the above

    An interesting thing to mention is that the file passed in the applicationHostConfigPath parameter is live, in the sense that if you change the configuration settings, your "in-process IIS" will pick up the changes and apply them as you would expect. In fact, even web.config files in the site's content directories are live, and you'll get the same behavior.


    To show how easily this can be done, I wrote a small, simple class to run it from managed code. To consume it, you just have to do something like:

    internal class Program {
        private static void Main(string[] args) {
            int port = 54321;
            int siteId = 1;

            // Start/Stop as exposed by the sample WebServer class in the download below
            WebServer server = new WebServer(@"d:\Site", port, siteId);
            server.Start();
            Console.WriteLine("Server Started!... Press Enter to Shutdown");
            Console.ReadLine();

            Console.WriteLine("Shutting down");
            server.Stop();
        }
    }

    This will start your very own "copy" of IIS running in your own process, which means you can control which features are available, as well as the site and applications inside it, without messing with the local state of the machine.

    A very interesting thing is that it will even run without administrator privileges, meaning any user on the machine can start this program and have a "web server" of their own that they can recycle, start and stop at will. (Note that running without administrative privileges requires Vista SP1 or Windows Server 2008, and it only works if the binding is local, meaning no requests from outside the machine.)

    You can download the entire sample which includes two configurations: 1) one that runs only an anonymous static file web server that can only download HTML and other static files, and 2) one that is able to run ASP.NET pages as well.

     Download the entire sample source code (9 kb)

    You might be asking why would I even care to have my own IIS in my executable and not just use the real one? Well there are several scenarios for this:

    1. Probably one of the most useful: as mentioned above, this allows non-administrators to develop applications that they can debug, whose configuration they can change, and in general to do pretty much anything without interfering with the machine state.
    2. Another scenario might include something like a "Demo/Trial CD" where you can package your application in a CD/DVD that users then can insert in their machine and suddenly get a running/live demo of your Web Application without requiring them to install anything or define new applications in their real Web Server.
    3. Test-driven development. Testing in the real Web Server tends to interfere with the machine state, which is by definition something you don't want in your test environment; ideally you want your tests to run in an isolated environment that is fully under your control and requires no manual configuration. This makes the feature an ideal candidate for such a scenario, where you own the configuration and can "hard-code" it as part of your automated tests. No more code for "preparing the server and site"; everything starts pre-configured.
    4. Build your own service. You can build your own service and use Hostable WebCore as a simple yet powerful alternative to things like HttpListener, where you will be able to execute Managed and Native code Http Modules and Handlers easily without you having to do any custom hosting for ASP.NET infrastructure.
    5. Have your own development Web server where you can have advanced interaction between both the client and the server, and trace and provide live debugging information.
    6. many, many more...
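    As a sketch of the test-driven scenario, a unit test could spin the hosted server up and down around each assertion. This assumes the WebServer class (and its Start/Stop methods) from the sample above, NUnit-style attributes, and a hypothetical default.htm in the content folder:

    ```csharp
    using System.Net;
    using NUnit.Framework;

    [TestFixture]
    public class StaticSiteTests {
        [Test]
        public void HomePageIsServed() {
            // Content path, port and site id mirror the sample above
            WebServer server = new WebServer(@"d:\Site", 54321, 1);
            server.Start();
            try {
                var request = (HttpWebRequest)WebRequest.Create("http://localhost:54321/default.htm");
                using (var response = (HttpWebResponse)request.GetResponse()) {
                    Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
                }
            } finally {
                server.Stop();  // server state never leaves the test process
            }
        }
    }
    ```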

    In future posts I intend to share more samples that showcase some of this cool stuff.


    The IIS 7.0 Hostable WebCore feature allows you to host a "copy" of IIS in your own process. This is not your average "HttpListener" kind of solution, where you would need to implement all the functionality yourself: file downloads, Basic/Windows/Anonymous authentication, caching, CGI, ASP, ASP.NET, Web Services, or anything else you need. Hostable WebCore allows you to configure and extend the functionality of your own Web server in almost any way without having to write any of that code.


    Using LINQ with Microsoft.Web.Administration


    With the upcoming release of .NET 3.5 and LINQ, I thought it would be interesting to show some of the cool things you can do with IIS 7 and LINQ. Everything I will show can be done with C# 2.0 code, but it would take several lines of code to write; thanks to LINQ, you can do it in a line or two.

    Let's start with a very basic example that does not use LINQ but just M.W.A (Microsoft.Web.Administration) and then start adding interesting things to it.

    The following code just iterates the sites in IIS and displays their name.

    using System;
    using System.Linq;
    using Microsoft.Web.Administration;

    class Program {
        static void Main(string[] args) {
            using (ServerManager serverManager = new ServerManager()) {

                var sites = serverManager.Sites;
                foreach (Site site in sites) {
                    Console.WriteLine(site.Name);
                }
            }
        }
    }

    Now, let's say I wanted to have them sorted by name. This is where LINQ starts being useful:
        using (ServerManager serverManager = new ServerManager()) {

            var sites = (from site in serverManager.Sites
                         orderby site.Name
                         select site);

            foreach (Site site in sites) {
                Console.WriteLine(site.Name);
            }
        }
    Say you want to start all the sites that are stopped:
        using (ServerManager serverManager = new ServerManager()) {

            var sites = (from site in serverManager.Sites
                         where site.State == ObjectState.Stopped
                         orderby site.Name
                         select site);

            foreach (Site site in sites) {
                site.Start();
            }
        }
    OK, now let's imagine I want to find all the applications that are configured to run in the Default ApplicationPool and move them to run in my NewAppPool. This would take me a lot more lines of code but now I can just do:
        using (ServerManager serverManager = new ServerManager()) {

            var apps = (from site in serverManager.Sites
                        from app in site.Applications
                        where app.ApplicationPoolName.Equals("DefaultAppPool", StringComparison.OrdinalIgnoreCase)
                        select app);

            foreach (Application app in apps) {
                app.ApplicationPoolName = "NewAppPool";
            }

            // Persist the changes back to configuration
            serverManager.CommitChanges();
        }

    Now let's say I want to find the top 20 distinct URLs of all the requests running in all my worker processes that have taken more than 1 second:
        using (ServerManager serverManager = new ServerManager()) {

            var requests = (
                from wp in serverManager.WorkerProcesses
                from request in wp.GetRequests(1000)
                orderby request.TimeElapsed descending
                select request).Distinct().Take(20);

            foreach (Request request in requests) {
                Console.WriteLine("{0} - {1}ms", request.Url, request.TimeElapsed);
            }
        }
    OK, finally let's say I want to display a table of all the applications running under DefaultAppPool, showing whether Anonymous authentication is enabled for each. (This one is almost on the edge of "you should do it differently", but it is OK if you are only reading a single value from the section.)
        using (ServerManager serverManager = new ServerManager()) {

            var items = from site in serverManager.Sites
                        from app in site.Applications
                        where app.ApplicationPoolName.Equals("DefaultAppPool", StringComparison.OrdinalIgnoreCase)
                        orderby site.Name, app.Path
                        select new {
                            Site = site,
                            Application = app,
                            AnonymousEnabled = ((bool)app.GetWebConfiguration().GetSection("system.webServer/security/authentication/anonymousAuthentication")["enabled"])
                        };

            foreach (var item in items) {
                Console.WriteLine("Site:{0,-18} App:{1,-10} Anonymous Enabled:{2}",
                    item.Site.Name, item.Application.Path, item.AnonymousEnabled);
            }
        }
    As you can see, LINQ is an incredibly useful feature of C# 3.0, and in conjunction with Microsoft.Web.Administration it allows you to perform complex operations in IIS with just a few lines of code.

    Free SEO Analysis using IIS SEO Toolkit


    In my spare time I’ve been thinking about new ideas for the SEO Toolkit, and it occurred to me that rather than continuing to figure out more reports and better diagnostics against random fake sites, it could be interesting to ask openly for anyone who wants a free SEO analysis report of their site, and test-drive some of the ideas against real sites.

    • So what is in it for you: I will analyze your site looking for common SEO errors, create a digest of actions to take and other things (like generating a diagram of your site, layer information, etc.), and deliver it to you in the form of an email. If you agree, I will post some of the results (hiding identifying information like site, URL, etc., so that it remains anonymous if needed).
    • And what is in it for me: well, I will crawl your Web site (once or twice at most, with a limit set to a few hundred pages) using the SEO Toolkit, to test-drive some ideas and reporting features that I’m starting to build, and to continue investigating common patterns and errors.

    So if you want in, just post your URL in the comments of this blog (make sure you are reading this blog from a URL inside my site; otherwise you might be posting comments on some syndicating site). I will only take the first few sites (if successful I will start another batch in the future) and will process them one by one over the following days. Make sure to include a way to contact you, whether through the MSDN user infrastructure or an email address, so that I can send you the results.

    Alternatively, I will also take URLs via Twitter, so hurry up and let me know if you want me to look at your site.


    IIS SEO Toolkit - Start new analysis automatically through code


    One question that I've been asked several times is: "Is it possible to schedule the IIS SEO Toolkit to run automatically every night?". Other related questions are: "Can I automate the SEO Toolkit so that as part of my build process I'm able to catch regressions on my application?", or "Can I run it automatically after every check-in to my source control system to ensure no links are broken?", etc.

    The good news is that the answer is YES! The bad news is that you have to write a bit of code to make it work. Basically, the SEO Toolkit includes a managed-code API that can start an analysis just like the user interface does, and you can call it from any application you want using managed code.

    In this blog post I will show you how to write a simple console application that starts a new analysis against the site provided as a command-line argument and runs a few queries after it finishes.

    IIS SEO Crawling APIs

    The most important type included is a class called WebCrawler, which drives the entire analysis process. The following image shows this class and some of the related classes you will need to use.


    The WebCrawler class is initialized with the configuration specified in a CrawlerSettings instance. It exposes two methods, Start() and Stop(), which start and stop the crawling process, which runs in a set of background threads. Through its Report property you can access the CrawlerReport, which represents the results (whether completed or in progress) of the crawling process. CrawlerReport has a method called GetUrls() that returns all the UrlInfo items. UrlInfo is the most important class: it represents a URL that has been downloaded and processed, and holds all its metadata such as Title, Description, ContentLength, ContentType, and the set of Violations and Links it includes.

    Developing the Sample

    1. Start Visual Studio.
    2. Select the option "File->New Project"
    3. In the "New Project" dialog select the template "Console Application", enter the name "SEORunner" and press OK.
    4. Using the menu "Project->Add Reference" add a reference to the IIS SEO Toolkit Client assembly "c:\Program Files\Reference Assemblies\Microsoft\IIS\Microsoft.Web.Management.SEO.Client.dll".
    5. Replace the code in the file Program.cs with the code shown below.
    6. Build the Solution
    using System;
    using System.IO;
    using System.Linq;
    using System.Net;
    using System.Threading;
    using Microsoft.Web.Management.SEO.Crawler;

    namespace SEORunner {
        class Program {

            static void Main(string[] args) {
                if (args.Length != 1) {
                    Console.WriteLine("Please specify the URL.");
                    return;
                }

                // Create a URI class
                Uri startUrl = new Uri(args[0]);

                // Run the analysis
                CrawlerReport report = RunAnalysis(startUrl);

                // Run a few queries...
                LogSummary(report);
                LogStatusCodeSummary(report);
                LogBrokenLinks(report);
            }

            private static CrawlerReport RunAnalysis(Uri startUrl) {
                CrawlerSettings settings = new CrawlerSettings(startUrl);
                settings.ExternalLinkCriteria = ExternalLinkCriteria.SameFolderAndDeeper;
                // Generate a unique name
                settings.Name = startUrl.Host + " " +
                    DateTime.Now.ToString("yy-MM-dd hh-mm-ss");

                // Use the same directory as the default used by the UI
                string path = Path.Combine(
                    Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments),
                    "IIS SEO Reports");

                settings.DirectoryCache = Path.Combine(path, settings.Name);

                // Create a new crawler and start running
                WebCrawler crawler = new WebCrawler(settings);
                crawler.Start();

                Console.WriteLine("Processed - Remaining - Download Size");
                while (crawler.IsRunning) {
                    Thread.Sleep(1000);
                    Console.WriteLine("{0,9:N0} - {1,9:N0} - {2,9:N2} MB",
                        crawler.Report.GetUrlCount(),
                        crawler.RemainingUrls,
                        crawler.BytesDownloaded / 1048576.0f);
                }

                // Save the report
                crawler.Report.Save(path);

                Console.WriteLine("Crawling complete!!!");

                return crawler.Report;
            }

            private static void LogSummary(CrawlerReport report) {
                Console.WriteLine(" Overview");
                Console.WriteLine("Start URL:  {0}", report.Settings.StartUrl);
                Console.WriteLine("Start Time: {0}", report.Settings.StartTime);
                Console.WriteLine("End Time:   {0}", report.Settings.EndTime);
                Console.WriteLine("URLs:       {0}", report.GetUrlCount());
                Console.WriteLine("Links:      {0}", report.Settings.LinkCount);
                Console.WriteLine("Violations: {0}", report.Settings.ViolationCount);
            }

            private static void LogBrokenLinks(CrawlerReport report) {
                Console.WriteLine(" Broken links");
                foreach (var item in from url in report.GetUrls()
                                     where url.StatusCode == HttpStatusCode.NotFound &&
                                           !url.IsExternal
                                     orderby url.Url.AbsoluteUri ascending
                                     select url) {
                    Console.WriteLine(item.Url.AbsoluteUri);
                }
            }

            private static void LogStatusCodeSummary(CrawlerReport report) {
                Console.WriteLine(" Status Code summary");
                foreach (var item in from url in report.GetUrls()
                                     group url by url.StatusCode into g
                                     orderby g.Key
                                     select g) {
                    Console.WriteLine("{0,20} - {1,5:N0}", item.Key, item.Count());
                }
            }
        }
    }

    If you are not using Visual Studio, you can just save the code above in a file called SEORunner.cs and compile it from the command line:

    C:\Windows\Microsoft.NET\Framework\v3.5\csc.exe /r:"c:\Program Files\Reference Assemblies\Microsoft\IIS\Microsoft.Web.Management.SEO.Client.dll" /optimize+ SEORunner.cs


    After that you should be able to run SEORunner.exe, passing the URL of your site as an argument; you will see output like:

    Processed - Remaining - Download Size
           56 -       149 -      0.93 MB
          127 -       160 -      2.26 MB
          185 -       108 -      3.24 MB
          228 -        72 -      4.16 MB
          254 -        48 -      4.98 MB
          277 -        36 -      5.36 MB
          295 -        52 -      6.57 MB
          323 -        25 -      7.53 MB
          340 -         9 -      8.05 MB
          358 -         1 -      8.62 MB
          362 -         0 -      8.81 MB
    Crawling complete!!!
    Start URL:
    Start Time: 11/16/2009 12:16:04 AM
    End Time:   11/16/2009 12:16:15 AM
    URLs:       362
    Links:      3463
    Violations: 838
     Status Code summary
                      OK -   319
        MovedPermanently -    17
                   Found -    23
                NotFound -     2
     InternalServerError -     1
     Broken links


    The most interesting method above is RunAnalysis. It creates a new instance of CrawlerSettings and specifies the start URL. Note that it also specifies that every page hosted in the same directory or deeper should be considered internal. We also set a unique name for the report and use the same directory as the IIS SEO UI does, so that opening IIS Manager will show the reports just as if the UI had generated them. Finally we call Start(), which launches the number of worker threads specified in the WebCrawler.WorkerCount property, and we wait for the WebCrawler to finish by polling the IsRunning property.

    The remaining methods just use LINQ to run a few queries over the results, such as a report aggregating all the processed URLs by status code.


    As you can see, the IIS SEO Toolkit crawling APIs let you write your own application to run an analysis against your Web site, which can then be integrated with the Windows Task Scheduler, your own scripts, or your build system to enable continuous integration.
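    As a sketch of the Task Scheduler integration, a scheduled task that runs the crawler every night could be registered with the schtasks command; the path, task name, and URL below are placeholders, not values from this post:

    schtasks /Create /SC DAILY /ST 02:00 /TN "SEORunner" /TR "C:\Tools\SEORunner.exe http://www.example.com/"

    Once registered, the task runs the console application daily at 2:00 AM with the given URL as its argument.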

    Once the report is saved locally, it can be opened in IIS Manager for further analysis just like any other report. This sample console application can be scheduled using the Windows Task Scheduler so that it runs every night or at any time you choose. Note that you could also write a few lines of PowerShell to automate it entirely from the command line without writing any C# code, but that is left for another post.

  • CarlosAg Blog

    Winsxs is huge… Free up a few Gigabytes with dism


    I was running out of disk space on C: and was unable to install a small piece of software I needed, so I decided to clean up a bit. For that I like using WinDirStat, which very quickly shows you where the big files and folders are. In this case I found that my c:\Windows\winsxs folder was over 12 GB in size. One way to reclaim some of that disk space is to clean up the files that were backed up when a Service Pack was installed. To do that in Windows 7 you can run the following DISM command:

    dism /online /cleanup-image /spsuperseded /hidesp

    That freed up 4 GB on my machine and now I can move on.

    Disclaimer: I only ran this on my Windows 7 machine, where it worked great; I have not tried it on Server SKUs, so run it at your own risk.

  • CarlosAg Blog

    SEO made easy with IIS URL Rewrite 2.0 SEO templates


    A few weeks ago my team released version 2.0 of URL Rewrite for IIS. URL Rewrite is probably the most powerful rewrite engine for Web applications. It gives you many features, including inbound rewriting (i.e. rewriting the URL, redirecting to another URL, aborting requests, using maps, and more), and version 2.0 also includes outbound rewriting, so that you can rewrite URLs or any markup as the content is being sent back, even if it is generated using PHP, ASP.NET, or any other technology.

    It also includes a very powerful user interface that lets you test your regular expressions, and even better, it includes a set of templates for common types of rules. Some of those rules are incredibly valuable for SEO (Search Engine Optimization) purposes. The SEO rules are:

    1. Enforce Lowercase URLs. Makes sure that every URL is requested only in lower case; if not, it redirects with a 301 to the lower-case version.
    2. Enforce a Canonical Domain Name. Lets you specify which domain name you want to use for your site and redirects traffic to the right host name.
    3. Append or Remove the Trailing Slash. Makes sure your requests either include or omit the trailing slash, depending on your preference.
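    As an illustration of the first template, the Enforce Lowercase URLs rule generates web.config markup along these lines (a sketch of what the template emits; the exact rule name and attributes in your configuration may differ):

    <rule name="LowerCaseRule1" stopProcessing="true">
        <match url="[A-Z]" ignoreCase="false" />
        <action type="Redirect" url="{ToLower:{URL}}" />
    </rule>

    Any request whose path contains an upper-case character matches the rule and gets a redirect to the lower-cased version of the same URL.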


    For more information on the SEO Templates look at:

    What is really cool is that you can run the SEO Toolkit against your application, and you will probably get some violations around lower-case URLs, canonical domains, etc. After seeing those, you can use URL Rewrite 2.0 to fix them with one click.

    I have personally used it on my Web site; try the following three URLs and all of them will be redirected to the canonical form, and you will see URL Rewrite in action:


    Note that in the end those templates just translate to web.config settings that become part of your application and can be XCOPY-deployed with it. This works with ASP.NET, PHP, or any other server technology, including static files. Below is the output of the Canonical Host Name rule, which I use in my Web site's web.config.

    <?xml version="1.0" encoding="UTF-8"?>
    <rule name="CanonicalHostNameRule1">
        <match url="(.*)" />
        <conditions>
            <add input="{HTTP_HOST}" pattern="^www\.carlosag\.net$" negate="true" />
        </conditions>
        <action type="Redirect" url="http://www.carlosag.net/{R:1}" />
    </rule>

    There are many more features I could talk about, but for now this was just a quick SEO-related post.

  • CarlosAg Blog

    IIS Reports for IIS Manager 7


    I have just uploaded a new application that extends IIS Manager 7 in Windows Vista and Windows Longhorn Server with a new Reports option that gives you several reports of server and site activity. Its features include:

    • Reports are scoped to the object selected in IIS Manager: if the server is selected you get a server report including all sites' information; if a site is selected you only get information related to that specific site
    • Export to HTML
    • Printing support
    • Different chart options: Pie, Columns and Lines
    • Built-in Reports:
      • Status Code, Hits Per Url, Hits by Hour, User Agent, File Extensions
      • Users, Time Taken, Win32 Errors, Client Machine, Http Method

    Click Here to go to the Download Page

    IIS Reports

    I'm working on a second version that will allow you to create your own queries and configure more options, like chart settings and more.

    If you have any suggestions on reports that would be useful feel free to add them as comment to the post.
