Posts
  • CarlosAg Blog

    Microsoft.Web.Administration in IIS 7

    • 58 Comments
While creating the new administration stack in IIS 7, we looked into the different ways users could manipulate the server configuration as well as the new runtime information available in IIS 7 (internally we call this RSCA, the Runtime State and Control API) from managed code, and we realized we needed to provide a simpler, more straightforward API that developers could consume. Microsoft.Web.Administration is the answer to this problem. This API is designed to be simple to code against in an “intellisense-driven” sort of way. At the root level a class called ServerManager exposes all the functionality you will need.

    To show the power and simplicity of this API, let’s look at some samples below. To try these samples, just create a new Console Application in Visual Studio and add a reference to Microsoft.Web.Administration.dll, which can be found in the IIS directory (%WinDir%\System32\InetSrv).
    Please note that the following code is based on Windows Vista Beta 2 code and will likely change for the release candidate versions of Windows Vista since we have planned several enhancements to simplify the API and expose more features into it.
    The following picture shows the main objects (excluding configuration-related classes).

    (Diagram: Microsoft.Web.Administration object model)
    Creating a Site
     
    ServerManager iisManager = new ServerManager();
    iisManager.Sites.Add("NewSite", "http", "*:8080:", "d:\\MySite");
    iisManager.Update();
    This basically creates an instance of the ServerManager class and uses the Add method in the Sites collection to create a new site named "NewSite", listening on port 8080 using the http protocol, with its content files at d:\MySite.
    One thing to note is that calling Update is a requirement, since that is the moment when we persist the changes to the configuration store.
    After running this code you now have a site that you can browse using http://localhost:8080

    Adding an Application to a site
    ServerManager iisManager = new ServerManager();
    iisManager.Sites["NewSite"].Applications.Add("/Sales", "d:\\MyApp");
    iisManager.Update();


    This sample uses the Sites collection indexer to get the NewSite site and uses the Applications collection to add a new application at http://localhost:8080/Sales.

    Creating a Virtual Directory
    ServerManager iisManager = new ServerManager();
    Application app = iisManager.Sites["NewSite"].Applications["/Sales"];
    app.VirtualDirectories.Add("/VDir", "d:\\MyVDir");
    iisManager.Update();


    Runtime State and Control

    Now, moving on to the new runtime state and control information, these objects also expose information about their current state as well as the ability to modify it. For example, we expose the list of W3WP processes running (worker processes) and, what I think is really cool, we even expose the list of requests currently executing.

    Stopping a Web Site
    ServerManager iisManager = new ServerManager();
    iisManager.Sites["NewSite"].Stop();

    Recycling an Application Pool
    ServerManager iisManager = new ServerManager();
    iisManager.ApplicationPools["DefaultAppPool"].Recycle();

    Getting the list of executing requests
    ServerManager iisManager = new ServerManager();
    foreach (WorkerProcess w3wp in iisManager.WorkerProcesses) {
        Console.WriteLine("W3WP ({0})", w3wp.ProcessId);

        foreach (Request request in w3wp.GetRequests(0)) {
            Console.WriteLine("{0} - {1},{2},{3}",
                        request.Url,
                        request.ClientIPAddr,
                        request.TimeElapsed,
                        request.TimeInState);
        }
    }
    Another big thing in this API is the ability to edit the “.config” files using a simple API; this includes the ability to modify the main applicationHost.config file from IIS, web.config files from ASP.NET, as well as machine.config and other config files (such as administration.config). However, I will talk about those in a future post.
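    Just to give a flavor of what that looks like, here is a minimal sketch of editing a configuration section with ServerManager. The method names below (GetApplicationHostConfiguration, GetSection, CommitChanges) follow the shape of the API as it later shipped and are assumptions relative to the Beta 2 build this post is based on, so treat this as an illustration rather than exact Beta 2 code:

    ServerManager iisManager = new ServerManager();

    // Open the server-level configuration (applicationHost.config)
    Configuration config = iisManager.GetApplicationHostConfiguration();

    // Get the <defaultDocument> section scoped to a site and flip an attribute
    ConfigurationSection defaultDocument =
        config.GetSection("system.webServer/defaultDocument", "NewSite");
    defaultDocument["enabled"] = true;

    // Persist the changes to the configuration store
    iisManager.CommitChanges();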
  • CarlosAg Blog

    Generating Excel Workbooks without Excel

    • 53 Comments

    Why I wrote Excel Xml Writer
    One day I found myself having to build a Web Application where one of the requirements involved generating a nice Excel Workbook that users could then play with. This required quite a bit of styling and several other features that you can only get with Excel (like setting printer options and document properties).
     
    Well, coming from consulting, this requirement was not strange to me at all; I had dealt with this problem many times years ago. Back then it was a different story: managed code and XML did not exist yet, and COM was the only option I had. The only solution was using Excel Automation to build Workbooks that supported all the features I required. Yes, I know HTML could do the trick: just generate a table and set the content type so it opens in Excel, but that certainly leaves you lacking control over several Excel features like document properties (author, custom properties, etc.), printer settings and more.
     
    Excel Automation
    If you have ever worked with Excel Automation you know that it is an extremely powerful (and complicated) object model. However, this power does not come for free: every time you create an Excel.Application, you are essentially running a new Excel.exe instance, which is anything but cheap, and for that reason (and many more) you certainly do not want to do that in a Web Application where thousands of users might click the nice “Export to Excel” link and you end up with thousands of processes being created and destroyed.
     
    Just to illustrate my point, I created the following sample.
     
    C#
    using System;
    using Excel = Microsoft.Office.Interop.Excel;
    using System.Runtime.InteropServices;
    using Missing = System.Reflection.Missing;

    static class Program {

        static void Main() {
            int tick = Environment.TickCount;
            // Create the Excel Application
            Excel.Application excel = new Excel.Application();

            try {
                // make it visible for demonstration purposes
                excel.Visible = true;

                // Add a Workbook
                Excel.Workbook workbook = excel.Workbooks.Add(Missing.Value);

                // Set the author
                workbook.Author = "CarlosAg";

                // Create a Style
                Excel.Style style = workbook.Styles.Add("Style1", Missing.Value);
                style.Font.Bold = true;

                // Add a new Worksheet
                Excel.Worksheet sheet =
                    (Excel.Worksheet)workbook.Worksheets.Add(Missing.Value, Missing.Value, Missing.Value, Missing.Value);

                // Set some text to a cell
                Excel.Range range = ((Excel.Range)sheet.Cells[1, 1]);
                range.Style = style;
                range.Value2 = "Hello World";

                // Save the Workbook
                workbook.SaveAs(@"c:\test.xls", Missing.Value, Missing.Value, Missing.Value,
                                Missing.Value, Missing.Value, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlExclusive,
                                Missing.Value, Missing.Value, Missing.Value, Missing.Value, Missing.Value);

                // Finally close the Workbook
                workbook.Close(false, Missing.Value, Missing.Value);

                // Close Excel
                excel.Quit();

            }
            finally {
                // Make sure we release the reference to the underlying COM object
                Marshal.ReleaseComObject(excel);
            }

            Console.WriteLine("Time: {0}", Environment.TickCount - tick);
        }
    }

    Note: If you want to run the application you need to add a reference to the Microsoft Excel COM library.
     
    Well, I ran this really simple application cold on my machine, which has Office 2003, and it took almost 3000 milliseconds. Of course, if you run it again it runs in about 1 second. But this solution will simply not scale in a Web Application.
     
    Another big problem with this approach is the code itself; if you take a close look, I had to type almost 20 references to Missing.Value.
     
    Solution
    Luckily, ever since Office XP, Excel supports a file format called XML Workbook (or something like that). This allows you to create an XML document that follows a certain schema, and Excel will treat it as if it were the binary XLS format (though not all features are supported, like Charts).
     
    Now I had new options: I could just generate the XML using an XmlDocument, or even better an XmlWriter; but doing so is quite cumbersome, since you need to understand a lot about the XML, schemas and namespaces involved, and it is quite probable that you will mess something up, like closing an element incorrectly or adding the wrong namespace or prefix, etc.
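    To give an idea of what you would be hand-writing, here is roughly the shape of a minimal workbook in that XML format (namespaces and details are from memory, so treat it as an illustration of the schema rather than a complete, verified document):

    <?xml version="1.0"?>
    <?mso-application progid="Excel.Sheet"?>
    <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
              xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
      <Worksheet ss:Name="Sheet1">
        <Table>
          <Row>
            <Cell><Data ss:Type="String">Hello World</Data></Cell>
          </Row>
        </Table>
      </Worksheet>
    </Workbook>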
     
    For that reason I decided to build a lightweight, fast wrapper over the Excel XML Workbook schema. This way my application manipulates an object model that looks similar to the Excel Automation object model, but it is lightweight, 100% managed, and in the end serializes itself into XML using an XmlWriter.
     
    This is exactly what Excel Xml Writer is: a simple object model that generates XML following the Excel XML Workbook schema. When I was almost done with it, I thought I could add the ability to load the XML as well, so I added that feature. This turned out to be extremely useful when loading Excel worksheets from the Office Web Components; it is a really cool usage, since you can embed workbooks in your page and then use AJAX-like techniques to post the XMLData property back and load it on the server side to do the actual processing against your database, etc.
     
    You can download it for free at http://www.carlosag.net/Tools/ExcelXmlWriter/Default.aspx.
     
    Now, the code to generate the same workbook we just built, using my library, looks like this:
     
     
    using System;
    using CarlosAg.ExcelXmlWriter;

    class Program {
        static void Main(string[] args) {
            int ticks = Environment.TickCount;

            // Create the workbook
            Workbook book = new Workbook();
            // Set the author
            book.Properties.Author = "CarlosAg";

            // Add some style
            WorksheetStyle style = book.Styles.Add("style1");
            style.Font.Bold = true;

            Worksheet sheet = book.Worksheets.Add("SampleSheet");

            WorksheetRow Row0 = sheet.Table.Rows.Add();
            // Add a cell
            Row0.Cells.Add("Hello World", DataType.String, "style1");

            // Save it
            book.Save(@"c:\test.xls");

            Console.WriteLine("Time:{0}", Environment.TickCount - ticks);
        }
    }


    Several differences:
    1) You don’t actually need Excel installed on your server to run this program, since it does not use Excel at all, just XML.
    2) The working set of your application is way smaller than when using the Interop libraries.
    3) This is more than 100 times faster to run.
    4) The code looks much simpler.
    5) Since it generates XML, you can actually stream it directly to Response.OutputStream in an ASP.NET application without ever saving it to the file system (see the sketch after this list).
    6) This solution will scale to thousands of users since it does not require creating any processes.
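    As a rough sketch of point 5, an ASP.NET handler could write the workbook straight to the response. This assumes Workbook.Save has an overload that accepts a Stream; if your version only exposes the file-based Save shown above, a MemoryStream or temporary file gets you the same result:

    using System;
    using System.Web;
    using CarlosAg.ExcelXmlWriter;

    public class ExportHandler : IHttpHandler {
        public void ProcessRequest(HttpContext context) {
            // Build a tiny workbook exactly as in the console sample above
            Workbook book = new Workbook();
            WorksheetStyle style = book.Styles.Add("style1");
            style.Font.Bold = true;
            Worksheet sheet = book.Worksheets.Add("Report");
            WorksheetRow row = sheet.Table.Rows.Add();
            row.Cells.Add("Hello World", DataType.String, "style1");

            // Tell the browser this is an Excel document and suggest a file name
            context.Response.ContentType = "application/vnd.ms-excel";
            context.Response.AddHeader("Content-Disposition", "attachment; filename=report.xls");

            // Assumed: a Save overload that accepts a Stream; if only Save(string)
            // exists, write to a MemoryStream or temp file and copy it out instead
            book.Save(context.Response.OutputStream);
        }

        public bool IsReusable {
            get { return true; }
        }
    }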
     
    Now, even better, I decided to write a code generator tool so that you don’t need to write all the styling code and superficial stuff and can just focus on the actual data. This tool allows you to open an Excel XML Workbook that you have created in Excel, and it will generate the C# or VB.NET code that reproduces it. This means that you can create the skeleton with all the formatting options in Excel and then just generate the code.
     
     
    Conclusion
    Don’t get me wrong, the Excel team did an awesome job with the Automation support for all Office products; however, it has been around for several years and it definitely lacks support for Web-based applications (aside from Office Web Components). Luckily they are addressing this in the next release of Office, which will have some awesome support for server-side applications and many other exciting things.
     
    In the meantime, you might find working with XML in Office really exciting, since the support is already great and it will only get better with time.
  • CarlosAg Blog

    IIS Admin Pack Technical Preview 1 Released

    • 34 Comments

    NOTE: IIS Admin Pack Technical Preview 2 has been released: http://blogs.msdn.com/carlosag/archive/2008/05/13/IISAdminPackTechnicalPreview2Released.aspx

    I'm really excited to announce that today we released the Technical Preview of the IIS Admin Pack, and it includes 7 new features for IIS Manager that will help you in a bunch of different scenarios.

    Download

    You can download the IIS 7.0 Admin Pack Technical Preview from the links below (it requires less than 1MB):

    (x86) http://www.iis.net/downloads/default.aspx?tabid=34&g=6&i=1646
    (x64) http://www.iis.net/downloads/default.aspx?tabid=34&g=6&i=1647

    Documentation

    http://learn.iis.net/page.aspx/401/using-the-administration-pack/

    These UI modules include the following features:

    • Request Filtering UI - This UI exposes the configuration of the IIS runtime feature called Request Filtering.
    • Configuration Editor UI - This UI provides an advanced generic configuration editor entirely driven by our configuration schema. It includes things like Script Generation, Search functionality, advanced information such as locking and much more.
    • Database Manager UI - This UI allows you to manage SQL Server databases from within IIS Manager, including the ability to create tables, execute queries, add indexes, primary keys, query data, insert rows, delete rows, and much more.
    • IIS Reports UI - This extensible platform exposes a set of reports including some log parser based reports, displaying things like Top URL's, Hits per User, Page Performance, and many more.
    • FastCGI UI - This UI exposes the configuration for the FastCGI runtime feature.
    • ASP.NET Authorization UI - This UI allows you to configure the ASP.NET authorization settings.
    • ASP.NET Custom Errors UI - This UI allows you to configure the Custom errors functionality of ASP.NET

    Please help us: we want to ask for your help in trying them out and giving us feedback on all these modules. Do they work for you? What would you change? What would you add? What features are we missing?

    Some things to think about:

    Database Manager: what other database features are critical for you to build applications?

    IIS Reports: what reports would you find useful? Would you want Configuration-based reports (such as a summary of the Sites and their configuration)? More Security reports?

    Configuration Editor: is it easy to use? What concepts from configuration would you like to see? Etc.

    Given that each individual feature above includes a lot of interesting capabilities that can easily be missed, or might be confusing, I will be blogging in the near future about why we decided to build each feature, what makes them different from anything else you've seen, and how you can make the most out of each of them.


    Carlos

  • CarlosAg Blog

    IIS 7.0 and URL Rewrite, make your Web Site SEO

    • 29 Comments

    In the past few days I've been reading a bit about SEO, trying to understand what makes a Web Site SEO (Search-Engine-Optimized), what some of the typical headaches are when trying to achieve that, and how we can address them in IIS.

    Today I decided to post how you can make your Web Site running IIS 7.0 a bit "friendlier" to Search Engines without having to modify any code in your application. Being SEO is a big statement since it can include several things, so for now I will scope the discussion to 3 things that can be easily addressed using the IIS URL Rewrite Module:

    1. Canonicalization
    2. Friendly URL's
    3. Site Reorganization

    1) Canonicalization

    Basically the goal of canonicalization is to ensure that the content of a page is only exposed at a unique URI. The reason this is important is that even though for humans it's easy to tell that http://www.carlosag.net is the same as http://carlosag.net, many search engines will not make any assumptions and will keep them as two separate entries, potentially splitting their rankings and lowering their relevance. Another example of this is http://www.carlosag.net/default.aspx and http://www.carlosag.net/. You can certainly minimize the impact of this by writing your application using the canonical forms of your links, for example always linking to http://www.carlosag.net/tools/webchart/ and dropping the default.aspx; however, that only accounts for part of the equation, since you cannot assume everyone referencing your Web Site will follow this carefully: you cannot control their links.

    This is when URL Rewrite comes into play and truly solves this problem.

    Host name.

    URL Rewrite can help you redirect when users type your URL in a way you don't necessarily want them to, for example just carlosag.net. Choosing between using WWW or not is a matter of taste, but once you choose one you should ensure that you guide everyone to the right one. The following rule will automatically redirect everyone using just carlosag.net to www.carlosag.net. This configuration can be saved in the Web.config file in the root of your Web Site. Note that I'm only including the XML in this blog; however, I used IIS Manager to generate all of these settings, so you don't need to memorize the XML schema since the UI includes several friendly capabilities to generate all of this.

    <configuration>
      <system.webServer>
        <rewrite>
          <rules>
            <rule name="Redirect to WWW" stopProcessing="true">
              <match url=".*" />
              <conditions>
                <add input="{HTTP_HOST}" pattern="^carlosag.net$" />
              </conditions>
              <action type="Redirect" url="http://www.carlosag.net/{R:0}" redirectType="Permanent" />
            </rule>
          </rules>
        </rewrite>
      </system.webServer>
    </configuration>

    Note that one important thing is to use Permanent redirects (301). This ensures that if anybody links to your page using a non-WWW link, when the search engine bot crawls their Web Site it will identify the link as permanently moved, treat the new URL as the correct address, and not index the old URL, which is what happens when using Temporary (302) redirects. The following shows what the response from the server looks like:

    HTTP/1.1 301 Moved Permanently
    Content-Type: text/html; charset=UTF-8
    Location: http://www.carlosag.net/tools/
    Server: Microsoft-IIS/7.0
    X-Powered-By: ASP.NET
    Date: Mon, 01 Sep 2008 22:45:49 GMT
    Content-Length: 155

    <head><title>Document Moved</title></head>
    <body><h1>Object Moved</h1>This document may be found <a HREF=http://www.carlosag.net/tools/>here</a></body>

    Default Documents

    IIS has a feature called Default Document that allows you to specify the content that should be processed when a user enters a URL that is mapped to a directory and not an actual file. In other words, if the user enters http://www.carlosag.net/tools/ then they will actually get the content as if they had entered http://www.carlosag.net/tools/default.aspx. That is all great; the problem is that this feature only works one way, mapping a Directory to a File, and does not map the File back to the Directory. This means that if some of your links or other users use the full URL, then search engines will see two different URLs. To solve that problem we can use a configuration very similar to the rule above; the following rule will redirect default.aspx to the canonical URL (the folder).

            <rule name="Default Document" stopProcessing="true">
             
    <match url="(.*)default.aspx" />
              <
    action type="Redirect" url="{R:1}" redirectType="Permanent" />
            </
    rule>

    This again uses a Permanent redirect to extract everything before Default.aspx and redirect it to the "parent" URL path. For example, if the user enters http://www.carlosag.net/Tools/WindowsLiveWriter/default.aspx it will be redirected to http://www.carlosag.net/Tools/WindowsLiveWriter/, and http://www.carlosag.net/Tools/default.aspx to http://www.carlosag.net/Tools/. You can place this rule at the root of your site and it will take care of all the default documents (if you have a default.aspx in every folder).

    2) Friendly URL's

    Asking your user to remember that www.contoso.com/books.aspx?isbn=0735624410 is the URL for the IIS Resource Kit is not the nicest thing to do. First of all, why should they care that this is an ASPX page that takes arguments and whatnot? Providing them with a URL like www.contoso.com/books/IISResourceKit will truly resonate with them and be easier for them to remember and pass along. Most importantly, it doesn't tie you to any particular Web technology.

    With URL Rewrite you can easily build this kind of logic automatically without having to modify your code using Rewrite Maps:

    <configuration>
      <system.webServer>
        <rewrite>
          <rules>
            <rule name="Rewrite for Books" stopProcessing="true">
              <match url="Books/(.+)" />
              <action type="Rewrite" url="books.aspx?isbn={Books:{R:1}}" />
            </rule>
          </rules>
          <rewriteMaps>
            <rewriteMap name="Books">
              <add key="IISResourceKit" value="0735624410" />
              <add key="ProfessionalIIS7" value="0470097825" />
              <add key="IIS7AdministratorsPocketConsultant" value="0735623643" />
              <add key="IIS7ImplementationandAdministration" value="0470178930" />
            </rewriteMap>
          </rewriteMaps>
        </rewrite>
      </system.webServer>
    </configuration>

    The configuration above includes a rule that uses a Rewrite Map to translate a URL like http://www.contoso.com/books/IISResourceKit into http://www.contoso.com/books.aspx?isbn=0735624410 automatically. Using maps is a very convenient way to have a "table" of values that can be transformed into any other value to be used in the resulting URL. Of course, there are better ways of doing this when using large catalogs or values that change frequently, but it is extremely useful when you have a consistent set of values or when you can't make changes to an existing application. Note that since we use Rewrite, end users never see the "ugly URL" unless they knew it already and typed it, and of course this means you can use the inverse approach to ensure the canonicalization is preserved:

        <rewrite>
          <rules>
            <rule name="Redirect Books to Canonical URL" stopProcessing="true">
              <match url="books\.aspx" />
              <action type="Redirect" url="Books/{ISBN:{C:1}}" appendQueryString="false" />
              <conditions>
                <add input="{QUERY_STRING}" pattern="isbn=(.+)" />
              </conditions>
            </rule>
          </rules>
          <rewriteMaps>
            <rewriteMap name="ISBN">
              <add key="0735624410" value="IISResourceKit" />
              <add key="0470097825" value="ProfessionalIIS7" />
              <add key="0735623643" value="IIS7AdministratorsPocketConsultant" />
              <add key="0470178930" value="IIS7ImplementationandAdministration" />
            </rewriteMap>
          </rewriteMaps>
        </rewrite>

    The rule above does the "inverse" by matching the URL books.aspx, extracting the ISBN query string value, doing a lookup in the ISBN table, and redirecting the client to the canonical URL. So again, if the user enters http://www.contoso.com/books.aspx?isbn=0735624410 they will be redirected to http://www.contoso.com/books/IISResourceKit.

    Friendly URLs, to me, are more of a user feature than an SEO feature. Every SEO guide I've read says to reduce the number of parameters in your Query String; however, I have not yet found any document that clearly states whether there is truly a limit in the search engine bots that would impact search relevance. I guess it makes sense that they wouldn't keep track of thousands of links to a catalog.aspx that has zillions of permutations based on hundreds of values in the query string (category, department, price range, etc.) even if all of them were linked, but again I don't have any proof.

    3) Site Reorganization

    One complex task that Web Developers sometimes face is reorganizing their current Web Site structure, whether it's moving a section to a different path or something as simple as renaming a single file. You need to take into consideration things like: Is this move temporary? How do I ensure old clients get the new URL? How do I prevent losing search engine relevance? URL Rewrite will help you perform these tasks.

    Rename a file

    If you rename a file you can very easily write a Rewrite or Redirect rule that ensures your users continue getting the content. If your intent is to never go back to the old name you should use a Permanent redirect, so everyone starts getting the new content at its new "canonical URL"; if this could be a temporary thing, you should use a Temporary redirect. Finally, a Rewrite is useful if you still want both URLs to continue to be valid (though this breaks canonicalization).

          <rule name="Rename File.php to MyFile.aspx" stopProcessing="true">
             
    <match url="File\.php" />
              <
    action type="Redirect" url="MyFile.aspx" redirectType="Permanent" />
          </
    rule>

    Moving directories

    Another common scenario is when you need to move an entire directory to another place in the Web Site. It could also be that, based on some criteria (say Mobile browsers or another User Agent), clients should get a different set of pages/images. Either way, URL Rewrite helps with this. The following configuration will redirect every call to the /Images directory to the /NewImages directory.

          <rule name="Move Images to NewImages" stopProcessing="true">
             
    <match url="^images/(.*)" />
              <
    action type="Redirect" url="NewImages/{R:1}" redirectType="Permanent" />
          </
    rule>

    A related scenario is if you wanted to show different, smaller images whenever a Windows CE user accesses your site; you could have an "img" directory where all the small images are stored and use a rule like the following:

            <rule name="Use Small Images for Windows CE" stopProcessing="true">
             
    <match url="^images/(.*)" />
              <
    action type="Rewrite" url="/img/{R:1}" />
              <
    conditions>
               
    <add input="{HTTP_USER_AGENT}" pattern="Windows CE" ignoreCase="false" />
              </
    conditions>
           
    </rule>

    Note that in this case the use of Rewrite makes sense, since we want the small images to look like the original images to the browser, and it saves a redirect round-trip.

    Moving multiple files

    Another common operation is when you need to relocate pages for whatever reason (such as marketing campaigns, branding, etc.). In this case, if you have several files that have been moved or renamed, you can have a single rule that catches all of them and redirects them accordingly. Another example is an incremental migration from one technology to another, where, say, you are moving from Classic ASP to ASP.NET and, as you rewrite some of the old ASP pages into ASPX pages, you want to start serving them without breaking any links or the search engine relevance.

        <rewrite>
          <rules>
            <rule name="Redirect Old Files and Broken Links" stopProcessing="true">
              <match url=".*" />
              <conditions>
                <add input="{OldFiles:{REQUEST_URI}}" pattern="(.+)" />
              </conditions>
              <action type="Redirect" url="{C:0}" />
            </rule>
          </rules>
          <rewriteMaps>
            <rewriteMap name="OldFiles">
              <add key="/tools/WebChart/sample.asp" value="tools/WebChart/sample.aspx" />
              <add key="/tools/default.asp" value="tools/" />
              <add key="/images/brokenlink.jpg" value="/images/brokenlink.png" />
            </rewriteMap>
          </rewriteMaps>
        </rewrite>

    Now, you can just keep adding to this table any broken link and specify its new address.

    Others

    Another potential use of URL Rewrite is with RIA applications in the browser, whether using AJAX, Silverlight or Flash, which are not easy to parse and index by search engines. You could use URL Rewrite to rewrite the URL to static HTML versions of your content; however, you should make sure that the content is consistent so you don't misguide users and search engines. For example, the following rule will rewrite all the files in the RIAFiles table to their static HTML counterpart, but only if the User Agent is the MSNBot or the GoogleBot:

        <rewrite>
          <rules>
            <rule name="Rewrite RIA Files" stopProcessing="true">
              <match url=".*" />
              <conditions>
                <add input="{HTTP_USER_AGENT}" pattern="MSNBot|Googlebot" />
                <add input="{RIAFiles:{REQUEST_URI}}" pattern="(.+)" />
              </conditions>
              <action type="Rewrite" url="{C:0}" />
            </rule>
          </rules>
          <rewriteMaps>
            <rewriteMap name="RIAFiles">
              <add key="/samples/Silverlight.aspx" value="/samples/Silverlight.htm" />
              <add key="/samples/MyAjax.aspx" value="/samples/MyAjax.htm" />
            </rewriteMap>
          </rewriteMaps>
        </rewrite>

    Related to this, you might want to prevent search engines from crawling certain files (or your entire site). For that you can use the Robots.txt semantics and a "Disallow" entry; however, you can also use URL Rewrite to do this with more functionality, such as blocking only a specific user agent:

        <rewrite>
          <rules>
            <rule name="Prevent access to files" stopProcessing="true">
              <match url=".*" />
              <conditions>
                <add input="{HTTP_USER_AGENT}" pattern="SomeRandomBot" />
                <add input="{NonIndexedFiles:{REQUEST_URI}}" pattern="(.+)" />
              </conditions>
              <action type="AbortRequest" />
            </rule>
          </rules>
          <rewriteMaps>
            <rewriteMap name="NonIndexedFiles">
              <add key="/profile.aspx" value="block" />
              <add key="/personal.aspx" value="block" />
            </rewriteMap>
          </rewriteMaps>
        </rewrite>

    There are several other things you can do to ensure that your Web Site is friendly to Search Engines. Most of them require changes to your application, but they are certainly worth the effort, for example:

    • Ensure your HTML includes a <title> tag.
    • Ensure your HTML includes a <meta name="description"> tag.
    • Use the correct HTML semantics: use H1 once and only once, use the alt attribute in your <img>, use <noscript>, etc.
    • Redirect using status code 301 and not 302.
    • Provide Site Map's and/or Robots.txt.
    • Beware of POST backs and links that require script to run. 

    Resources

    For this entry I read and used resources from several Web Sites.

  • CarlosAg Blog

    IIS SEO Toolkit – Crawler Module Extensibility

    • 27 Comments

     

    Sample SEO Toolkit CrawlerModule Extensibility

    In this blog post we are going to write an example of how to extend the SEO Toolkit functionality. For that, let's pretend our company has a large Web site that includes several images, and we are interested in making sure all of them comply with a certain standard: let's say all of them should be smaller than 1024x768 pixels and the quality of the images should be no less than 16 bits per pixel. Additionally, we would also like to be able to make custom queries that later allow us to further analyze the contents of the images, filter based on directories, and more.

    For this we will extend the SEO Toolkit crawling process to perform additional processing for images, adding the following new capabilities:

    1. Capture additional information from the Content. In this case we will capture information about the image; in particular we will extend the report to add an "Image Width", an "Image Height" and an "Image Pixel Format" field.
    2. Flag additional violations. In this example we will flag three new violations:
      1. Image is too large. This violation will be flagged any time the content length of the image is larger than the "Maximum Download Size per URL" configured at the start of the analysis. It will also flag this violation if the resolution is larger than 1024x768.
      2. Image pixel format is too small. This violation will be flagged if the image is 8 or 4 bits per pixel.
      3. Image has a small resolution. This will be flagged if the image resolution per inch is less than 72dpi.

    Enter CrawlerModule

    A crawler module is a class that extends the crawling process in Site Analysis to provide custom functionality while processing each URL. By deriving from this class you can easily raise your own set of violations or add your own data and links to any URL.

    public abstract class CrawlerModule : IDisposable
    {
        // Methods
        public virtual void BeginAnalysis();
        public virtual void EndAnalysis(bool cancelled);
        public abstract void Process(CrawlerProcessContext context);

        // Properties
        protected WebCrawler Crawler { get; }
        protected CrawlerSettings Settings { get; }
    }

    It includes three main methods:

    1. BeginAnalysis. This method is invoked once at the beginning of the crawling process and allows you to perform any initialization needed. Common tasks include registering custom properties in the Report that can be accessed through the Crawler property.
    2. Process. This method is invoked for each URL once its contents have been downloaded. The context argument includes a property UrlInfo that provides all the metadata extracted for the URL. It also includes a list of Violations and Links in the URL. Common tasks include augmenting the metadata of the URL, whether using its contents or external systems, flagging new custom Violations, or discovering new links in the contents.
    3. EndAnalysis. This method is invoked once at the end of the crawling process and allows you to do any final calculations on the report once all the URLs have been processed. Common tasks in this method include performing aggregations of data across all the URLs, or identifying violations that depend on all the data being available (such as finding duplicates).

    Coding the Image Crawler Module

    Create a Class Library in Visual Studio and add the code shown below.

    1. Open Visual Studio and select the option File->New Project
    2. In the New Project dialog select the Class Library project template and specify a name and a location such as "SampleCrawlerModule"
    3. Using the Menu "Project->Add Reference", add a reference to the IIS SEO Toolkit client library (C:\Program Files\Reference Assemblies\Microsoft\IIS\Microsoft.Web.Management.SEO.Client.dll).
    4. Since we are going to be registering this through the IIS Manager extensibility, add a reference to the IIS Manager extensibility DLL (c:\windows\system32\inetsrv\Microsoft.Web.Management.dll) using the "Project->Add Reference" menu.
    5. Also, since we will be using the .NET Bitmap class you need to add a reference to "System.Drawing" using the "Project->Add Reference" menu.
    6. Delete the auto-generated Class1.cs since we will not be using it.
    7. Using the menu "Project->Add New Item", add a new class named "ImageExtension":
    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using Microsoft.Web.Management.SEO.Crawler;

    namespace SampleCrawlerModule {

        /// <summary>
        /// Extension to add validation and metadata to images while crawling
        /// </summary>
        internal class ImageExtension : CrawlerModule {
            private const string ImageWidthField = "iWidth";
            private const string ImageHeightField = "iHeight";
            private const string ImagePixelFormatField = "iPixFmt";

            public override void BeginAnalysis() {
                // Register the properties we want to augment at the beginning of the analysis
                Crawler.Report.RegisterProperty(ImageWidthField, "Image Width", typeof(int));
                Crawler.Report.RegisterProperty(ImageHeightField, "Image Height", typeof(int));
                Crawler.Report.RegisterProperty(ImagePixelFormatField, "Image Pixel Format", typeof(string));
            }

            public override void Process(CrawlerProcessContext context) {
                // Make sure we only process the Content Types we need to
                switch (context.UrlInfo.ContentTypeNormalized) {
                    case "image/jpeg":
                    case "image/png":
                    case "image/gif":
                    case "image/bmp":
                        // Process only known content types
                        break;
                    default:
                        // Ignore any other
                        return;
                }

                //--------------------------------------------
                // If the content length of the image was larger than the max
                //   allowed to download, then flag a violation, and stop
                if (context.UrlInfo.ContentLength >
                    Crawler.Settings.MaxContentLength) {
                    Violations.AddImageTooLargeViolation(context,
                        "It is larger than the allowed download size");
                    // Stop processing since we do not have all the content
                    return;
                }

                // Load the image from the response into a bitmap
                using (Bitmap bitmap = new Bitmap(context.UrlInfo.ResponseStream)) {
                    Size size = bitmap.Size;

                    //--------------------------------------------
                    // Augment the metadata by adding our fields
                    context.UrlInfo.SetPropertyValue(ImageWidthField, size.Width);
                    context.UrlInfo.SetPropertyValue(ImageHeightField, size.Height);
                    context.UrlInfo.SetPropertyValue(ImagePixelFormatField, bitmap.PixelFormat.ToString());

                    //--------------------------------------------
                    // Additional Violations:
                    //
                    // If the size is outside our standards, then flag violation
                    if (size.Width > 1024 &&
                        size.Height > 768) {
                        Violations.AddImageTooLargeViolation(context,
                            String.Format("The image size is: {0}x{1}",
                                          size.Width, size.Height));
                    }

                    // If the format is outside our standards, then flag violation
                    switch (bitmap.PixelFormat) {
                        case PixelFormat.Format1bppIndexed:
                        case PixelFormat.Format4bppIndexed:
                        case PixelFormat.Format8bppIndexed:
                            Violations.AddImagePixelFormatSmall(context);
                            break;
                    }

                    if (bitmap.VerticalResolution <= 72 ||
                        bitmap.HorizontalResolution <= 72) {
                        Violations.AddImageResolutionSmall(context,
                            bitmap.HorizontalResolution + "x" + bitmap.VerticalResolution);
                    }
                }
            }

            /// <summary>
            /// Helper class to hold the violations
            /// </summary>
            private static class Violations {

                private static readonly ViolationInfo ImageTooLarge =
                    new ViolationInfo("ImageTooLarge",
                                      ViolationLevel.Warning,
                                      "Image is too large.",
                                      "The Image is too large: {details}.",
                                      "Make sure that the image content is required.",
                                      "Images");

                private static readonly ViolationInfo ImagePixelFormatSmall =
                    new ViolationInfo("ImagePixelFormatSmall",
                                      ViolationLevel.Warning,
                                      "Image pixel format is too small.",
                                      "The Image pixel format is too small",
                                      "Make sure that the quality of the image is good.",
                                      "Images");

                private static readonly ViolationInfo ImageResolutionSmall =
                    new ViolationInfo("ImageResolutionSmall",
                                      ViolationLevel.Warning,
                                      "Image resolution is small.",
                                      "The Image resolution is too small: ({res})",
                                      "Make sure that the image quality is good.",
                                      "Images");

                internal static void AddImageTooLargeViolation(CrawlerProcessContext context, string details) {
                    context.Violations.Add(new Violation(ImageTooLarge,
                            0, "details", details));
                }

                internal static void AddImagePixelFormatSmall(CrawlerProcessContext context) {
                    context.Violations.Add(new Violation(ImagePixelFormatSmall, 0));
                }

                internal static void AddImageResolutionSmall(CrawlerProcessContext context, string resolution) {
                    context.Violations.Add(new Violation(ImageResolutionSmall,
                            0, "res", resolution));
                }
            }
        }
    }

    As you can see, in BeginAnalysis the module registers three new properties with the Report using the Crawler property. This is only required if you want to provide custom display text or use a type other than string. Note that the current version only allows primitive types like Integer, Float, DateTime, etc.

    During the Process method it first makes sure that it only runs for known content types, then it performs its validations, raising a set of custom violations that are defined in the Violations static helper class. Note that we load the content from the ResponseStream property, which contains the content received from the server. If you were analyzing text, the Response property would contain the content instead (this is based on Content Type, so HTML, XML, CSS, etc., will be kept in that String property).

    Registering it

    When running inside IIS Manager, crawler modules need to be registered as a standard UI module first, and then inside their initialization they need to register themselves using the IExtensibilityManager interface. In this case, to keep the code as simple as possible, everything is added in a single file. So add a new file called "RegistrationCode.cs" and include the contents below:

    using System;
    using Microsoft.Web.Management.Client;
    using Microsoft.Web.Management.SEO.Crawler;
    using Microsoft.Web.Management.Server;

    namespace SampleCrawlerModule {
        internal class SampleCrawlerModuleProvider : ModuleProvider {
            public override ModuleDefinition GetModuleDefinition(IManagementContext context) {
                return new ModuleDefinition(Name, typeof(SampleCrawlerModule).AssemblyQualifiedName);
            }

            public override Type ServiceType {
                get { return null; }
            }

            public override bool SupportsScope(ManagementScope scope) {
                return true;
            }
        }

        internal class SampleCrawlerModule : Module {
            protected override void Initialize(IServiceProvider serviceProvider, ModuleInfo moduleInfo) {
                base.Initialize(serviceProvider, moduleInfo);

                IExtensibilityManager em = (IExtensibilityManager)GetService(typeof(IExtensibilityManager));
                em.RegisterExtension(typeof(CrawlerModule), new ImageExtension());
            }
        }
    }

    This code defines a standard IIS Manager UI module, and in its client-side Initialize method it uses the IExtensibilityManager interface to register a new instance of the image extension. This makes it visible to the Site Analysis feature.

    Testing it

    To test it we need to add the UI module to Administration.config, which also means that the assembly needs to be registered in the GAC.

    To Strongly name the assembly

    In Visual Studio, you can do this easily by using the menu "Project->Properties", selecting the "Signing" tab, and checking "Sign the assembly". Then choose a key file; if you don't have one you can just choose New and specify a name.

    After this you can compile it, and you should now be able to add it to the GAC.

    To GAC it

    If you have the SDKs installed you should be able to call gacutil; in my case:

    "\Program Files\Microsoft SDKs\Windows\v6.0A\bin\gacutil.exe" /if SampleCrawlerModule.dll

     

    (Note, you could also just open Windows Explorer, navigate to c:\Windows\assembly and drag & drop your file in there, that will GAC it automatically).

    Finally, to see the exact name that should be used in Administration.config, run the following command:

    "\Program Files\Microsoft SDKs\Windows\v6.0A\bin\gacutil.exe" /l SampleCrawlerModule

    In my case it displays:

    SampleCrawlerModule, Version=1.0.0.0, Culture=neutral, PublicKeyToken=6f4d9863e5b22f10, …

    Finally register it in Administration.config

    Open Administration.config in Notepad using an elevated instance, find </moduleProviders> and add an entry like the one below, replacing the right values for Version and PublicKeyToken:

          <add name="SEOSample" type="SampleCrawlerModule.SampleCrawlerModuleProvider, SampleCrawlerModule, Version=1.0.0.0, Culture=neutral, PublicKeyToken=6f4d9863e5b22f10" />

    Use it

    After registration you should now be able to launch IIS Manager and navigate to Search Engine Optimization. Start a new analysis of your Web site. Once it completes, if there are any violations you will see them in the Violations Summary or any other report. For example, see below all the violations in the "Images" category.

    (Screenshot: violations reported in the "Images" category)

    Since we also extended the metadata by including the new fields (Image Width, Image Height, and Image Pixel Format) now you can use them with the Query infrastructure to easily create a report of all the images:

    (Screenshot: a query using the new image fields)

    And since they are standard fields, they can be used in Filters, Groups, and any other functionality, including exporting data. So for example the following query can be opened in the Site Analysis feature and will display an average of the width and height of images summarized by type of image:

    <?xml version="1.0" encoding="utf-8"?>
    <query dataSource="urls">
     
    <filter>
       
    <expression field="ContentTypeNormalized" operator="Begins" value="image/" />
      </
    filter>
     
    <group>
       
    <field name="ContentTypeNormalized" />
      </
    group>
     
    <displayFields>
       
    <field name="ContentTypeNormalized" />
        <
    field name="(Count)" />
        <
    field name="Average(iWidth)" />
        <
    field name="Average(iHeight)" />
      </
    displayFields>
    </query>

    (Screenshot: query results grouped by content type)

    And of course violation details are shown as specified, including Recommendation, Description, etc:

    (Screenshot: violation details)

    Summary

    As you can see, extending the SEO Toolkit using a Crawler Module allows you to provide additional information, whether metadata, violations or links, for any document being processed. This can be used to add support for content types not supported out of the box, such as PDF, Office documents or anything else you need. It can also be used to extend the metadata by writing custom code to wire data from other systems into the report, giving you the ability to exploit this data using the Query capabilities of Site Analysis.

  • CarlosAg Blog

    Setting up a Reverse Proxy using IIS, URL Rewrite and ARR

    • 23 Comments

    Today there was a question in the IIS.net Forums asking how to expose two different Internet sites from another site, making them look as if they were subdirectories of the main site.

    So, for example, the goal was to have a site www.site.com expose a www.site.com/company1 and a www.site.com/company2, and have the content from “www.company1.com” served for the first one and “www.company2.com” for the second one. Furthermore, we would like to have the responses cached on the server for performance reasons. The following image shows a simple diagram of this:

    (Diagram: Reverse Proxy Sample)

    This sounds easy, since it's just about routing or proxying every single request to the correct servers, right? Wrong! If only it were that easy. It turns out the most challenging part is that in this case we are modifying the structure of the underlying URLs and the original layout of the servers, which breaks relative paths, so images, stylesheets (CSS), JavaScript files and other resources are not shown correctly.

    To clarify this, imagine that a user's browser requests the page at http://www.site.com/company1/default.aspx; based on the specification above, the request is proxied/routed to http://www.company1.com/default.aspx on the server side. So far so good. However, imagine that the markup returned by this page has an image tag like “<img src=/some-image.png />”. The problem is that the browser will resolve that relative path against the base path of the original request it made, which was http://www.site.com/company1/default.aspx, resulting in a request for the image at http://www.site.com/some-image.png instead of the right “company1” folder, which would be http://www.site.com/company1/some-image.png.

    Do you see it? Basically the problem is that any relative path, and for that matter absolute paths as well, needs to be translated to the new URL structure imposed by the original goal.

    So how do we do it then?

    Enter URL Rewrite 2.0 and Application Request Routing

    URL Rewrite 2.0 includes the ability to rewrite the content of a response as it is getting served back to the client which will allow us to rewrite those links without having to touch the actual application.

    Software Required:

    • URL Rewrite 2.0
    • Application Request Routing (ARR)
    Steps

    1. The first thing you need to do is enable proxy support in ARR.
      1. To do that, just launch IIS Manager and click the server node in the tree view.
      2. Double click the “Application Request Routing Cache” icon.
      3. Select the “Server Proxy Settings…” task in the Actions panel.
      4. Make sure that the “Enable Proxy” checkbox is checked. What this does is ensure that any request in the server that is rewritten to a server that is not the local machine will be routed to the right place automatically without any further configuration.
    2. Configure URL Rewrite to route the right folders and their requests to the right site. Rather than bothering you with UI steps, I will show you the configuration and then explain step by step what each piece is doing.
    3. Note that for this post I will only take care of Company1, but you can imagine the same steps apply for Company2. To test this you can just save the configuration below as web.config in your inetpub\wwwroot\ or in any other site root.
    <?xml version="1.0" encoding="UTF-8"?>
    <configuration>
       
    <system.webServer>
           
    <rewrite>
               
    <rules>
                   
    <rule name="Route the requests for Company1" stopProcessing="true">
                       
    <match url="^company1/(.*)" />
                        <
    conditions>
                           
    <add input="{CACHE_URL}" pattern="^(https?)://" />
                        </
    conditions>
                       
    <action type="Rewrite" url="{C:1}://www.company1.com/{R:1}" />
                        <
    serverVariables>
                           
    <set name="HTTP_ACCEPT_ENCODING" value="" />
                        </
    serverVariables>
                   
    </rule>
               
    </rules>
               
    <outboundRules>
                   
    <rule name="ReverseProxyOutboundRule1" preCondition="ResponseIsHtml1">
                       
    <match filterByTags="A, Area, Base, Form, Frame, Head, IFrame, Img, Input, Link, Script" pattern="^http(s)?://www.company1.com/(.*)" />
                        <
    action type="Rewrite" value="/company1/{R:2}" />
                    </
    rule>
                   
    <rule name="RewriteRelativePaths" preCondition="ResponseIsHtml1">
                       
    <match filterByTags="A, Area, Base, Form, Frame, Head, IFrame, Img, Input, Link, Script" pattern="^/(.*)" negate="false" />
                        <
    action type="Rewrite" value="/company1/{R:1}" />
                    </
    rule>
                   
    <preConditions>
                       
    <preCondition name="ResponseIsHtml1">
                           
    <add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
                        </
    preCondition>
                   
    </preConditions>
               
    </outboundRules>
           
    </rewrite>
       
    </system.webServer>
    </configuration>

    Setup the Routing

                    <rule name="Route the requests for Company1" stopProcessing="true">
                       
    <match url="^company1/(.*)" />
                        <
    conditions>
                           
    <add input="{CACHE_URL}" pattern="^(https?)://" />
                        </
    conditions>
                       
    <action type="Rewrite" url="{C:1}://www.company1.com/{R:1}" />
                        <
    serverVariables>
                           
    <set name="HTTP_ACCEPT_ENCODING" value="" />
                        </
    serverVariables>
                   
    </rule>

    The first rule is an inbound rewrite rule that captures all requests to the /company1/* folder. If you are using the Default Web Site, anything going to http://localhost/company1/* will be matched by this rule and rewritten to www.company1.com, preserving HTTP vs. HTTPS traffic.

    One thing to highlight, which is what took me a bit of time, is the “serverVariables” entry in that rule, which overwrites the Accept-Encoding header. I do this because if you do not remove that header, the response will likely be compressed (gzip or deflate), and outbound rewriting is not supported in that case; you will end up with an error message like:

    HTTP Error 500.52 - URL Rewrite Module Error.
    Outbound rewrite rules cannot be applied when the content of the HTTP response is encoded ("gzip").

    Also note that, for security reasons, you need to explicitly allow the server variable before a rule can set it. See enabling server variables here.
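
    For reference, allowing the variable can also be scripted. The sketch below uses Microsoft.Web.Administration to add HTTP_ACCEPT_ENCODING to URL Rewrite's allowed server variables in applicationHost.config; it assumes URL Rewrite 2.0 is installed, which is what registers the system.webServer/rewrite/allowedServerVariables section.

    ServerManager iisManager = new ServerManager();
    Configuration config = iisManager.GetApplicationHostConfiguration();

    // Server variables must be explicitly allowed before a rewrite rule may set them.
    ConfigurationSection allowed =
        config.GetSection("system.webServer/rewrite/allowedServerVariables");
    ConfigurationElementCollection collection = allowed.GetCollection();

    ConfigurationElement entry = collection.CreateElement("add");
    entry["name"] = "HTTP_ACCEPT_ENCODING";
    collection.Add(entry);

    iisManager.CommitChanges();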

     

    Outbound Rewriting to fix the Links

    The last two rules rewrite the links, scripts, and other resources so that the URLs are translated to the right structure. The first one rewrites absolute paths, and the last one rewrites relative paths. Note that relative paths using “..” will not work with the rule as written, but you can easily extend it; I never use those when I create a site, so it works for me :)

    Setting up Caching for ARR

    A huge added value of using ARR is that with a couple of clicks we can enable disk caching, so responses are cached locally on www.site.com and not every single request pays the price of going to the backend servers.

    1. To do that just launch IIS Manager and click the server node in the tree view.
    2. Double click the “Application Request Routing Cache” icon
    3. Select the “Add Drive…” task in the Actions panel.
    4. Specify a directory where you want to keep your cache. Note that this can be any subfolder in your system.
    5. Make sure that “Enable Disk Cache” checkbox is marked in the Server Proxy Settings mentioned above.
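
    If you prefer to script steps 3-5, the sketch below uses Microsoft.Web.Administration to add a cache drive. Be aware that the section and element names used here (system.webServer/diskCache and driveLocation) are my assumptions about how ARR stores its cache drives; add a drive through the UI once and check applicationHost.config to confirm them before relying on this.

    ServerManager iisManager = new ServerManager();
    Configuration config = iisManager.GetApplicationHostConfiguration();

    // Assumed ARR disk cache section; verify the name in applicationHost.config.
    ConfigurationSection diskCache = config.GetSection("system.webServer/diskCache");
    ConfigurationElementCollection drives = diskCache.GetCollection();

    ConfigurationElement drive = drives.CreateElement("driveLocation");
    drive["path"] = @"d:\ArrCache";   // any folder on your system
    drives.Add(drive);

    iisManager.CommitChanges();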

    As easy as that, caching is now working and your site acts as a front end for other servers on the Internet. Pretty cool, huh? :)

    So in this post we saw how, with literally a few lines of XML, URL Rewrite and ARR enable a proxy/routing scenario with link rewriting and, furthermore, caching support.

  • CarlosAg Blog

    Announcing: IIS Search Engine Optimization Toolkit Beta 1

    • 20 Comments

    Today we are releasing the IIS Search Engine Optimization Toolkit. The IIS SEO Toolkit is a set of features that aim to help you keep your Web site and its content in good shape for both Users and Search Engines.

    The features that are included in this Beta release include:

    • Site Analysis. This feature includes a crawler that walks your Web site's contents, discovering links, downloading the content, and applying a set of validation rules aimed at helping you easily troubleshoot common problems such as broken links and duplicate content. It also offers keyword analysis, route analysis, and many more features that will help you improve the overall quality of your Web site.
    • Robots Exclusion Editor. This includes a powerful editor to author Robots Exclusion files. It can leverage the output of a Site Analysis crawl report and allow you to easily add the Allow and Disallow entries without having to edit a plain text file, making it less error prone and more reliable. Furthermore, you can run the Site Analysis feature again and see immediately the results of applying your robots files.
    • Sitemap and Sitemap Index Editor. Similar to the Robots editor, this allows you to author Sitemap and Sitemap Index files with the ability to discover both physical and logical (Site Analysis crawler report) view of your Site.

    Check out ScottGu's great blog post about the IIS SEO Toolkit, or this short IIS SEO video showing some of its capabilities.

    Run it in your Development, Staging, or Production Environments

    One of the problems with many similar tools out there is that they require you to publish the updates to your production sites before you can even use them, and of course they are not usable at all for intranet or internal applications that are not exposed to the Web. The IIS Search Engine Optimization Toolkit can be used internally in your own development or staging environments, giving you the ability to clean up the content before publishing to the Web. This way your users do not pay the price of broken links once you publish, and you do not need to wait for those tools or search engines to crawl your site to finally discover you broke something.

    For developers this means they can now easily evaluate the potential impact of removing or renaming a file: which files refer to the page, and which files can be removed because they are only referenced by that page.

    Run it against any Web application built on any framework running in any server

    One thing that is important to clarify is that you can target and analyze your production sites if you want to, and you can target Web applications running on any platform, whether it's ASP.NET, PHP, or plain HTML files, running on your local IIS or on any other remote server.

    Bottom line: try it against your Web site, look at the different features, and give us feedback on additional reports, options, violations, content to parse, etc. Post any comments or questions in the IIS Search Engine Optimization Forum.

    The IIS SEO Toolkit documentation can be found at http://learn.iis.net/page.aspx/639/using-iis-search-engine-optimization-toolkit/, but remember this is only Beta 1 so we will be adding more features and content.

    IIS Search Engine Optimization Toolkit

  • CarlosAg Blog

    The new Configuration System in IIS 7

    • 19 Comments
    Today I was planning to talk about the configuration classes that I purposely skipped in my last post, but I realized it would be better to explain a little bit more about the new configuration system used in IIS 7.

    First of all, many of you (like me) will be extremely happy to know that the old "monolithic-centralized-admin only" metabase is dead; we have replaced it with a much better configuration store. Now, before you panic, let me assure you that we haven't just killed it and forgotten about the thousands of lines of scripts or custom tools built using the old metabase APIs (such as ABO). For that we have created something called ABOMapper, which allows all of those applications to keep running transparently, since it auto-magically translates the old metabase calls into modifications of the new configuration system.

    So what is this new configuration system? For those of you who have been working with ASP.NET for the past years, you will feel right at home and happy to know that we are moving to use the exact same concept that ASP.NET does: .config files.

    ApplicationHost.config

    At the root level we have a file called ApplicationHost.config that lives in the IIS directory (typically <windows>\System32\InetSrv). This is the main configuration file for IIS; this is where we store things like the list of sites, applications, virtual directories, general settings, logging, caching, etc.

    This file has two main groups of settings:
    • system.applicationHost: Contains all the settings for the activation service, basically things like the list of application pools, the logging settings, the listeners and the sites. These settings are centralized and can only be defined within applicationHost.config.
    • system.webServer: Contains all the settings for the Web server, such as the list of modules and isapi filters, asp, cgi and others. These settings can be set in applicationHost.config as well as any web.config (provided the Override Mode settings are set to allow)
    ApplicationHost.config

    Administration.config

    This is also a file located in the IIS directory; it stores delegation settings for the UI, including the list of UI modules available (think of them as UI add-ins), and other things like administrators.
    Administration.config

    Web.config

    Finally, the same old web.config from ASP.NET has gotten smarter, and now you can include Web server settings along with your ASP.NET settings.


    Why is this important?

    Well, as I said at the beginning, the old metabase could only be accessed by administrators, so for someone to change a setting as simple as the default document for a specific application (say you want it to be index.aspx), you would need to be an administrator or call one to make the change.
    With this new distributed configuration system I can now safely modify the web.config within my application and have it my own way without disturbing anyone else. Furthermore, since the setting lives in my own web.config along with the content of my application, I can safely XCopy the whole application and even the Web server settings come along. No more going to InetMgr to set everything manually, or creating a bunch of scripts to do it.

    So what does this actually look like?


    In applicationHost.config my Sites section looks as follows:
        <sites>
          <site name="Default Web Site" id="1">
            <application path="/" applicationPool="DefaultAppPool">
              <virtualDirectory path="/" physicalPath="c:\inetpub\wwwroot" />
            </application>
            <bindings>
              <binding protocol="HTTP" bindingInformation="*:80:" />
            </bindings>
          </site>
        </sites>
    This basically defines a site that has a root application with a virtual directory that points to \inetpub\wwwroot. This site is listening on any IP address on port 80.
    Say I want to add a new application and also make the site listen on port 8080.
        <sites>
          <site name="Default Web Site" id="1">
            <application path="/" applicationPool="DefaultAppPool">
              <virtualDirectory path="/" physicalPath="c:\inetpub\wwwroot" />
            </application>
            <application path="/MyApp" applicationPool="DefaultAppPool">
              <virtualDirectory path="/" physicalPath="d:\MyApp" />
            </application>
            <bindings>
              <binding protocol="HTTP" bindingInformation="*:80:" />
              <binding protocol="HTTP" bindingInformation="*:8080:" />
            </bindings>
          </site>
        </sites>
    Just by adding the previous markup, I can now browse to http://localhost:8080/MyApp
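
    If you would rather make the same change from code than edit applicationHost.config by hand, here is a minimal sketch using the ServerManager class from my previous post (as noted there, the API is still evolving; in later builds the Update method became CommitChanges):

    ServerManager iisManager = new ServerManager();
    Site site = iisManager.Sites["Default Web Site"];

    // Add the /MyApp application pointing at d:\MyApp...
    site.Applications.Add("/MyApp", @"d:\MyApp");

    // ...and a second HTTP binding on port 8080.
    site.Bindings.Add("*:8080:", "http");

    iisManager.Update();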
    IIS Settings in web.config
    More interestingly, I can now add a file called web.config at d:\MyApp\web.config and set its content to:
    <configuration>
      <system.webServer>
        <defaultDocument>
          <files>
            <clear />
            <add value="Index.aspx" />
          </files>
        </defaultDocument>
      </system.webServer>
    </configuration>
    And with this change, my application will now respond with Index.aspx whenever /MyApp is requested.
    You can extrapolate from this that all the IIS settings for your application, including authentication, authorization, ASP and CGI settings, the list of modules, custom errors, etc., can be configured within your web.config, and you never have to request changes from administrators again.
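
    To give a flavor of how this looks from code, here is a rough sketch of making the same default-document change with Microsoft.Web.Administration instead of editing web.config by hand (again, the API shown follows my previous post and may change; Update was later renamed CommitChanges):

    ServerManager iisManager = new ServerManager();
    Configuration config = iisManager.GetWebConfiguration("Default Web Site", "/MyApp");

    ConfigurationSection defaultDocument = config.GetSection("system.webServer/defaultDocument");
    ConfigurationElementCollection files = defaultDocument.GetCollection("files");

    files.Clear();                                    // equivalent to <clear />
    ConfigurationElement index = files.CreateElement("add");
    index["value"] = "Index.aspx";
    files.Add(index);

    iisManager.Update();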

    Of course this brings up the question: isn't this insecure? The answer is no. By default, all the IIS sections (except defaultDocument) are locked at the applicationHost.config level, meaning no one can change them in their web.config unless the administrator explicitly unlocks them. The cool thing is that the administrator can customize this per application, allowing certain apps to change settings while preventing others from doing so. All of this can be done through plain config using Notepad, or using the very cool NEW InetMgr (which I will blog about later).
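
    To illustrate what "explicitly unlocking" might look like from code, here is a hedged sketch (same caveats about the evolving API) that allows the handlers section to be overridden only for the /MyApp application:

    ServerManager iisManager = new ServerManager();
    Configuration appHost = iisManager.GetApplicationHostConfiguration();

    // Unlock system.webServer/handlers just for Default Web Site/MyApp,
    // so its web.config is allowed to change it.
    ConfigurationSection section = appHost.GetSection(
        "system.webServer/handlers", "Default Web Site/MyApp");
    section.OverrideMode = OverrideMode.Allow;

    iisManager.Update();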
    Finally, the following image shows the hierarchy of config files for each URL: config hierarchy

    Now that I have shown a high-level overview of how configuration works in IIS 7, I will finally blog about the API to change these settings programmatically using managed code and Microsoft.Web.Administration.dll.
  • CarlosAg Blog

    Analyze your IIS Log Files - Favorite Log Parser Queries

    • 14 Comments

    The other day I was asked if I knew about a tool that would allow users to easily analyze IIS log files, to process them and look for specific data in a way that could easily be automated. My recommendation was that if they are comfortable with a SQL-like language, they should use Log Parser. Log Parser is a very powerful tool that provides a generic SQL-like language on top of many types of data, such as IIS logs, Event Viewer entries, XML files, CSV files, the file system, and others. It lets you export query results to many output formats such as CSV (comma-separated values), XML, SQL Server, charts, and others, and it works well with IIS 5, 6, 7, and 7.5.

    To use it you just need to install it and use the LogParser.exe that is found in its installation directory (on my x64 machine it is located at: C:\Program Files (x86)\Log Parser 2.2).

    I also thought I would share some of my favorite queries. To run them, just execute LogParser.exe, specify that the input is an IIS log file (-i:W3C), and, for ease of use in this case, export to CSV (-o:CSV) so the results can be opened in Excel for further analysis:

    LogParser.exe -i:W3C "Query-From-The-Table-Below" -o:CSV
    Purpose: Number of hits per client IP, including a reverse DNS lookup (SLOW)
    Query:
        SELECT c-ip AS Machine,
               REVERSEDNS(c-ip) AS Name,
               COUNT(*) AS Hits
        FROM c:\inetpub\logs\LogFiles\W3SVC1\*
        GROUP BY Machine ORDER BY Hits DESC
    Sample output:
        Machine     Name          Hits
        ::1         CARLOSAGDEV   57
        127.0.0.1   MACHINE1      28
        127.X.X.X   MACHINE2      1

    Purpose: Top 25 file types
    Query:
        SELECT TOP 25
               EXTRACT_EXTENSION(cs-uri-stem) AS Extension,
               COUNT(*) AS Hits
        FROM c:\inetpub\logs\LogFiles\W3SVC1\*
        GROUP BY Extension
        ORDER BY Hits DESC
    Sample output:
        Extension   Hits
        gif         52127
        bmp         20377
        axd         10321
        txt         460
        htm         362

    Purpose: Top 25 URLs
    Query:
        SELECT TOP 25
               cs-uri-stem AS Url,
               COUNT(*) AS Hits
        FROM c:\inetpub\logs\LogFiles\W3SVC1\*
        GROUP BY cs-uri-stem
        ORDER BY Hits DESC
    Sample output:
        Url                                    Hits
        /WebResource.axd                       10318
        /favicon.ico                           8523
        /Tools/CodeTranslator/Translate.ashx   6519
        /App_Themes/Silver/carlosag.css        5898
        /images/arrow.gif                      5720

    Purpose: Number of hits per hour for the month of March
    Query:
        SELECT QUANTIZE(TO_LOCALTIME(TO_TIMESTAMP(date, time)), 3600) AS Hour,
               COUNT(*) AS Hits
        FROM c:\inetpub\logs\LogFiles\W3SVC1\*
        WHERE date>'2010-03-01' AND date<'2010-04-01'
        GROUP BY Hour
    Sample output:
        Hour                Hits
        3/3/2010 10:00:00   33
        3/3/2010 11:00:00   5
        3/3/2010 12:00:00   3

    Purpose: Number of hits per method (GET, POST, etc.)
    Query:
        SELECT cs-method AS Method,
               COUNT(*) AS Hits
        FROM c:\inetpub\logs\LogFiles\W3SVC1\*
        GROUP BY Method
    Sample output:
        Method     Hits
        GET        133566
        POST       10901
        HEAD       568
        OPTIONS    11
        PROPFIND   18

    Purpose: Number of requests made by user
    Query:
        SELECT TOP 25
               cs-username AS User,
               COUNT(*) AS Hits
        FROM c:\inetpub\logs\LogFiles\W3SVC1\*
        WHERE User IS NOT NULL
        GROUP BY User
    Sample output:
        User            Hits
        Administrator   566
        Guest           1

    Purpose: Extract values from the query string (d and t) and use them for aggregation
    Query:
        SELECT TOP 25
               EXTRACT_VALUE(cs-uri-query,'d') AS Query_D,
               EXTRACT_VALUE(cs-uri-query,'t') AS Query_T,
               COUNT(*) AS Hits
        FROM c:\inetpub\logs\LogFiles\W3SVC1\*
        WHERE Query_D IS NOT NULL
        GROUP BY Query_D, Query_T
        ORDER BY Hits DESC
    Sample output:
        Query_D           Query_T       Hits
        Value in Query1   Value in T1   1556
        Value in Query2   Value in T2   938
        Value in Query3   Value in T3   877
        Value in Query4   Value in T4   768

    Purpose: Find the slowest 25 URLs (on average) in the site
    Query:
        SELECT TOP 25
               cs-uri-stem AS URL,
               MAX(time-taken) AS Max,
               MIN(time-taken) AS Min,
               AVG(time-taken) AS Average
        FROM c:\inetpub\logs\LogFiles\W3SVC1\*
        GROUP BY URL
        ORDER BY Average DESC
    Sample output:
        URL                     Max     Min     Average
        /Test/Default.aspx      23215   23215   23215
        /WebSite/Default.aspx   5757    2752    4178
        /Remote2008.jpg         3510    3510    3510
        /wordpress/             6541    2       3271
        /RemoteVista.jpg        3314    2       1658

    Purpose: List the count of each status and substatus code
    Query:
        SELECT TOP 25
               STRCAT(TO_STRING(sc-status),
               STRCAT('.', TO_STRING(sc-substatus))) AS Status,
               COUNT(*) AS Hits
        FROM c:\inetpub\logs\LogFiles\W3SVC1\*
        GROUP BY Status
        ORDER BY Status ASC
    Sample output:
        Status   Hits
        200      144
        304      38
        400      9
        403.14   10
        404      64
        404.3    2
        500.19   23

    Purpose: List all the requests by user agent
    Query:
        SELECT cs(User-Agent) AS UserAgent,
               COUNT(*) AS Hits
        FROM c:\inetpub\logs\LogFiles\W3SVC1\*
        GROUP BY UserAgent
        ORDER BY Hits DESC
    Sample output:
        UserAgent                                      Hits
        iisbot/1.0+(+http://www.iis.net/iisbot.html)   104
        Mozilla/4.0+(compatible;+MSIE+8.0;…            77
        Microsoft-WebDAV-MiniRedir/6.1.7600            23
        DavClnt                                        1

    Purpose: List all the Win32 error codes that have been logged
    Query:
        SELECT sc-win32-status AS Win32-Status,
               WIN32_ERROR_DESCRIPTION(sc-win32-status) AS Description,
               COUNT(*) AS Hits
        FROM c:\inetpub\logs\LogFiles\W3SVC1\*
        WHERE Win32-Status<>0
        GROUP BY Win32-Status
        ORDER BY Win32-Status ASC
    Sample output:
        Win32-Status   Description                                   Hits
        2              The system cannot find the file specified.    64
        13             The data is invalid.                          9
        50             The request is not supported.                 2

    A final note: any time you deal with dates and times, remember to use the TO_LOCALTIME function to convert the log times to your local time; otherwise you will find it very confusing when your entries seem to be reported incorrectly.
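
    If you want to automate running these queries, you can of course shell out to LogParser.exe from code. Here is a quick sketch (the paths follow the ones used above and will likely differ on your machine):

    string logParser = @"C:\Program Files (x86)\Log Parser 2.2\LogParser.exe";
    string query = "SELECT TOP 25 cs-uri-stem AS Url, COUNT(*) AS Hits " +
                   "FROM c:\\inetpub\\logs\\LogFiles\\W3SVC1\\* " +
                   "GROUP BY cs-uri-stem ORDER BY Hits DESC";

    // Without an INTO clause the CSV output goes to standard output, so capture it.
    var info = new System.Diagnostics.ProcessStartInfo(
        logParser, "-i:W3C \"" + query + "\" -o:CSV");
    info.UseShellExecute = false;
    info.RedirectStandardOutput = true;

    using (var process = System.Diagnostics.Process.Start(info))
    {
        System.Console.WriteLine(process.StandardOutput.ReadToEnd());
        process.WaitForExit();
    }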

    If you need any help you can always visit the Log Parser Forums to find more information or ask specific questions.

    Any other useful queries I missed?

  • CarlosAg Blog

    IIS Admin Pack Technical Preview 2 Released

    • 13 Comments

    Today we are releasing Technical Preview 2 of the IIS Admin Pack; it is an update to the release we made in February.

    Install the Admin Pack and Database Manager today!

    Admin Pack (x86):  http://www.iis.net/downloads/default.aspx?tabid=34&i=1682&g=6

    Database Manager (x86):  http://www.iis.net/downloads/default.aspx?tabid=34&g=6&i=1684

    Admin Pack (x64): http://www.iis.net/downloads/default.aspx?tabid=34&i=1683&g=6

    Database Manager (x64):  http://www.iis.net/downloads/default.aspx?tabid=34&g=6&i=1685

    New Features:

    There are a lot of interesting features we've added to almost every component for this release:

    Database Manager

    1. Specify your own Connections. We heard during TP1 that being able to specify your own database connection information, rather than having us automatically read it from connection strings, was very important; with TP2 you can now do it. We've also added a switch that lets administrators disable this feature if they have concerns and want to keep enforcing the "only read the connectionStrings from config" behavior, preventing users from adding their own.
    2. Extensibility. In this release we are making the APIs public so you can write your own database provider, plug in your own database, and reuse all of our UI and remoting. All you need to do is implement a few functions (CreateTable, DeleteTable, InsertRow, DeleteRow, etc.) and your provider will be ready to use remotely over HTTPS to manage your own DB.
    3. Support for MySQL. In the upcoming weeks we will be releasing support for MySQL using the extensibility model mentioned above.
    4. Small things. New toolbar in the Connections panel to simplify discovery of commands.
    5. Use of SMO. In this release we are using SQL Server Management Objects for all the schema manipulation, which means that things like scripts exported from SQL Server (including 'GO' statements) will now work in the Query window.

    Configuration Editor

    1. Choose where configuration is read: you can now specify where you want your configuration read from and written to. This feature is great for advanced users who really understand the inheritance of our distributed configuration system and want to take advantage of it. When you go to a site, application, or anywhere else, you still get the default experience where we read configuration from the deepest configuration path; however, you can now use the "From:" combo box to tell us where you really want the configuration read from. For example, the following image shows the options for a folder underneath Default Web Site. As you can see, you can choose whether to use locationPaths or go directly to the web.config. This plays really nicely with locking, allowing you to lock items for delegated users while still being able to change things yourself pretty easily. This change also works with script generation, so when you generate your scripts you can customize where configuration is read from and written to.
    2. Lots of small things: all the changes you make are now shown in bold until you commit them; the locking functionality has been enhanced to better reflect when attributes/elements are locked; and there are several minor bug fixes for script generation, collections without keys, etc.

    IIS Reports

    1. No longer depends on LogParser. TP1 used LogParser for parsing logs. This version no longer uses LogParser, which means no additional installs. We also measured performance and see an improvement of up to 40%, which means faster reports. (On my machine, logs of up to 6.4 GB or 17 million rows take about 2 minutes to generate a report; we see about 5-7 seconds for 1 million rows.)
    2. Better Reports. We took a look at the set of reports, extended the list, and added new filters for them; for example, the status code report now lets you drill down and see which URLs generated a particular status code.
    3. Export to CSV and XML.
    4. Extensibility: For this release, just as for DBManager, we've made the APIs of the UI public so that you can extend it and add your own set of reports by writing a class derived from either TableReportDefinition or ChartReportDefinition. This means that just by overriding a method and returning a DataTable, we take care of the formatting, adding a chart, and letting your users export to HTML, MHTML, CSV, XML, etc.

     UI Extensions

    1. Bug fixes for all the features like Request Filtering, FastCGI, ASP.NET Authorization Rules, and ASP.NET Error Pages.

    As you can see, extensibility is a big theme now, and in my following posts I will show how to extend both IIS Reports and DBManager.
