My last post talked about the Technical Preview release of the IIS 7.0 Admin Pack and how it includes 7 new features that will help you manage your IIS 7.0 server.
Today I was going to start writing in more detail about each feature, and Bill Staples just posted something (How to (un)block directories with IIS7 web.config) that almost seems planned as a way for me to introduce one of the features in the Admin Pack, namely the Request Filtering UI.
IIS 7.0 includes a feature called Request Filtering that provides additional capabilities to secure your web server. For example, it lets you filter requests that are double escaped, filter requests that use certain HTTP verbs, or even block requests to specific "folders". I will not go into the details of this functionality; if you want to learn more about it, see the Request Filtering articles at http://learn.iis.net
In his blog Bill mentions how you can easily configure Request Filtering using any text editor, such as Notepad, and edit the web.config manually. That was required because, due to time constraints and other things, we did not ship UI for it within IIS Manager. But now, as part of the Admin Pack, we are releasing UI for managing the Request Filtering settings.
Following what Bill just showed in his blog, this is how you would do it using the new UI instead.
1) Install IIS Admin Pack (Technical Preview)
2) Launch IIS Manager
3) Drill down using the Tree View to the site or application you want to change the settings for.
4) Enter into the new feature called Request Filtering inside the IIS category
5) Select the Hidden Segments tab and choose "Add Hidden Segment" from the Task List on the right
6) Add the item
As you would expect, the outcome is exactly as Bill explained in his blog: just an entry within your web.config, something like:
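The resulting configuration would look something like this (the segment name below, "DoNotShare", is just a placeholder for whatever folder you chose to hide):

```xml
<configuration>
  <system.webServer>
    <security>
      <requestFiltering>
        <hiddenSegments>
          <!-- "DoNotShare" is a placeholder folder name -->
          <add segment="DoNotShare" />
        </hiddenSegments>
      </requestFiltering>
    </security>
  </system.webServer>
</configuration>
```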
So as you can see, the Request Filtering UI will help you discover some of the nice security settings that IIS 7.0 has. The following images show some of the additional settings you can configure, such as Verbs, Headers, URL Sequences, URL Length, Query String size, etc.
In the URL Rewrite forum somebody posted the question "are redirects bad for search engine optimization?". The answer is: not necessarily. Redirects are an important tool for Web sites, and used in the right context they are actually a required tool. But first, a bit of background.
A redirect, in simple terms, is a way for the server to indicate to a client (typically a browser) that a resource has moved, and it does this by the use of an HTTP status code and an HTTP Location header. There are different types of redirects, but the most common ones used are a 301 (Moved Permanently) for permanent redirects and a 302 (Found) for temporary redirects.
Below is an example of the response sent from the server when requesting http://www.microsoft.com/SQL/
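The response would look along these lines (the target URL shown in the Location header is illustrative, not necessarily the actual one the site returns):

```
HTTP/1.1 301 Moved Permanently
Location: http://www.microsoft.com/sqlserver/
```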
One of the most important factors in SEO is a concept called organic linking; in simple words, it means that your page gets extra points for every link that external Web sites have pointing to it. So now imagine the Search Engine bot is crawling an external Web site and finds a link pointing to your page (example.com/some-page), and when it tries to visit your page it runs into a redirect to another location (say example.com/somepage). Now the Search Engine has to decide whether it should add the original "some-page" to its index, whether it should "add the extra points" to the new location or to the original location, or whether it should just ignore it entirely. Well, the answer is not that simple, but a simplification of it could be:
The IIS Search Engine Optimization Toolkit has a couple of rules that look for different patterns related to redirects. The Beta version includes the following:
So what does it look like? In the image below I ran Site Analysis against a Web site and it found a few of these violations (2 and 3).
Notice that when you double-click a violation, it will tell you the details and give you direct access to the related URLs, so that you can look at the content and all the relevant information about them to make a decision. From that menu you can also look at which other pages link to the pages involved, as well as launch them in the browser if needed.
As with all the other violations, it tries to explain the reason each one is being flagged, as well as recommended actions to follow for each of them.
The IIS Search Engine Optimization Toolkit can also help you find all the different types of redirects and the locations where they are being used in a very easy way: just select Content->Status Code Summary in the Dashboard view and you will see all the different HTTP status codes received from your Web site. Notice in the image below how you can see the number of redirects (in this case 18 temporary redirects and 2 permanent redirects). You can also see how much content they accounted for, in this case about 2.5 KB. (Note that I've seen Web sites generate a large amount of useless content in redirect traffic; talk about wasted bandwidth.) You can double-click any of those rows to see the details of the URLs that returned that status code, and from there you can see who links to them, etc.
So going back to the original question: "are redirects bad for Search Engine Optimization?". Not necessarily; they are an important tool used by Web applications for many reasons, such as:
Just make sure you don't abuse them with redirects to redirects, unnecessary redirects, or infinite loops, and make sure you use the right semantics.
During this PDC I attended Ian's presentation about WPF and Silverlight, where he demonstrated the high degree of compatibility that can be achieved between a WPF desktop application and a Silverlight application. One of the differences he demonstrated involves consuming Web Services: since Silverlight applications execute in a sandboxed environment, they are not allowed to call arbitrary Web Services or issue HTTP requests to servers other than the originating server, or a server that exposes a cross-domain manifest stating that it is allowed to be called by clients from that domain.
He then showed how you can work around this architectural difference by writing your own Web Service or HTTP endpoint that basically gets the request from the client and, using code on the server, just calls the real Web Service. This way the client sees only the originating server, which allows the call to succeed, and the server can freely call the real Web Service. Funnily enough, while searching for a quote service I ran into an article by Dino Esposito in MSDN Magazine where he explains the same issue and also presents a "Compatibility Layer", which again is just code (more than 40 lines of it) acting as a proxy to call a Web Service (except he uses the JSON serializer to return the values).
The obvious disadvantage is that this means you have to write code that only forwards the request and returns the response, acting essentially as a proxy. Of course this can be very simple, but if the Web Service you are trying to call has any degree of complexity, where custom types are being sent around, or if you actually need to consume several of its methods, then it quickly becomes a big maintenance nightmare: keeping them in sync when they change, doing error handling properly, and dealing with differences when reporting network issues, SOAP exceptions, HTTP exceptions, etc.
So after looking at this, I immediately thought about ARR (Application Request Routing), a new extension for IIS 7.0 (see http://www.iis.net/extensions) that you can download for free from IIS.NET for Windows Server 2008 and that, among many other things, is capable of doing this kind of routing without writing a single line of code.
This blog tries to show how easy it is to implement this using ARR. Here are the steps to try this (you can find the required software below); note that if you are only interested in what is really new, just go to the 'Enter ARR' section below to see the configuration that fixes the Web Service call.
Message: Unhandled Error in Silverlight 2 Application An exception occurred during the operation, making the result invalid.
One of the features offered by ARR is proxy functionality to forward requests to another server. One scenario where this functionality is useful is when clients cannot make calls directly to the real data; this includes Silverlight, Flash, and AJAX applications. As shown in this blog, with just a few lines of XML configuration you can enable clients to call services in other domains without having to write hundreds of lines of code for each method. It also means that I get the original data, and that if the WSDL were to change I would not need to update any wrappers. Additionally, if you are using REST-based services you could use local caching on your server by relying on Output Cache and increase the performance of your applications significantly (again, with no code changes).
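A forwarding rule of this kind boils down to a URL Rewrite rule in web.config with a Rewrite action pointing at the remote server; ARR then proxies the request. This is a minimal sketch, where the rule name, URL pattern, and target host are placeholders, and it assumes ARR is installed with its proxy functionality enabled:

```xml
<system.webServer>
  <rewrite>
    <rules>
      <!-- Forward anything under /QuoteService/ to the real service host -->
      <rule name="ForwardQuoteService" stopProcessing="true">
        <match url="^QuoteService/(.*)" />
        <action type="Rewrite" url="http://services.example.com/{R:1}" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>
```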
Here is the software I installed to do this sample (amazing that all of it is completely free):
IIS 7.0 includes a very cool feature, not so well known, called Hostable WebCore (HWC). This feature basically allows you to host the entire IIS functionality within your own process. This gives you the power to implement scenarios where you entirely customize the functionality that you want "your Web Server" to expose, as well as control its lifetime without impacting any other application running on the machine. This provides a very nice model for automating tests that need to run inside IIS in a more controlled environment.
This feature is implemented in a DLL called hwebcore.dll, which exports two simple methods:
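Roughly, the two exports look like this (signatures paraphrased; check the Windows SDK headers for the exact declarations):

```c
/* Starts the web core using the given applicationHost.config and
   root web.config; instanceName identifies this hosted instance. */
HRESULT WebCoreActivate(PCWSTR appHostConfigPath,
                        PCWSTR rootWebConfigPath,
                        PCWSTR instanceName);

/* Stops the web core; a non-zero 'immediate' requests immediate shutdown. */
HRESULT WebCoreShutdown(DWORD immediate);
```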
The real trick for this feature is to know exactly what you want to support and "craft" the IIS Server configuration needed for different workloads and scenarios, for example:
An interesting thing to mention is that the file passed to the ApplicationHostConfigPath parameter is live, in the sense that if you change the configuration settings, your "in-process IIS" will pick up the changes and apply them as you would expect. In fact, even web.config files in the site content or folder directories will be live, and you'll get the same behavior.
To show how easily this can be done, I wrote a small, simple class so it can be run from managed code. To consume it, you just have to do something like:
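Since the sample itself is in the download, here is only a sketch of what consuming such a wrapper might look like; the class name, constructor arguments, and method names below are assumptions for illustration, not the actual sample's API:

```csharp
// Hypothetical wrapper around hwebcore.dll's WebCoreActivate/WebCoreShutdown.
using (WebServer server = new WebServer(@"config\applicationHost.config", "MyInstance"))
{
    server.Start();   // would call WebCoreActivate under the covers
    Console.WriteLine("In-process IIS running; press Enter to stop.");
    Console.ReadLine();
    server.Stop();    // would call WebCoreShutdown
}
```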
This will start your very own "copy" of IIS running in your own process, which means that you can control which features are available, as well as the sites and applications inside it, without messing with the local state of the machine.
A very interesting thing is that it will even run without administrator privileges, meaning any user on the machine can start this program and have a "web server" of their own that they can recycle, start, and stop at will. (Note that this non-administrative feature requires Vista SP1 or Windows Server 2008, and it only works if the binding is a local binding, meaning no requests from outside the machine.)
You can download the entire sample, which includes two configurations: 1) one that runs only an anonymous static-file web server that can only serve HTML and other static files, and 2) one that is able to run ASP.NET pages as well.
Download the entire sample source code (9 kb)
You might be asking why you would even care to have your own IIS in your executable and not just use the real one. Well, there are several scenarios for this:
In future posts I intend to share more samples that showcase some of this cool stuff.
The IIS 7.0 Hostable WebCore feature allows you to host a "copy" of IIS in your own process. This is not your average "HttpListener" kind of solution, where you would need to implement all the functionality for file downloads, Basic/Windows/Anonymous authentication, caching, CGI, ASP, ASP.NET, Web Services, or anything else you need yourself; Hostable WebCore allows you to configure and extend the functionality of your own Web Server in almost any way without having to write any code.
With the upcoming release of .NET 3.5 and LINQ, I thought it would be interesting to show some of the cool things you can do with IIS 7 and LINQ. Everything that I will do can be done with C# 2.0 code, but it would take several lines of code to write; thanks to LINQ, you can do it in about a line or two.
Let's start with a very basic example that does not use LINQ, just M.W.A (Microsoft.Web.Administration), and then start adding interesting things to it.
The following code just iterates the sites in IIS and displays their name.
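A minimal version of that program looks like this (compile with a reference to Microsoft.Web.Administration.dll and run it on a machine with IIS 7 installed):

```csharp
using System;
using Microsoft.Web.Administration;

class ListSites
{
    static void Main()
    {
        // ServerManager is the entry point to the IIS 7 configuration system.
        using (ServerManager serverManager = new ServerManager())
        {
            foreach (Site site in serverManager.Sites)
            {
                Console.WriteLine(site.Name);
            }
        }
    }
}
```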
In my spare time I've been thinking about new ideas for the SEO Toolkit, and it occurred to me that rather than continuing to try to figure out more reports and better diagnostics against some random fake sites, it could be interesting to ask openly for anyone who wants a free SEO analysis report of their site, and test-drive some of it against real sites.
So if you want in, just post your URL in the comments of this blog (make sure you are reading this blog from a URL inside http://blogs.msdn.com/carlosag/, otherwise you might be posting comments on some syndicating site). I will only allow the first few sites (if successful, I will start another batch in the future) and I will be doing them one by one over the following days. Make sure to include a way to contact you, whether using the MSDN user infrastructure or an email address, so that I can contact you with the results.
Alternatively, I will also take URLs via Twitter at http://twitter.com/CarlosAguilarM, so hurry up and let me know if you want me to look at your site.
One question that I've been asked several times is: "Is it possible to schedule the IIS SEO Toolkit to run automatically every night?". Other related questions are: "Can I automate the SEO Toolkit so that as part of my build process I'm able to catch regressions on my application?", or "Can I run it automatically after every check-in to my source control system to ensure no links are broken?", etc.
The good news is that the answer is YES! The bad news is that you have to write a bit of code to make it work. Basically, the SEO Toolkit includes a managed-code API to start the analysis just like the user interface does, and you can call it from any application you want using managed code.
In this blog I will show you how to write a simple command-line application that starts a new analysis against the site provided as a command-line argument and processes a few queries after finishing.
The most important type included is a class called WebCrawler. This class takes care of all the process of driving the analysis. The following image shows this class and some of the related classes that you will need to use for this.
The WebCrawler class is initialized through the configuration specified in CrawlerSettings. It also exposes two methods, Start() and Stop(), which start and stop the crawling process in a set of background threads. Through the Report property you can also gain access to the CrawlerReport. The CrawlerReport class represents the results (whether completed or in progress) of the crawling process. It has a method called GetUrls() that returns all the UrlInfo items. UrlInfo is the most important class; it represents a URL that has been downloaded and processed, and it holds all the metadata such as Title, Description, ContentLength, ContentType, and the set of Violations and Links that the URL includes.
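Put together, the heart of such a console program might look like the sketch below. It is based only on the class descriptions above, so treat the constructor signature and the waiting loop as assumptions to verify against the actual assembly:

```csharp
// Sketch: drive a crawl with the SEO Toolkit's managed API.
// Exact member names are assumptions based on the description above.
CrawlerSettings settings = new CrawlerSettings(new Uri(args[0]));
settings.Name = "SEORunner " + DateTime.Now.ToString("yy-MM-dd hh-mm-ss");

WebCrawler crawler = new WebCrawler(settings);
crawler.Start();                       // spins up the worker threads

while (crawler.IsRunning)              // wait for the crawl to finish
{
    System.Threading.Thread.Sleep(1000);
}

CrawlerReport report = crawler.Report; // query the results with LINQ from here
```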
If you are not using Visual Studio, you can just save the code above in a file, call it SEORunner.cs, and compile it using the command line:
C:\Windows\Microsoft.NET\Framework\v3.5\csc.exe /r:"c:\Program Files\Reference Assemblies\Microsoft\IIS\Microsoft.Web.Management.SEO.Client.dll" /optimize+ SEORunner.cs
After that you should be able to run SEORunner.exe and pass the URL of your site as an argument; you will see output like:
Processed - Remaining - Download Size
56 - 149 - 0.93 MB
127 - 160 - 2.26 MB
185 - 108 - 3.24 MB
228 - 72 - 4.16 MB
254 - 48 - 4.98 MB
277 - 36 - 5.36 MB
295 - 52 - 6.57 MB
323 - 25 - 7.53 MB
340 - 9 - 8.05 MB
358 - 1 - 8.62 MB
362 - 0 - 8.81 MB
Start URL: http://www.carlosag.net/
Start Time: 11/16/2009 12:16:04 AM
End Time: 11/16/2009 12:16:15 AM
Status Code summary
OK - 319
MovedPermanently - 17
Found - 23
NotFound - 2
InternalServerError - 1
The most interesting method above is RunAnalysis. It creates a new instance of CrawlerSettings and specifies the start URL. Note that it also specifies that we should consider internal all the pages that are hosted in the same directory or subdirectories. We also set a unique name for the report and use the same directory as the IIS SEO UI does, so that opening IIS Manager will show the reports just as if they had been generated by it. Then we finally call Start(), which starts the number of worker threads specified in the WebCrawler.WorkerCount property, and we wait for the WebCrawler to be done by querying the IsRunning property.
The remaining methods just leverage LINQ to perform a few queries, outputting things like a report aggregating all the URLs processed by status code, and more.
As you can see, the IIS SEO Toolkit crawling APIs let you easily write your own application to start an analysis against your Web site, which can be integrated with the Windows Task Scheduler or with your own scripts or build system to enable continuous integration.
Once the report is saved locally, it can be opened using IIS Manager for further analysis, just like any other report. This sample console application can be scheduled using the Windows Task Scheduler so that it runs every night or at any time you like. Note that you could also write a few lines of PowerShell to automate it from the command line without writing any C# code, but that is left for another post.
I was running out of disk space on C: and was unable to install a small piece of software that I needed, so I decided to clean up a bit. For that I like using WinDirStat (http://windirstat.info/), which very quickly shows you where the big files and folders are. In this case I found that my C:\Windows\winsxs folder was over 12 GB in size. One way to reclaim some of that disk space is to clean up the files that were backed up when a Service Pack was installed. To do that in Windows 7, you can run the following DISM command:
dism /online /cleanup-image /spsuperseded /hidesp
That freed up 4 GB in my machine and now I can move on.
Disclaimer: I only ran this on my Windows 7 machine, where it worked great; I have not tried it on Server SKUs, so run it at your own risk.
A few weeks ago my team released version 2.0 of URL Rewrite for IIS. URL Rewrite is probably the most powerful rewrite engine for Web applications. It gives you many features, including inbound rewriting (i.e. rewrite the URL, redirect to another URL, abort requests, use of maps, and more), and in version 2.0 it also includes outbound rewriting, so that you can rewrite URLs or any markup as the content is being sent back, even if it is generated using PHP, ASP.NET, or any other technology.
It also includes a very powerful user interface that allows you to test your regular expressions and, even better, a set of templates for common types of rules. Some of those rules are incredibly valuable for SEO (Search Engine Optimization) purposes. The SEO rules are:
For more information on the SEO Templates look at: http://learn.iis.net/page.aspx/806/seo-rule-templates/
What is really cool is that you can use the SEO Toolkit to run an analysis against your application, where you will probably get some violations around lowercase URLs, canonical domains, etc. After seeing those, you can use URL Rewrite 2.0 to fix them with one click.
I have personally used it on my Web site; try the following three URLs and all of them will be redirected to the canonical form (http://www.carlosag.net/Tools/CodeTranslator/), and you will see URL Rewrite in action:
Note that in the end those templates just translate to web.config settings that become part of your application and can be XCOPY-deployed with it. This works with ASP.NET, PHP, or any other server technology, including static files. Below is the output of the Canonical Host Name rule which I use in my Web site's web.config.
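The generated rule looks something like the following (this is the typical shape of the Canonical Host Name template's output; the exact rule name and pattern in my web.config may differ slightly):

```xml
<system.webServer>
  <rewrite>
    <rules>
      <!-- Redirect any request whose host is not www.carlosag.net -->
      <rule name="CanonicalHostNameRule" stopProcessing="true">
        <match url="(.*)" />
        <conditions>
          <add input="{HTTP_HOST}" pattern="^www\.carlosag\.net$" negate="true" />
        </conditions>
        <action type="Redirect" url="http://www.carlosag.net/{R:1}" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>
```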
There are many more features I could talk about, but for now this was just a quick SEO-related post.
I have just uploaded a new application that extends IIS Manager 7 on Windows Vista and Windows Longhorn Server, adding a new Reports option that gives you a few reports of server and site activity. Its features include:
Click Here to go to the Download Page
I'm working on a second version that will allow you to create your own queries and configure more options, like chart settings and more.
If you have any suggestions for reports that would be useful, feel free to add them as a comment to this post.