Are you an developer/owner/publisher/etc of a site that uses HTTPS (SSL) for secure access? If you are, please continue to read.
Have you ever visited a Web site that is secured using SSL (Secure Sockets Layer) just to get an ugly Security Warning message like:
Do you want to view only the webpage content that was delivered securely?
This webpage contains content that will not be delivered using a secure HTTPS connection, which could compromise the security of the entire webpage.
How frustrating is this for you? Do you think that end-users know what is the right answer to the question above? Honestly, I think it actually even feels like the Yes/No buttons and the phrasing of the question would cause me to click the wrong option.
What this warning is basically trying to tell the user is that even though he/she navigated to a page that you thought was secured by using SSL, the page is consuming resources that are coming from an unsecured location, this could be scripts, style-sheets or other types of objects that could potentially pose a security risk since they could be tampered on the way or come from different locations.
As a site owner/developer/publisher/etc should always make sure that you are not going to expose your customers to such a bad experience, leaving them with an answer that they can’t possibly choose right. For one if they ‘choose Yes’ they will get an incomplete experience being broken images, broken scripts or something worse; otherwise they can ‘choose No’ which is even worse since that means you are actually teaching them to ignore this warnings which could indeed in some cases be real signs of security issues.
Bottom-line it should be imperative that any issue like this should be treated as a bug and fixed in the application if possible.
But the big question is how do you find these issues? Well the answer is very simple yet extremely time consuming, just navigate to every single page of your site using SSL and as you do that examine every single resource in the page (styles, objects, scripts, etc) and see if the URL is pointing to a non-HTTPS location.
The good news is that using the SEO Toolkit is extremely simple to find these issues.
Using the IIS SEO Toolkit and it powerful Query Engine you can easily detect conditions on your site that otherwise would take an incredible amount of time and that would be prohibitively expensive to do constantly.
In the URL Rewrite forum somebody posted the question "are redirects bad for search engine optimization?". The answer is: not necessarily, Redirects are an important tool for Web sites and if used in the right context they actually are a required tool. But first a bit of background.
A redirect in simple terms is a way for the server to indicate to a client (typically a browser) that a resource has moved and they do this by the use of an HTTP status code and a HTTP location header. There are different types of redirects but the most common ones used are:
Below is an example on the response sent from the server when requesting http://www.microsoft.com/SQL/
One of the most important factors in SEO is the concept called organic linking, in simple words it means that your page gets extra points for every link that external Web sites have linking to your page. So now imagine the Search Engine Bot is crawling an external Web site and finds a link pointing to your page (example.com/some-page) and when it tries to visit your page it runs into a redirect to another location (say example.com/somepage). Now the Search Engine has to decide if it should add the original "some-page" into its index as well as if it should "add the extra points" to the new location or to the original location, or if it should just ignore it entirely. Well the answer is not that simple, but a simplification of it could be:
IIS Search Optimization Toolkit has a couple of rules that look for different patterns related to Redirects. The Beta version includes the following:
So how does it look like? In the image below I ran Site Analysis against a Web site and it found a few of these violations (2 and 3).
Notice that when you double click the violations it will tell you the details as well as give you direct access to the related URL's so that you can look at the content and all the relevant information about them to make the decision. From that menu you can also look at which other pages are linking to the different pages involved as well as launch it in the browser if needed.
Similarly with all the other violations it tries to explain the reason it is being flagged as well as recommended actions to follow for each of them.
IIS Search Engine Optimization Toolkit can also help you find all the different types of redirects and the locations where they are being used in a very easy way, just select Content->Status Code Summary in the Dashboard view and you will see all the different HTTP Status codes received from your Web site. Notice in the image below how you can see the number of redirects (in this case 18 temporary redirects and 2 permanent redirects). You can also see how much content they accounted for, in this case about 2.5 kb (Note that I've seen Web sites generate a large amount of useless content in redirect traffic, speaking of spending in bandwidth). You can double click any of those rows and it will show you the details of the URL's that returned that and from there you can see who links to them, etc.
So going back to the original question: "are redirects bad for Search Engine Optimization?". Not necessarily, they are an important tool used by Web application for many reasons such as:
Just make sure you don't abuse them by having redirects to redirects, unnecessary redirects, infinite loops, and use the right semantics.
IIS 7.0 Failed Request Tracing (for historical reasons internally we refer to it as FREB, since it used to be called Failed Request Event Buffering, and there are no "good-sounding-decent" acronyms for the new name) is probably the best diagnosing tool that IIS has ever had (that doesn't require Debugging skills), in a simplistic way it exposes all the interesting events that happen during the request processing in a way that allows you to really understand what went wrong with any request. To learn more you can go to http://learn.iis.net/page.aspx/266/troubleshooting-failed-requests-using-tracing-in-iis7/.
What is not immediately obvious is that you can use this tracing capabilities from your ASP.NET applications to output the tracing information in our infrastructure so that your users get a holistic view of the request.
When you are developing in ASP.NET there are typically two Tracing infrastructures you are likely to use, the ASP.NET Page Tracing and the System.Diagnostics Tracing. In recent versions they have been better integrated (attribute writeToDiagnosticsTrace) but still you want to know about both of them.
Today I'll just focus on logging ASP.NET Tracing to FREB, and in a future post I will show how to do it for System.Diagnostics Tracing.
To send the ASP.NET Tracing to FREB you just need to enable ASP.NET tracing, use the ASPNET trace provider and you will get those entries in the FREB log. The following web.config will enable FREB and ASP.NET Tracing. (Note that you need to go to the Default Web Site and Enable Failed Request Filtering so that this rules get executed)
Now if you have a sample page like the following:
The result is that in \inetpub\logs\FailedReqLogsFiles\ you will get an XML file that includes all the details of the request including the Page Traces from ASP.NET. Note that we provide an XSLT transformation that parses the Xml file and provides a friendly view of it where it shows different views of the trace file. For example below only the warning is shown in the Request Summary view:
There is also a Request Details view where you can filter by all the ASP.NET Page Traces that includes both of the traces we added in the Page code.
A lot of sites today have the ability for users to sign in to show them some sort of personalized content, whether its a forum, a news reader, or some e-commerce application. To simplify their users life they usually want to give them the ability to log on from any page of the Site they are currently looking at. Similarly, in an effort to keep a simple navigation for users Web Sites usually generate dynamic links to have a way to go back to the page where they were before visiting the login page, something like: <a href="/login?returnUrl=/currentUrl">Sign in</a>.
If your site has a login page you should definitely consider adding it to the Robots Exclusion list since that is a good example of the things you do not want a search engine crawler to spend their time on. Remember you have a limited amount of time and you really want them to focus on what is important in your site.
Out of curiosity I searched for login.php and login.aspx and found over 14 million login pages… that is a lot of useless content in a search engine.
Another big reason is because having this kind of URL's that vary depending on each page means there will be hundreds of variations that crawlers will need to follow, like /login?returnUrl=page1.htm, /login?returnUrl=page2.htm, etc, so it basically means you just increased the work for the crawler by two-fold. And even worst, in some cases if you are not careful you can easily cause an infinite loop for them when you add the same "login-link" in the actual login page since you get /login?returnUrl=login as the link and then when you click that you get /login?returnUrl=login?returnUrl=login... and so on with an ever changing URL for each page on your site. Note that this is not hypothetical this is actually a real example from a few famous Web sites (which I will not disclose). Of course crawlers will not infinitely crawl your Web site and they are not that silly and will stop after looking at the same resource /login for a few hundred times, but this means you are just reducing the time of them looking at what really matters to your users.
If you use the IIS SEO Toolkit it will detect the condition when the same resource (like login.aspx) is being used too many times (and only varying the Query String) and will give you a violation error like: Resource is used too many times.
There are a few fixes, but by far the best thing to do is just add the login page to the Robots Exclusion protocol.
To summarize always add the login page to the robots exclusion protocol file, otherwise you will end up:
By default in Windows Server 2008 when you are using the Web Management Service (WMSVC) and Web Deploy (also known as MSDeploy) it will use Basic authentication to perform your deployments. If you want to enable Windows Authentication you will need to set a registry key so that the Web Management Service also supports using NTLM. To do this, update the registry on the server by adding a DWORD key named "WindowsAuthenticationEnabled" under HKEY_LOCAL_MACHINE\Software\Microsoft\WebManagement\Server, and set it to 1. If the Web Management Service is already started, the setting will take effect after the service is restarted.
For more details on other configuration options see:
This is the second note of the series:
1: Moving a SitemapPath Control to ASP.NET Web Pages
My current Web Site was built using ASP.NET 2.0 and WebForms, that means that all of my pages have the extension .aspx. While moving each page to use ASP.NET Web Pages their extension is being changed to .cshtml, and while I’m sure I could configure it in a way to get them to keep their aspx extensions it is a good opportunity to “start clean”. Furthermore, in ASP.NET WebPages you can also access them without the extension at all, so if you have /my-page.cshtml, you can also get to it using just /my-page. Given I will go through this migration I decided to use the clean URL format (no extension) and in the process get better URLs for SEO purposes, for example, today one of the URLs look like http://www.carlosag.net/Articles/configureComPlus.aspx but this would be a good time to enforce lower-case semantics and also get rid of those ugly camel casing and get a much more standard a friendly format for Search Engines using “-“, like: http://www.carlosag.net/articles/configure-com-plus.aspx.
The risk of course is that if you just change the URLs of your site you will end up not only with lots of 404’s (Not Found), but your page ranking will be reset and you will loose all the “juice” that external links and history have provided to it. The right way to do this is to make sure that you perform a permanent redirect (301) from the old URL to the new URL, this way Search Engines (and browsers) will know that the content has permanently moved to a new location so they should “pass all the page ranking” to the new page.
There are many ways to achieve this, but I happen to like URL Rewrite a lot, so I decided to use it. To do that I basically created one rule that uses a Rewrite Map (think of it as a Dictionary) to match the URL and if it matches it will perform a permanent redirect to the new one. So for example, if /aboutme.aspx is requested, then it will 301 to /about-me:
Note that I could have also created a simple rule that would change the extension to cshtml, however I decided that I also wanted to change the page names. The best thing is that you can do it incrementally and only rewrite them once your new page is ready or even switch back to the old one later if any problems occur.
Using URL Rewrite you can easily keep your SEO and pages without broken links. You can also achieve lots more, check out: SEO made easy with IIS URL Rewrite 2.0 SEO templates – CarlosAg
This is the third post on the series:
2: Use URL Rewrite to maintain your Page rankings (SEO)
ASP.NET has a nice feature to help for deployment processes where you can drop an HTML file named app_offline.htm and it will unload all assemblies and code that it has loaded letting you easily delete binaries and deploy the new version while still serving back to customers the friendly message that you provide telling them that your site is under maintenance.
One caveat though, is that Internet Explorer users might still see the “friendly” error that they display and not your nice message. This happens because of a page size validation that IE performs. See Scott’s blog on how to workaround that problem: App_Offline.htm and working around the IE Friendly Errors
Note: The live site is now running in .NET 4.0 and all using Razor.
A couple of years ago a friend of mine introduced me to a game called Sudoku, and immediately I loved it. As any good game its rules are very simple, basically you have to lay out the numbers from 1 to 9 horizontally in a row without repeating them, while at the same time you have to layout the same 1 to 9 numbers vertically in a column, and also within a group (a 3x3 square).
After that, every time I had to take a flight I got addicted to buying a new puzzles magazine that would entertain me for the flight. On December 2006 while flying to Mexico I decided to change the tradition and instead build a simple Sudoku game that I could play any time I felt like doing it without having to find a magazine store and that turned into this simple game. It is not yet a great game since I haven't had time to finalize it, but I figure I would share it anyway in case someone finds it fun.
Click Here to go to the Download Page
During the holidays my wife and I went back to visit our families in Mexico City where we are originally from. Again, during the flights I had enough spare time to build a couple of my favorite games, Backgammon and Connect4.
I've already built both games for Windows using Visual Basic 5 almost 11 years ago but as you would imagine I was far from feeling proud of the implementation. So this time I started from scratch and ended up with what I think are better versions of them (still not the best code, but pretty decent for just a few hours of coding). In fact the AI for the Backgammon version is a bit better and the Connect4 is faster and more suited for a Mobile device.
You can go with your PDA/Smartphone to http://www.carlosag.net/mobile/ to install both games or just click the images below to take you to the install page of each of them. Enjoy and feel free to add any feedback/features as comments to this blog post.
The one thing I learned during the development of these versions is that you do want to download the Windows Mobile 6 SDK if you are going to target that version (which is what my cell phone has), since it will add new Visual Studio 2005 Project Templates and new Emulator images which will help you a lot. For example I was trying to use buttons in my forms, and testing it in Pocket PC worked, but as soon as I tried them in my cell phone it crashed with a NotSupportedException. When I installed the SDK and switched to target that platform, Visual Studio immediately warned me that my platform didn't supported buttons which was great.
Bottom line I'm more and more amazed of how easy it is to build games in Windows Mobile and the things you can achieve with both Windows Mobile and the .NET Compact Framework.
One thing that I’ve been asked several times about the SEO Toolkit is if it does a full standards validation on the markup and content that is processed, and if not, to add support for more comprehensive standards validation, in particular XHTML and HTML 4.01. Currently the markup validation performed by the SEO Toolkit is really simple, its main goal is to make sure that the markup is correctly organized, for example that things like <b><i>Test</b></i> are not found in the markup, the primary reason is to make sure that basic blocks of markup are generally "easy" to parse by Search Engines and that the semantics will not be terribly broken if a link, text or style is not correctly closed (since all of them would affect SEO).
So the first thing I would say is that we have heard the feedback and are looking at what we could possibly add in future versions, however why wait, right?
One thing that many people do not realize is that the SEO Toolkit can be extended to add new violations, new metadata and new rules to the analysis process and as such during a demo I gave a few weeks ago I decided to write a sample on how to consume the online W3C Markup Validation Service from the SEO Toolkit.
You can download the SEOW3Validator including the source code at http://www.carlosag.net/downloads/SEOW3Validator.zip.
To run it you just need to:
You should be able to now run the SEO Toolkit just as before but now you will find new violations, for example in my site I get the ones below. Notice that there are a new set of violations like W3 Validator – 68, etc, and all of them belong to the W3C category. (I would have liked to have better names, but the way the W3 API works is not really friendly for making this any better).
And when double clicking any of those results you get the details as reported by the W3 Validation Service:
The code is actually pretty simple, the main class is called SEOW3ValidatorExtension that derives from CrawlerModule and overrides the Process method to call the W3C Validation service sending the actual markup in the request, this means that it does not matter if your site is an Intranet or in the Internet, it will work; and for every warning and error that is returned by the Validator it will add a new violation to the SEO report.
The code looks like this:
I created a helper class W3Validator that basically encapsulates the consumption of the W3C Validation Service, the code is far from what I would like it to be however there are some "interesting" decisions on the way the API is exposed, I would have probably designed the service differently and not return the results formatted in HTML when this is actually an API/WebService that can be presented somewhere else than a browser. So a lot of the code is to just re-format the results to look "decent", but to be honest I did not want to spend too much time on it so everything was put together quite quickly. Also, if you look at the names I used for violations, I did not want to hard-code specific Message IDs and since the Error Message was different for all of them even within the same Message ID, it was not easy to provide better messages. Anyway, overall it is pretty usable and should be a good way to do W3 Validation.
Note that one of the cool things you get for free is that since these are stored as violations, you can then re-run the report and use the Compare Report feature to see the progress while fixing them. Also, since they are stored as part of the report you will not need to keep running the validator over and over again but instead just open it and continue looking at them, as well as analyzing the data in the Reports and Queries, and be able to export them to Excel, etc.
Hopefully this will give you a good example on some of the interesting things you can achieve with the SEO Toolkit and its extensibility.