This week I was pleased to run a side session at the MVP Summit on Search Engine Optimisation.  One of the topics I covered was the various SEO tools available for analysing websites and extracting data to understand how the crawlers see individual pages or sections of a site.

Whilst there are many different SEO analysis tools available, providing a variety of features and datasets, I personally favour those which enable me to extract raw data and then manage it in Excel, Access, SQL Server or another tool.

Within the Microsoft support SEO project we have found backlink data extremely useful to analyse.  When I say ‘backlink data’, I am referring to information showing which sites are linking to your site, and which pages on your site they are linking to.  This data can be extremely valuable for identifying additional link building opportunities, finding problems with invalid URLs pointing to your site and discovering the most powerful pages/sections of your site in terms of PageRank.  Since the only way to know which websites are linking to yours is to crawl the web (and it’s pretty big), backlink data can only be generated by search engines or companies who have their own web crawlers.  The good news is that you can get a complete list of backlinks pointing to your website, for free.  In fact, depending on where you get the data from, you can even get more data than just the links themselves…

Method 1: Bing Webmaster Tools

Once your site is validated in your Bing Webmaster Tools account, you will be able to view and export a list of the backlinks pointing to your website which Bing knows about…

image

Unlike the two other solutions below, Bing does not currently provide a list of the individual pages on your site which the backlinks are pointing TO.  However, Bing does provide a nice filtering feature which enables you to see only links coming from a particular domain, subdomain or directory.  For example, here is the list of backlinks for support.microsoft.com, filtered to those coming from www.microsoft.com/uk

image 

Bing will allow you to export the results into Excel or another tool, but it currently limits the export to 1000 results.
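If you want to apply the same kind of domain, subdomain or directory filter to a list you have already exported, a few lines of Python will do it. This is only a rough sketch: it assumes the export has been read into a plain list of source URLs, and the function names `normalise` and `links_from` are my own, not part of any Bing tool.

```python
from urllib.parse import urlsplit


def normalise(url):
    """Return (host, path) for a URL, lower-casing the host.

    A scheme is added when missing so that 'www.example.com/uk' parses
    the same way as 'http://www.example.com/uk'.
    """
    parts = urlsplit(url if "://" in url else "http://" + url)
    return parts.netloc.lower(), parts.path or "/"


def links_from(backlinks, source_prefix):
    """Keep backlinks on the prefix's host whose path sits under the prefix's path."""
    want_host, want_path = normalise(source_prefix)
    want_path = want_path.rstrip("/")
    kept = []
    for url in backlinks:
        host, path = normalise(url)
        # Match the directory itself or anything below it, but not
        # look-alike directories such as /ukraine when filtering on /uk.
        if host == want_host and (path == want_path or path.startswith(want_path + "/")):
            kept.append(url)
    return kept
```

Calling `links_from(urls, "www.microsoft.com/uk")` would then keep only the links coming from that directory, and passing a bare host name keeps everything from that host.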

Method 2: Google Webmaster Tools

Google Webmaster Tools also provides backlink data for your website, and allows you to see exactly which of the pages on your site have the most links…

image

Google will also allow you to export the data, but does not limit you to 1000 backlinks, so you can download EVERY single link which is pointing to your site, along with the URL it is pointing to.  If you have a big site, the file will be pretty large, so you may not be able to load it straight into Excel.  We recently extracted this file for support.microsoft.com (as a comma delimited file) and then imported it into Microsoft Access using the External Data import option…
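If the file is too big for Excel and you don't have Access to hand, a short script can stream through it instead. The sketch below counts backlinks per target page without loading the whole file into memory. The column name `target_url` is an assumption for illustration; check the header row of your own export and adjust it.

```python
import csv
from collections import Counter


def links_per_target(lines, target_column="target_url"):
    """Count backlinks per target page from a comma-delimited export.

    `lines` can be an open file or any iterable of text lines; rows are
    read one at a time, so the file size is not a problem.
    `target_column` is a hypothetical header name.
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row[target_column]] += 1
    return counts


# Example usage (the file name is hypothetical):
#
#     with open("backlinks.csv", newline="", encoding="utf-8") as f:
#         for url, total in links_per_target(f).most_common(10):
#             print(total, url)
```

`most_common(10)` gives the ten most linked-to pages, which is a quick way to get the same "pages with the most links" view from the raw data.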

image

The Google data also contains links from within the same domain as the website.  We recently used this information to analyse the links pointing to http://support.microsoft.com from local (non-English) www.microsoft.com pages, so that we could optimise the links to point to local content, and increase the search relevancy for our international customers.
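To give a feel for that kind of analysis, here is a minimal Python sketch which groups links by the locale of the page they come from. It rests on some loud assumptions: that localised pages carry the locale as the first path segment (such as `/de-de/`), that the segments listed in `ENGLISH_SEGMENTS` are the English ones, and that the data is available as (source, target) pairs. None of this describes how we actually did it in Access; the names are purely illustrative.

```python
import re
from urllib.parse import urlsplit

# Assumption: a locale looks like "de" or "de-de" in the first path segment.
LOCALE_SEGMENT = re.compile(r"^[a-z]{2}(-[a-z]{2})?$")
# Assumption: these segments identify English-language sections.
ENGLISH_SEGMENTS = {"en", "en-us", "en-gb", "uk"}


def source_locale(url, host="www.microsoft.com"):
    """Return the locale segment of a URL on `host`, or None if there isn't one."""
    parts = urlsplit(url)
    if parts.netloc.lower() != host:
        return None
    segments = [s for s in parts.path.lower().split("/") if s]
    if segments and LOCALE_SEGMENT.match(segments[0]):
        return segments[0]
    return None


def non_english_links(pairs):
    """Group (source, target) pairs by the non-English locale of the source page."""
    by_locale = {}
    for source, target in pairs:
        locale = source_locale(source)
        if locale and locale not in ENGLISH_SEGMENTS:
            by_locale.setdefault(locale, []).append((source, target))
    return by_locale
```

Be aware that a simple pattern like this will also match any two-letter directory which is not a locale at all, so the output needs a sanity check before you act on it.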

You get a lot of data when exporting from Google, but it can be very interesting and useful to analyse.  It’s also worth comparing the Bing and Google data, as an indicator of differences in the information the two search engines may have about your site.
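A comparison like this can be as simple as a couple of set operations. The sketch below assumes you have pulled the linking URLs out of each export into two lists; the function name and the dictionary keys are my own.

```python
def compare_sources(bing_urls, google_urls):
    """Split linking URLs into those seen by both engines or by only one."""
    bing = {u.strip() for u in bing_urls}
    google = {u.strip() for u in google_urls}
    return {
        "both": bing & google,
        "bing_only": bing - google,
        "google_only": google - bing,
    }
```

Bear in mind that the Bing export is capped at 1000 results, so on a large site the `google_only` group will be inflated by links which Bing may well know about but did not export.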

Method 3: Majesticseo.com

Majesticseo.com are a company who, like Bing and Google, have their own web crawler and their own web index built from all of the pages they regularly crawl.  The difference is that Majesticseo.com provide the ability to analyse and extract the data within their web index for SEO analysis.

image

Whilst they do provide paid-for services if you are interested in analysing data for other (i.e. competitor) sites, they provide FREE access to data about your own site if you validate yourself as an owner.

They do provide some web based analysis tools, although in my opinion the real power behind the Majestic SEO data comes when you export the file and pull it into your favourite database application.  We now regularly extract this information and load it into SQL Server for analysis.

Like Google, Majestic SEO allows you to export ALL data for your site.  However, they go a step further by providing an ‘ACRank’ value, which is their version of PageRank and provides an indicator of the ‘strength’ of every page on your site in terms of the number, diversity and quality of inbound links pointing to it.  The Majestic SEO ranking value is based on a scale of 0 to 15, rather than 0 to 10 like Google’s PageRank.

We are currently using the Majestic SEO data to identify top ranked pages on support.microsoft.com, and we have had a couple of surprises!  For example, this page is one of the highest ranked pages…
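Once the data is in a database, finding the strongest pages is a one-line query. The Python sketch below uses an in-memory SQLite database purely as a stand-in for SQL Server, and assumes the export has been reduced to (url, ac_rank) pairs. The table and column names are hypothetical and do not reflect Majestic SEO's actual file layout.

```python
import sqlite3


def top_pages(rows, limit=10):
    """Load (url, ac_rank) pairs into a table and return the highest ranked pages."""
    conn = sqlite3.connect(":memory:")
    # ACRank runs from 0 to 15, so reject anything outside that range.
    conn.execute(
        "CREATE TABLE pages ("
        " url TEXT PRIMARY KEY,"
        " ac_rank INTEGER NOT NULL CHECK (ac_rank BETWEEN 0 AND 15))"
    )
    conn.executemany("INSERT INTO pages (url, ac_rank) VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT url, ac_rank FROM pages ORDER BY ac_rank DESC, url LIMIT ?",
        (limit,),
    ).fetchall()
    conn.close()
    return result
```

The same `ORDER BY ac_rank DESC` idea carries over to SQL Server, although the syntax for limiting the number of rows differs there (`TOP` rather than `LIMIT`).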

http://support.microsoft.com/gp/howtoscript

image

…which is simply a page we use to tell customers how to enable scripting if they have it disabled.  It ranks so highly partly because we link to it by default from most of our pages when customers have scripting disabled, and partly because customers have discovered the page and decided to link to it from their own sites to show their users how to enable scripting.

This data has led to many more useful insights for us.  If you are interested in knowing which sections/pages of your site are getting the most link juice flowing into them, I really recommend downloading your Majestic SEO data.

So there you have it, three ways of understanding what the search engine crawlers know about your site.  Enjoy :-) Let me know if you come up with any clever ways of using this data.  I would love to do a follow-up blog post in future!

Author: Chris Moore is a Program Manager working on Search Engine Optimisation at Microsoft.  http://www.twitter.com/chrismdotcom
