International Mailto URIs in IE7

IEBlog

Windows Internet Explorer Engineering Team Blog

Introduction

New to IE7 is more reliable and standards-compliant support for international mailto URIs. This post will describe how users, application developers, and web developers can use this new feature of IE7.

The following is a simple example of a mailto URI which, when clicked, will launch your default email client to send a new message:

mailto:name@example.com

In IE6, mailto URIs containing characters not found in your system codepage may or may not appear in your mail client correctly, depending on the codepage of the document containing the mailto URI. But in IE7, mailto URIs with characters not found in your system codepage are handled in a standards-compliant manner, which will work regardless of the codepage of the document containing the mailto URI.

How to Use International Mailto URIs in IE7

For those of you who aren’t interested in how this feature works, I’ll describe how to use it first. It requires two setup steps. First, you must have a mail program that correctly handles these mailto URIs, such as Outlook 2007 or Mozilla Thunderbird. Second, you must ensure that the ‘Use UTF-8 for mailto links’ checkbox on the ‘Advanced’ tab of your Internet Options is checked. Note that you must restart IE after changing this checkbox for the change to take effect.

[Screenshot: Mailto URIs Settings]

Once you have a mail program that correctly interprets international mailto URIs and you have checked the checkbox you can reliably use mailto URIs that contain non US-ASCII characters.

Mailto URI Syntax Quick Overview

The mailto URI scheme allows you to specify various parameters associated with the creation of an email, including, among other things, the recipients, the subject, and the body. For example, the following mailto URI specifies an email that is sent to name@example.com, with the subject ‘mailto URIs’ and the body ‘I read your mailto blog post’:

mailto:name@example.com?subject=mailto%20URIs&body=I%20read%20your%20mailto%20blog%20post
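If you generate links like this from code, the query part can be assembled mechanically. The following is a minimal Python sketch, not anything IE does; the helper name `build_mailto` is hypothetical. It only covers the US-ASCII case shown above, since `quote()` would emit percent-encoded UTF-8 for any other character.

```python
from urllib.parse import quote


def build_mailto(address, **fields):
    """Assemble a mailto URI from an address and optional header fields.

    Hypothetical helper for illustration; intended for US-ASCII values only.
    """
    # quote() percent-encodes a space as %20 (not '+'), matching the example above.
    query = "&".join(
        f"{name}={quote(value, safe='')}" for name, value in fields.items()
    )
    return f"mailto:{address}" + (f"?{query}" if query else "")


uri = build_mailto(
    "name@example.com",
    subject="mailto URIs",
    body="I read your mailto blog post",
)
print(uri)
```

Keyword arguments keep their order, so the subject appears before the body just as in the hand-written URI.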

This is just a simple example of the syntax. For more information on mailto URIs, see the mailto URI scheme standard (RFC 2368). What that standard doesn’t cover is how to include characters outside of US-ASCII in your addresses or body content. The IRI standard (RFC 3987) covers including non US-ASCII characters in URIs in general, and that was the basis for this feature in IE7. Additionally, there’s a draft of a new mailto URI scheme standard that will obsolete the current one, and it includes specific information about non US-ASCII characters in mailto URIs.

How It Works In IE7

When IE7 is configured as described above and you click on a mailto URI containing non US-ASCII characters, those characters are converted to UTF-8 and then percent-encoded, and the newly encoded URI is passed to the mail client.

For example, consider the following mailto URI that contains a ‘®’ character in the subject line:

mailto:name@example.com?subject=Microsoft®

If you click on this link, IE will start your mail client with the following mailto URI:

mailto:name@example.com?subject=Microsoft%C2%AE

This is because the character ‘®’ is represented in UTF-8 by the byte sequence {C2, AE} which after being percent-encoded is the text ‘%C2%AE’.
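The conversion described above can be sketched in a few lines of Python. This is only an illustration of the transformation, not IE7's actual implementation, and the function name `percent_encode_non_ascii` is hypothetical.

```python
def percent_encode_non_ascii(uri: str) -> str:
    """Replace each non-US-ASCII character with its percent-encoded UTF-8 bytes."""
    out = []
    for ch in uri:
        if ord(ch) < 0x80:
            out.append(ch)  # US-ASCII passes through untouched
        else:
            # One %XX triplet per UTF-8 byte of the character.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


# "\u00AE" is the registered sign used in the example above.
encoded = percent_encode_non_ascii("mailto:name@example.com?subject=Microsoft\u00AE")
print(encoded)
```

Characters that are already US-ASCII, including any existing percent-encoded sequences, are left exactly as they were.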

No special consideration is made for IDN host names. If the mailto URI specifies an address that has the ASCII version of an IDN host, the address will be passed through without conversion to or from IDN. For example, the following mailto URI will be passed unchanged to the mail client:

mailto:name@xn--lba.example.com

If an address contains the Unicode version of an IDN host, it will be converted to percent-encoded UTF-8 as in the first example and not to or from IDN. For example, IE7 will convert the following mailto URI from:

mailto:name@®.example.com

To the following when passing it to a mail client:

mailto:name@%C2%AE.example.com

If you are an application developer and would like to handle mailto URIs from IE7 appropriately, you can find out whether IE7 will be sending you these new-style international mailto URIs by checking the following registry value:

[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings\Protocols\Mailto]
"UTF8Encoding"=dword:00000001

This registry key changes with the checkbox described in the ‘How to Use International Mailto URIs in IE7’ section. When this registry key is set to 1, IE7 will send new style mailto URIs as described in this section. Otherwise, if the value is 0 or missing, IE will send old style mailto URIs as described in the next section.
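As a rough illustration of that check, here is a Python sketch using the standard `winreg` module, which exists only on Windows. The function names are hypothetical, and treating any read failure as "missing" is an assumption of this sketch rather than documented behavior.

```python
import sys

KEY_PATH = r"Software\Microsoft\Windows\CurrentVersion\Internet Settings\Protocols\Mailto"
VALUE_NAME = "UTF8Encoding"


def flag_means_utf8(value):
    """Interpret the raw registry value: 1 means new-style URIs, anything else old-style."""
    return value == 1


def ie7_sends_utf8_mailto():
    """Read the per-user setting. Works only on Windows."""
    import winreg  # standard library, Windows only, so imported lazily

    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
    except OSError:
        return False  # key or value missing: assume old-style URIs
    return flag_means_utf8(value)


if sys.platform == "win32":
    print("UTF-8 mailto URIs enabled:", ie7_sends_utf8_mailto())
```

Because IE only picks up the checkbox after a restart, a mail client would probably want to read this value each time it receives a URI instead of caching it.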

Legacy Mode

If you use an earlier version of IE or do not configure IE7 to use standards-compliant mailto URIs, the behavior involving non US-ASCII characters in mailto URIs is more complicated.

When you click on a mailto URI with non US-ASCII characters in it, the URI is passed to the mail client encoded in the codepage corresponding to the character encoding of the document in which the URI was found. If a document doesn’t explicitly specify its character encoding then one is picked based on the content in the document and the current system codepage. You can view and change the character encoding of the current document by going to the IE menu item ‘View’ then ‘Encoding’. For more information on character encodings check out the excellent document W3C I18N Tutorial: Character sets & encodings in XHTML, HTML and CSS.

For example, the following HTML snippet has a mailto URI with a non US-ASCII character that I’ve picked arbitrarily (Unicode character U+3113 named ‘Bopomofo Letter ZH’).

<a href="mailto:name@example.com?subject=&#x3113;">example</a>

Suppose the same HTML document specifies its character encoding to be the Big5 encoding in the following fashion:

<meta http-equiv="Content-Type" content="text/html; charset=big5"/>

The Unicode character U+3113 is represented in Big5 by the byte sequence {A3, A4}. When sent to the mail client the mailto URI will be converted to Big5 with the non US-ASCII character converted to the byte sequence {A3, A4}.

Suppose instead that the HTML document specified its character encoding to be the GB2312 encoding like so:

<meta http-equiv="Content-Type" content="text/html; charset=gb2312"/>

The Unicode character U+3113 is represented in GB2312 by the byte sequence {A8, D3}. When sent to the mail client, the mailto URI will be converted to GB2312 with the non US-ASCII character converted to the byte sequence {A8, D3}.

This is a problem for the mail client handling the mailto URIs, since it has no guarantee of how non US-ASCII characters have been encoded. Versions of Outlook prior to 2007 assume the system codepage was used to encode the URI. This means the scenario works with older versions of Outlook only if the document you’re viewing has the same character encoding as your current system codepage.
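The two byte sequences, and what happens when a client guesses the wrong codepage, can be reproduced with Python's built-in codecs. This is only a sketch of the ambiguity; it assumes Python's `big5` and `gb2312` codecs use the same mappings as the Windows codepages for this character.

```python
char = "\u3113"  # BOPOMOFO LETTER ZH

# The same character yields different bytes depending on the document's encoding.
big5_bytes = char.encode("big5")
gb2312_bytes = char.encode("gb2312")
print(big5_bytes.hex(), gb2312_bytes.hex())

# A client that assumes GB2312 but receives Big5 bytes recovers a different
# character; errors="replace" keeps the sketch from raising on invalid input.
misread = big5_bytes.decode("gb2312", errors="replace")
print("round trip preserved:", misread == char)
```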

Authoring International Mailto URIs

When you put an international mailto URI in a document, represent the non US-ASCII characters using the encoding of the document rather than percent-encoding the non US-ASCII characters in UTF-8. This will allow the browser to handle the mailto URI in the manner the user expects.

For example, if you wanted to include the previous example mailto URI in an HTML document, you should use an HTML numeric character reference to represent the non US-ASCII character as ‘&#x3113;’.

<a href="mailto:name@example.com?subject=&#x3113;">example</a>

This will work in both IE7 and Legacy Mode under the conditions described in the previous sections.
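If your pages are generated by a script, one way to produce such a link is to let the script emit numeric character references for you. The Python sketch below is an assumption about how you might do that, with a hypothetical helper name `mailto_href`; it emits decimal references (‘&#12563;’), which are equivalent to the hexadecimal form ‘&#x3113;’.

```python
import html


def mailto_href(uri: str) -> str:
    """Return an attribute-safe, ASCII-only version of a mailto URI for HTML."""
    # Escape &, <, > and quotes first so the value is safe inside href="...".
    escaped = html.escape(uri, quote=True)
    # Then turn every non-US-ASCII character into a numeric character reference.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


href = mailto_href("mailto:name@example.com?subject=\u3113")
print(f'<a href="{href}">example</a>')
```

The output is plain US-ASCII, so it reads the same whatever ASCII-compatible encoding the document declares, and the browser still sees the original character once it parses the reference.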

You shouldn’t use percent-encoded UTF-8 to represent the non US-ASCII character as ‘%E3%84%93’ as it will not work in Legacy Mode:

<a href="mailto:name@example.com?subject=%E3%84%93">example</a>

If you use percent-encoded UTF-8 for the non US-ASCII characters, the browser will not modify the mailto URI and it will be passed straight to the mail client. If that mail client is an older version of Outlook then the non US-ASCII characters won’t be interpreted correctly.

The point is that when the non US-ASCII character is represented directly in the document’s encoding, the browser is given a chance to convert it into something the mail client understands. So the end user is more likely to be able to use that URI even if they’re using a previous version of IE or Outlook.

Conclusion

In this post, I’ve described how international mailto URIs are handled by IE6 and the improvements we’ve made for IE7 when a standards-compliant mail client is installed.

If you have any questions or comments on this topic, please leave me a note in the comments section.

Dave Risney
Software Design Engineer
