If you’re talking about accessibility, the likely answer is to embed.

If we look at what makes file formats less than optimally accessible, fingers immediately point to Open XML's lack of complex table headers. I agree that letting some structures in your document default to generic markup isn’t optimal. However, ATs are able to report to users an accurate representation of the content of a table in Open XML, even though they cannot always tell the user whether there is any formatting that might help in understanding the table structure. The information available is accurate, but not as informative as it could be.

On the other hand, not providing a mechanism for embedding fonts can cause ATs to deliver both incomplete and inaccurate representations to users. The standards for Open XML and PDF have documented mechanisms for embedding fonts in conformant documents. PDF/A requires font embedding for accessibility.  With ODF, it’s not entirely clear whether embedded fonts are allowed or not, since the concept is not addressed in the standard.

While the embeddability of fonts is controlled by licensing, most mainstream fonts available today allow some level of embedding. Embedding allows all the support needed to print or render characters in a font to travel with the document where the font is used.

Critics of font embedding argue that it can bloat files. This is true. Embedding the data for every character, plus all the other information, in every font you use in a document can create large files. Subsetting the font, which embeds only the information needed to support the characters in the document at the time that the font is embedded, can mitigate this concern to a great extent. There are a few drawbacks to subsetting. Most notable is that if you add text to a document that contains characters outside that subset, you might get some odd character display because the editing application will have to substitute another font for those characters. Most medium to long documents in languages using Latin-based alphabets, however, include a wide enough variety of characters to give subsequent editors of the document plenty to work with. This is a larger concern in languages with larger character sets, such as Japanese and Chinese. In those languages, full embedding is going to be a better bet.
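To make the subsetting drawback concrete, here is a minimal Python sketch. It assumes a subset can be modeled as nothing more than the set of code points present in the document when the font was embedded; real subsetting works on glyphs and font tables, so treat the function names and the model itself as illustrative assumptions, not as how any particular application does it.

```python
def subset_codepoints(document_text: str) -> set[int]:
    """Model an embedded subset as the set of code points used in the document."""
    return {ord(ch) for ch in document_text}


def uncovered_characters(new_text: str, subset: set[int]) -> list[str]:
    """Return the distinct characters in new_text that the subset does not cover."""
    missing = []
    for ch in new_text:
        if ord(ch) not in subset and ch not in missing:
            missing.append(ch)
    return missing


# The font is subset when the document contains only this text.
original = "The little giraffe"
subset = subset_codepoints(original)

# A later editor types new text; these characters would need a substitute font.
missing = uncovered_characters("The little zebra", subset)
print(missing)
```

With a short document like this one, even an ordinary English edit falls outside the subset. A longer document would cover far more of the alphabet, which is why the problem is milder for Latin-based text than for languages with thousands of characters.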

Okay, but is embedding of fonts really that important for accessibility? You bet. Embedding a font in a document allows anyone viewing that document to process the document exactly as the author intended. Whether fully embedded or embedded as a subset, the font information can be vital to ensuring that characters in the document are represented correctly.

Won’t substituting another font work just as well? No. While most fonts have relatively common character mappings, this is not true of all fonts. There are multiple ways to break font substitution as you can see in the following examples.

The most extreme example would be to compare a symbol font like Wingdings with an alphabet font like Calibri. While extreme, this has caused real problems on numerous occasions: bad font substitution involving Wingdings fonts is literally the stuff of which conspiracy legends have been made. To see how incorrect font substitution can affect your text, type any word in Calibri, select it, and reformat it to Wingdings. Instead of letters, your document now shows symbols. This is exactly what would happen if an application, such as a word processing program or a browser, substituted Wingdings for Calibri because Calibri was not installed on your machine.

Now select those symbols and reformat them to another alphabetical font such as Helvetica. Once again, you have the same text as you did in the original Calibri (unless you’ve chosen a proprietary symbol font with weird mappings instead of Wingdings, of course). On the other hand, if you have your content in Wingdings and change the formatting to another symbol font, you may get an entirely different set of symbols than what you expected from having seen that content in Wingdings. If, for instance, you used Wingdings to insert a character that looks like a picture of an open book, that same character code in Webdings instead maps to a symbol that looks like a drawing of a medal. That can really change the meaning.

The reason that you get the same text in most of the alphabetical fonts in most situations is that they use the same or similar mapping conventions for each character code point. Wingdings uses the same character code points to map to entirely different types of content. As I said, symbol fonts versus alphabetical fonts is an extreme example of what can happen. But notice that I said “most of the alphabetical fonts in most situations” use the same or similar mapping conventions. Not all do. And many of those differences can be subtle, and can either change or destroy meaning in text.
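The mapping idea can be sketched in a few lines of Python. The three "fonts" below are hypothetical toy tables from code point to a glyph description, not the real character maps of Calibri, Helvetica, Wingdings, or any other font. The only thing the sketch shows is that the stored code points stay the same while the font decides what the reader sees.

```python
# Hypothetical, simplified character maps: code point -> glyph description.
ALPHA_FONT_A = {0x43: "letter C", 0x61: "letter a", 0x74: "letter t"}
ALPHA_FONT_B = {0x43: "letter C", 0x61: "letter a", 0x74: "letter t"}
SYMBOL_FONT = {0x43: "symbol 1", 0x61: "symbol 2", 0x74: "symbol 3"}


def render(text: str, font: dict[int, str]) -> list[str]:
    """Look up each stored code point in the font's map."""
    return [font.get(ord(ch), "missing glyph") for ch in text]


def substitution_is_safe(text: str, original: dict[int, str],
                         substitute: dict[int, str]) -> bool:
    """A substitution preserves meaning only if every glyph matches."""
    return render(text, original) == render(text, substitute)


word = "Cat"
print(substitution_is_safe(word, ALPHA_FONT_A, ALPHA_FONT_B))
print(substitution_is_safe(word, ALPHA_FONT_A, SYMBOL_FONT))
print(render(word, SYMBOL_FONT))
```

The document stores exactly the same three code points in every case. An application that falls back to the symbol table has no way of knowing that the result no longer says what the author wrote.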

One of the most common occurrences these days of inappropriate font substitution pops up when companies that use proprietary fonts for their logos forget either to embed the logo font in every document or to turn the logo into an image so that no substitution occurs. This can point to bad font substitutions in the “main” area of the font, or it can indicate that the company is using the Unicode Private Use Area (PUA) to display a logo or other proprietary character. Take for example the “points” character that Microsoft uses to represent units for purchasing Zune or Xbox content. The character is F001 in the Zune UI fonts, which is great if you have those fonts on your system or we’ve done our job to ensure correct embedding of the character. If not, you might see a “?” character, a square box of some sort, or a character that lives at that location in some font that you do have on your system. For AT devices, this mapping will, again, destroy the meaning of the text. AT devices won’t always know how to present PUA characters to the user, but at least with font embedding a device stands a chance to present the user with common options … and get it right sometimes.
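Because PUA code points have no standard meaning, a document checker could at least flag them so the author knows the font has to travel with the file. The following is a minimal sketch, assuming Python's standard `unicodedata` module, which reports private-use characters with the general category `Co`. The function name and the sample string are my own illustrations.

```python
import unicodedata


def private_use_characters(text: str) -> list[tuple[int, str]]:
    """Return (position, code point label) for each private-use character."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"  # "Co" = Other, private use
    ]


# Illustrative text that ends with the F001 code point discussed above.
sample = "Costs 400 \uf001"
flagged = private_use_characters(sample)
print(flagged)
```

A check like this only tells you that a character's meaning depends entirely on the font. It cannot tell you what the character is supposed to be, which is the whole problem when the font is missing.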

The PUA, while used in just about every language by someone, is especially heavily used in countries with non-Latin writing systems, particularly for languages like Japanese and Chinese that have thousands of characters rather than just the 26 letters of the English alphabet and a few accented characters to cover European languages. Imagine that you are Japanese and your name logogram is represented by that F001 code point in the Japanese font you use for your signature or business logo, but when your Danish counterpart receives the file and opens it, the logogram is replaced with the “points” character from the example above. Now you have nonsense in your document, nonsense that replaced an important piece of information in your original.

And one last example of how easy it is for a font to break font substitution. You might have run into this yourself if you were one of the first to adopt Windows 7 when it was released but didn’t immediately download the first update for the OS (KB974431). Windows 7 includes updates to several fonts, including Calibri and Cambria. Without KB974431, when you used these fonts in Office 2010 Technical Beta or Office 2007 to Save as PDF and then opened the resulting PDF file in, for instance, Adobe Reader, ligature pairs such as “ff”, “ti”, and “tt” would have been missing from your document. Oops. It turned out that the updates to Calibri and Cambria use a new GSUB (glyph substitution) lookup. That’s okay, except that the APIs that Office uses for Save as PDF didn’t know they needed to handle this new lookup, and Adobe Reader didn’t know that it needed to do anything different with these fonts. That was fixed in KB974431 by updating the font substitution and embedding APIs for Windows 7 and Windows Server 2008 R2. Enabling the AT to present “the little giraffe” instead of “the li  le gira  e” goes a long way in allowing all users to understand your meaning.
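As a rough illustration of the effect, here is a toy Python model. It is not how GSUB lookups, the Office export APIs, or Adobe Reader actually work. It simply assumes that an exporter which cannot handle a ligature pair leaves blanks in its place, and the `handles_ligatures` flag is a hypothetical stand-in for "has the fix".

```python
LIGATURE_PAIRS = ("ff", "ti", "tt")  # the pairs named in the text above


def export_text(text: str, handles_ligatures: bool) -> str:
    """Toy exporter: drop ligature pairs it does not know how to handle."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in LIGATURE_PAIRS:
            # An exporter that understands the lookup keeps the letters;
            # one that does not leaves blanks where the pair should be.
            out.append(pair if handles_ligatures else "  ")
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


sentence = "the little giraffe"
print(export_text(sentence, handles_ligatures=True))
print(export_text(sentence, handles_ligatures=False))
```

The second call produces the gap-ridden string quoted above, which is all a screen reader would have to work with.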

Without font embedding, ATs will struggle or completely fail, through no fault of their own, to present usable content to their users. With characters in the PUA ranges, an AT may not know in all instances how to present a user with the correct information, but they at least stand a chance with font embedding. Why should a blind user, for instance, have to try and guess what “the li  le gira  e” means? Why should anyone have to guess at the contents of a document? Font substitution can’t guarantee that the information presented to users will ever be right. Implementers can hope, but can never be sure.

ISO 19005-1 (PDF/A) states the value of font embedding quite well in the introductory phrase to its font section: Font embedding is important to “ensure that the accessible rendering of the textual content of a conforming file matches, on a glyph by glyph basis, the static appearance of the file as originally created and to allow the recovery of semantic properties for each character of the textual content”.

If glyph-by-glyph rendering can’t be guaranteed, a file format cannot be inherently accessible.