XAML FlowDocument to HTML Conversion Prototype

XAML FlowDocuments and HTML have some things in common, but they also have distinct differences that make writing a conversion utility tricky. A well-written XSLT stylesheet could potentially process XHTML input and generate FlowDocument content, but this presupposes well-formed HTML in the first place. I've tried to go down this road on a few occasions with limited success.
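
To make the XSLT route concrete, here is a minimal sketch, assuming well-formed XHTML input. The stylesheet, the class name XhtmlToFlowDocumentSketch, and the three-element mapping (body, p, b) are illustrative assumptions and are not part of the attached prototype. It uses only the standard System.Xml and System.Xml.Xsl classes.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XhtmlToFlowDocumentSketch
{
    // Hypothetical stylesheet: maps a tiny subset of XHTML (body, p, b)
    // to FlowDocument elements in the WPF presentation namespace.
    private const string Stylesheet = @"<?xml version='1.0'?>
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:h='http://www.w3.org/1999/xhtml'
    xmlns='http://schemas.microsoft.com/winfx/2006/xaml/presentation'
    exclude-result-prefixes='h'>
  <xsl:template match='/'>
    <FlowDocument><xsl:apply-templates select='//h:body/*'/></FlowDocument>
  </xsl:template>
  <xsl:template match='h:p'>
    <Paragraph><xsl:apply-templates/></Paragraph>
  </xsl:template>
  <xsl:template match='h:b'>
    <Bold><xsl:apply-templates/></Bold>
  </xsl:template>
</xsl:stylesheet>";

    public static string Transform(string xhtml)
    {
        // Compile the stylesheet from the embedded string.
        var xslt = new XslCompiledTransform();
        using (var stylesheetReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(stylesheetReader);
        }

        var output = new StringWriter();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };

        // The XML reader throws XmlException on input that is not well formed,
        // which is exactly the limitation of this approach for real-world HTML.
        using (var input = XmlReader.Create(new StringReader(xhtml)))
        using (var writer = XmlWriter.Create(output, settings))
        {
            xslt.Transform(input, writer);
        }
        return output.ToString();
    }

    public static void Main()
    {
        string xhtml =
            "<html xmlns='http://www.w3.org/1999/xhtml'>" +
            "<body><p>Hello <b>world</b></p></body></html>";
        Console.WriteLine(Transform(xhtml));
    }
}
```

Feeding this sketch typical HTML with an unclosed tag (for example, a p element that is never closed) raises an XmlException before any transformation happens, which illustrates why the stylesheet approach only goes so far.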

Since most HTML isn't well formed, a more flexible solution was to build a conversion library. The attached application contains class libraries capable of converting from HTML to FlowDocument, or from FlowDocument to HTML. I can't emphasize enough that this is simply a prototype -- true fidelity of content is not promised nor is it expected. However, if you're interested in playing with (and potentially improving upon) a conversion prototype, you'll find the attached project very useful. The user interface is basic -- simply a TextBox into which you can paste content for conversion. The converted content appears in the same TextBox after the "Convert!" button is pressed.

These classes can also be used to process the entire contents of a folder and turn all of the HTML contained therein to XAML. We're using a similar technique for our updated version of the SDKViewer demo that will ship with the RC1 SDK (so content will be up to date in future versions of the application, unlike the present circumstance). You could do something similar, using a foreach loop, like the following:

C#

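using System.IO;
using System.Text;

// filepath is the folder that contains the HTML files to convert.
Directory.CreateDirectory("test");
foreach (string htmlPath in Directory.GetFiles(filepath))
{
    // Read the HTML source and convert it to FlowDocument XAML.
    string html = File.ReadAllText(htmlPath);
    string xaml = HtmlToXamlConverter.ConvertHtmlToXaml(html, true);
    // GetFiles returns paths that include the directory, so keep only the file name.
    string outputPath = Path.Combine("test", Path.GetFileName(htmlPath) + ".xaml");
    File.WriteAllText(outputPath, xaml, Encoding.UTF8);
}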

This conversion library isn't perfect -- but it gives you a big head start if your enterprise is considering conversion from HTML to FlowDocuments. Converting your HTML content would allow you to take advantage of the enhanced reading capabilities of WPF, including paginated content, magnification, and annotations.

For more information on the WPF document platform, see this SDK topic:

http://windowssdk.msdn.microsoft.com/library/en-us/wpf_conceptual/html/6e8db7bc-050a-4070-aa72-bb8c46e87ff8.asp?frame=true

Good luck.

-Keith


About Us

We are the Windows Presentation Foundation SDK writers and editors.
Published Thursday, May 25, 2006 10:00 AM by wcsdkteam
Attachment(s): html2flow.zip

Comments

# re: XAML FlowDocument to HTML Conversion Prototype

Thanks!
I've been trying to come up with a good solution for this problem for a while (I've got as far as playing with some regular expressions, but it's fragile).
Thursday, May 25, 2006 3:40 PM by Kevin Daly

# re: XAML FlowDocument to HTML Conversion Prototype

You might want to have a look at Chris Lovett's SgmlReader class, as it is a slot-in replacement for an XmlReader and will also read an HTML file. Its output can then be piped into an XSL stylesheet and transformed directly into XAML.

Thursday, March 08, 2007 8:18 AM by deadlyvices

# any way to display images?

Hello,

great, but no images?

Sunday, May 27, 2007 5:19 PM by eclere

# re: XAML FlowDocument to HTML Conversion Prototype

Nice but doesn't seem to do table borders any way I can work out

Tuesday, February 26, 2008 5:38 AM by jim_ej