If broken it is, fix it you should

Using the powers of the debugger to solve the problems of the world - and a bag of chips    by Tess Ferrandez, ASP.NET Escalation Engineer (Microsoft)

Displaying HTML Content in a RichTextBlock


[This post is a part of a series of posts about the Social Media Dashboard Sample.  For an introductory blog post with information and links to the Social Media Dashboard sample click here]

When you want to display HTML content from a blog RSS feed or another source in a Windows 8 app, you can show it in a WebView. The problem with the WebView is that you can't use your own background or disable zooming, which makes for a rather crude user experience.

One thing you can do is inject CSS, but if you want more control over the content you can display it in a RichTextBlock instead.

The RichTextBlock control doesn’t display HTML as is, so in the Social Media Dashboard Sample we parse the HTML and add native Windows 8 controls to the RichTextBlock.

Defining the Properties.Html Dependency Property

On ScrollingBlogPostDetailPage.xaml we add a RichTextBlock and data bind the HTML content to the rtbx:Properties.Html property like this:

<RichTextBlock x:Name="textContent" FontSize="16" IsTextSelectionEnabled="True" FontFamily="Segoe UI" 
Foreground="{StaticResource AppDarkColor}" rtbx:Properties.Html="{Binding Content}" >

 

For this to work we need to define the attached property Html ourselves, and its change handler is where we parse the content and build the control tree.

The XML namespace rtbx in the sample above is defined as xmlns:rtbx="using:SocialMediaDashboard.Common", where SocialMediaDashboard.Common is the namespace in which we define the Properties class.

The Properties class has an attached property Html, and when this property is set, the HtmlChanged method is called:

public static readonly DependencyProperty HtmlProperty =
DependencyProperty.RegisterAttached("Html", typeof(string), typeof(Properties), new PropertyMetadata(null, HtmlChanged));
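
An attached property is conventionally paired with static Get and Set accessors, which is what lets XAML resolve rtbx:Properties.Html. The post does not show them, so the following is only a sketch of the usual shape. The class name PropertiesSketch is hypothetical, the change handler body is left empty, and it only compiles inside a Windows Store app project.

```csharp
using Windows.UI.Xaml;

// Sketch only: mirrors the sample's Properties class under a hypothetical name.
public sealed class PropertiesSketch
{
    public static readonly DependencyProperty HtmlProperty =
        DependencyProperty.RegisterAttached("Html", typeof(string), typeof(PropertiesSketch),
            new PropertyMetadata(null, HtmlChanged));

    // Accessor pair following the Get<Name>/Set<Name> naming convention.
    public static string GetHtml(DependencyObject obj)
    {
        return (string)obj.GetValue(HtmlProperty);
    }

    public static void SetHtml(DependencyObject obj, string value)
    {
        obj.SetValue(HtmlProperty, value);
    }

    private static void HtmlChanged(DependencyObject d, DependencyPropertyChangedEventArgs e)
    {
        // Parsing and block generation would go here.
    }
}
```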

 

The HtmlChanged method takes the new HTML, generates the blocks for it, and adds them to the RichTextBlock control:

        private static void HtmlChanged(DependencyObject d, DependencyPropertyChangedEventArgs e)
        {
            RichTextBlock richText = d as RichTextBlock;
            if (richText == null) return;

            //Generate blocks
            string xhtml = e.NewValue as string;

            //Build a base link (scheme + host) from the blog post we are bound to
            string baselink = "";
            BlogPostDataItem bp = richText.DataContext as BlogPostDataItem;
            if (bp != null && bp.Link != null)
            {
                baselink = "http://" + bp.Link.Host;
            }

            List<Block> blocks = GenerateBlocksForHtml(xhtml, baselink);

            //Add the blocks to the RichTextBlock
            richText.Blocks.Clear();
            foreach (Block b in blocks)
            {
                richText.Blocks.Add(b);
            }
        }

 

It also does some housekeeping, such as building a base link that we can use to turn relative image links into absolute ones, since many bloggers use relative links. Because we know that in this app the RichTextBlock is always data bound to a BlogPostDataItem, we can take the link from there.
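
HtmlChanged builds the base link by prefixing "http://" to the host, which assumes the blog is served over http and that every relative link starts at the site root. As a possible alternative, System.Uri can resolve a link against the post's own address. The sketch below is not taken from the sample; the LinkResolver class and the ToAbsolute method are hypothetical names.

```csharp
using System;

// Hypothetical helper, not part of the Social Media Dashboard sample.
public static class LinkResolver
{
    // Resolves an image src against the address of the blog post it came from.
    // Falls back to returning src unchanged if it cannot be resolved.
    public static string ToAbsolute(Uri postLink, string src)
    {
        if (postLink == null || string.IsNullOrWhiteSpace(src))
        {
            return src;
        }

        Uri resolved;
        // Absolute addresses come back as they are; relative ones are
        // resolved against postLink.
        if (Uri.TryCreate(postLink, src, out resolved))
        {
            return resolved.AbsoluteUri;
        }
        return src;
    }
}
```

With a post at http://blogs.example.com/tess/post.aspx, a src of /images/a.png resolves to http://blogs.example.com/images/a.png, and a src that is already absolute is returned unchanged.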

The method GenerateBlocksForHtml is where we start the parsing and this will eventually generate all the blocks we need to add to the RichTextBlock.

        private static List<Block> GenerateBlocksForHtml(string xhtml, string baselink)
        {
            List<Block> bc = new List<Block>();

            try
            {
                HtmlDocument doc = new HtmlDocument();
                doc.LoadHtml(xhtml);

                //Turn relative image links into absolute ones
                foreach (HtmlNode img in doc.DocumentNode.Descendants("img"))
                {
                    //Skip images without a src attribute instead of throwing
                    var src = img.Attributes["src"];
                    if (src != null && !src.Value.StartsWith("http"))
                    {
                        src.Value = baselink + src.Value;
                    }
                }

                Block b = GenerateParagraph(doc.DocumentNode);
                bc.Add(b);
            }
            catch (Exception)
            {
                //Best effort: if parsing fails we return the blocks we have (possibly none)
            }

            return bc;
        }

 

Here we generate the starter block (a Paragraph) that contains everything else. The GenerateParagraph method goes through the content of the HTML root node and adds inlines to this one base paragraph.

        private static Block GenerateParagraph(HtmlNode node)
        {
            Paragraph p = new Paragraph();
            AddChildren(p, node);
            return p;
        }

We then go through and add Inline elements for all the children of the root HTML node. This method is later called recursively until we get to the leaf HTML nodes. If none of the children produces an Inline, the node's text is added as a plain Run.

        private static void AddChildren(Paragraph p, HtmlNode node)
        {
            bool added = false;
            foreach (HtmlNode child in node.ChildNodes)
            {
                Inline i = GenerateBlockForNode(child);
                if (i != null)
                {
                    p.Inlines.Add(i);
                    added = true;
                }
            }
            if (!added)
            {
                p.Inlines.Add(new Run() { Text = CleanText(node.InnerText) });
            }
        }
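
CleanText is not shown in this post. Purely as an assumption about what such a helper might do, the sketch below decodes HTML entities and collapses runs of whitespace so that source line breaks do not show up in the rendered text. The class name HtmlTextCleaner is hypothetical, and the real CleanText in the sample may behave differently.

```csharp
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the sample's CleanText helper.
public static class HtmlTextCleaner
{
    public static string CleanText(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return string.Empty;
        }

        // Turn entities such as &amp; back into the characters they stand for.
        string decoded = WebUtility.HtmlDecode(input);

        // Collapse any run of whitespace (spaces, tabs, line breaks) into one space.
        // Leading and trailing spaces are kept because they separate adjacent inlines.
        return Regex.Replace(decoded, @"\s+", " ");
    }
}
```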
 

Depending on the type of the node we generate a Span, LineBreak, Hyperlink, a paragraph with an Image, and so on, or combinations thereof, and then continue adding the children as new Inlines. Part of the GenerateBlockForNode method is shown below. For the full code, look in the RichTextBlockProperties.cs file in the SocialMediaDashboard sample.

        private static Inline GenerateBlockForNode(HtmlNode node)
        {
            switch (node.Name)
            {
                case "div":
                    return GenerateSpan(node);
                case "p":
                case "P":
                    return GenerateInnerParagraph(node);
                case "img":
                case "IMG":
                    return GenerateImage(node);
                case "a":
                case "A":
                    if (node.ChildNodes.Count >= 1 && (node.FirstChild.Name == "img" || node.FirstChild.Name == "IMG"))
                        return GenerateImage(node.FirstChild);
                    else
                        return GenerateHyperLink(node);
                    …
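
The paired case labels above ("p" and "P", "img" and "IMG") exist only to handle both casings of a tag name. One way to avoid the duplication, sketched here under the assumption that a lookup table is acceptable, is a dictionary with a case-insensitive comparer. The InlineKind enum and the TagMap class are hypothetical, and only the four tags shown in the excerpt are listed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names, used only to illustrate case-insensitive tag lookup.
public enum InlineKind { Unknown, Span, Paragraph, Image, Link }

public static class TagMap
{
    private static readonly Dictionary<string, InlineKind> Kinds =
        new Dictionary<string, InlineKind>(StringComparer.OrdinalIgnoreCase)
        {
            { "div", InlineKind.Span },
            { "p", InlineKind.Paragraph },
            { "img", InlineKind.Image },
            { "a", InlineKind.Link },
        };

    // Returns the kind of inline to generate for a tag name, ignoring case.
    public static InlineKind Classify(string tagName)
    {
        InlineKind kind;
        if (tagName != null && Kinds.TryGetValue(tagName, out kind))
        {
            return kind;
        }
        return InlineKind.Unknown;
    }
}
```

A switch over the result of Classify(node.Name) would then need one label per element type.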

 

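GenerateSpan, for example, looks like this. Note that it passes a Span to AddChildren, so it relies on an overload that takes a Span rather than a Paragraph (not shown here).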

        private static Inline GenerateSpan(HtmlNode node)
        {
            Span s = new Span();
            AddChildren(s, node);
            return s;
        }

 

Conclusion

HTML is a rich markup language, and it would take forever to write code to parse and display every type of element, so this sample implements only the most common ones. As it turns out, that still covers most of the elements used in the blog posts we have encountered so far, and the model used here is easy to extend for the specific situations you might run into.

 

For more information about Windows 8 app development, go here.
For more information about Windows Phone development, go here.

  • Thank you Tess. This is really elegant solution.

  • Nice stuff! Is there a way to achieve the same result for RichEditBox instead of RichTextBlock?

  • Can't you use HtmlAgilityPack there in some way?
