<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>.NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx</link><description>!! Update 06/08/18 !! Html Agility Pack has a new home on CodePlex! Available here . CodePlex is great :) !! Update 05/05/05 !! Visual Studio 2005 Beta2 version is available here !! Update 05/23/05 !! This blog will be discontinued. A new blog were comments</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#16029</link><pubDate>Thu, 05 Jun 2003 07:33:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:16029</guid><dc:creator>David Stone</dc:creator><description>Thanks! Definitely going into my toolkit!</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#17064</link><pubDate>Thu, 19 Jun 2003 16:02:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:17064</guid><dc:creator>Robert Cannon</dc:creator><description>I have run across an issue with HtmlAgilityPack.  I am trying to scrape a site that has some HTML added to the end of the document by the ISP that is hosting the site.
&lt;br&gt;

&lt;br&gt;
It is something like this:
&lt;br&gt;

&lt;br&gt;
&amp;lt;HTML&amp;gt;
&lt;br&gt;
...
&lt;br&gt;
&amp;lt;/HTML&amp;gt;
&lt;br&gt;
&amp;lt;!-- text below generated by server. PLEASE REMOVE --&amp;gt;&amp;lt;!-- Counter/Statistics data collection code --&amp;gt;&amp;lt;!-- JS Banner blocked --&amp;gt;
&lt;br&gt;
&amp;lt;script&amp;gt;
&lt;br&gt;
...
&lt;br&gt;
&amp;lt;/script&amp;gt;
&lt;br&gt;

&lt;br&gt;
HtmlAgilityPack will parse this and then wrap the whole thing in a &amp;lt;span&amp;gt; to give the document a single root, which is the &amp;lt;span&amp;gt; node rather than the &amp;lt;HTML&amp;gt; node.
&lt;br&gt;

&lt;br&gt;
Is there an option to either 1) ignore the extra markup, or 2) force the extra markup into the &amp;lt;HTML&amp;gt; node?
&lt;br&gt;
</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#17074</link><pubDate>Thu, 19 Jun 2003 16:55:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:17074</guid><dc:creator>Simon Mourier</dc:creator><description>I think it does so because you are setting OptionOutputAsXml to True. In XML, you need a root node without siblings. HtmlAgilityPack creates this fake root node to build valid XML. Just don't use this OptionOutputAsXml.
&lt;br&gt;

&lt;br&gt;
Does this answer your question / solve your problem?</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#17198</link><pubDate>Fri, 20 Jun 2003 15:26:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:17198</guid><dc:creator>Robert Cannon</dc:creator><description>Yes, I have OptionOutputAsXml set to true.  I am trying to produce an XHTML file so that I can apply an XSLT to it and get an RSS feed.  I guess I could make my XSLT expect the root node to be a &amp;lt;span&amp;gt; instead of an &amp;lt;html&amp;gt;, but it just seems wrong.  I was looking for alternatives.
&lt;br&gt;

&lt;br&gt;
And I have the source code, so I can tackle it myself, but I just wanted to see if there was already a workaround.</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#17200</link><pubDate>Fri, 20 Jun 2003 15:32:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:17200</guid><dc:creator>Simon Mourier</dc:creator><description>You do not need to produce an XHTML file to apply an XSLT to the document (and you should not). The HtmlDocument class supports IXPathNavigable natively for this kind of purpose, so you can just do:
&lt;br&gt;

&lt;br&gt;
HtmlDocument doc = ...
&lt;br&gt;
XslTransform xslt = new XslTransform();
&lt;br&gt;
xslt.Load(&amp;quot;myXslt.xsl&amp;quot;);
&lt;br&gt;
xslt.Transform(doc, null, writer);
&lt;br&gt;

&lt;br&gt;
</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#42207</link><pubDate>Tue, 09 Dec 2003 22:51:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:42207</guid><dc:creator>Rossella</dc:creator><description>I downloaded the Html Agility Pack source and compiled it, but it is not executable. When I click &amp;quot;Debug&amp;quot; and then &amp;quot;Go&amp;quot; in the menu, a dialog window appears. How do I execute this project?&lt;br&gt;How do I interact with a page to parse?&lt;br&gt;&lt;br&gt;Thank you.&lt;br&gt;&lt;br&gt;E-mail: billarosa@hotmail.com</description></item><item><title>HtmlAgilityPack and dangerous HTML Tags stripping</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#44860</link><pubDate>Sat, 20 Dec 2003 20:02:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:44860</guid><dc:creator>Julien Cheyssial's Blog</dc:creator><description /></item><item><title>HtmlAgilityPack et suppression des tags HTML dangereux</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#44862</link><pubDate>Sat, 20 Dec 2003 20:08:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:44862</guid><dc:creator>Julien Cheyssial's Blog</dc:creator><description /></item><item><title>HTML Agility Pack</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#70746</link><pubDate>Tue, 10 Feb 2004 21:08:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:70746</guid><dc:creator>Code/Tea/Etc...</dc:creator><description /></item><item><title>HTML Agility Pack</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#70747</link><pubDate>Tue, 10 Feb 2004 21:08:00 GMT</pubDate><guid 
isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:70747</guid><dc:creator>Code/Tea/Etc...</dc:creator><description /></item><item><title>Html Agility Pack</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#70796</link><pubDate>Tue, 10 Feb 2004 22:29:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:70796</guid><dc:creator>ShowUsYour</dc:creator><description /></item><item><title>Html Agility Pack</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#70854</link><pubDate>Wed, 11 Feb 2004 00:24:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:70854</guid><dc:creator>ShowUsYour</dc:creator><description /></item><item><title>Parsing malformed HTML</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#71541</link><pubDate>Thu, 12 Feb 2004 01:18:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:71541</guid><dc:creator>William.Blog()</dc:creator><description /></item><item><title>Parsing malformed HTML</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#71543</link><pubDate>Thu, 12 Feb 2004 01:19:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:71543</guid><dc:creator>William.Blog()</dc:creator><description /></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#72285</link><pubDate>Fri, 13 Feb 2004 13:48:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:72285</guid><dc:creator>Eric Newton</dc:creator><description>There's another, the SgmlReader, which is a more structured approach to taming the HTML beast for scraping processes.
&lt;br&gt;&lt;br&gt;&lt;a target="_new" href="http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=B90FDDCE-E60D-43F8-A5C4-C3BD760564BC"&gt;http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=B90FDDCE-E60D-43F8-A5C4-C3BD760564BC&lt;/a&gt;</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#72330</link><pubDate>Fri, 13 Feb 2004 17:49:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:72330</guid><dc:creator>Simon Mourier</dc:creator><description>Absolutely, but as you say, it uses a more structured approach, and thus modifies &amp;quot;real world&amp;quot; html, which I think is a big problem for many scenarios.&lt;br&gt;&lt;br&gt;Do this test:&lt;br&gt;1) go to www.microsoft.com, do a view source and save the file as mshome.htm (don't bother with images, .js and all satellite files)&lt;br&gt;&lt;br&gt;2) run commandline\sgmlreader.exe mshome.htm mshome2.htm&lt;br&gt;&lt;br&gt;3) open an IE on mshome.htm and another on mshome2.htm and you will see they are not rendered the same (fonts, tables, etc...)&lt;br&gt;&lt;br&gt;HtmlAgilityPack does not change original html, even if it's malformed.
&lt;br&gt;Simon.</description></item><item><title>Turning Poorly-formed HTML into XML in Managed Code</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#72591</link><pubDate>Fri, 13 Feb 2004 21:14:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:72591</guid><dc:creator>Paroxysmal Effluence</dc:creator><description /></item><item><title>Improvements in signal-to-noise ratio with Jon's Radio</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#76184</link><pubDate>Thu, 19 Feb 2004 13:17:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:76184</guid><dc:creator>Paul's Imaginary Friend</dc:creator><description /></item><item><title>re: .NET Html Agility Pack</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#86893</link><pubDate>Wed, 10 Mar 2004 08:28:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:86893</guid><dc:creator>Mike</dc:creator><description>Awesome, the SgmlReader was great, but this is even better! Way to code up the right tool!</description></item><item><title>MSHTML vs. Html Agility Pack</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#89410</link><pubDate>Mon, 15 Mar 2004 05:19:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:89410</guid><dc:creator>Taylor Monacelli</dc:creator><description>I'm curious what the difference is between html agility pack and mshtml.  I'm assuming that the agility pack was written to fix the problems in mshtml.  Is this true?  
If not, then what does the agility pack have to offer that mshtml doesn't?</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#89426</link><pubDate>Mon, 15 Mar 2004 06:21:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:89426</guid><dc:creator>Simon Mourier</dc:creator><description>They are quite different libraries, not really comparable in my opinion.&lt;br&gt;&lt;br&gt;MSHTML is a COM dll, not a .NET assembly (although you can interop with it), with everything that implies in terms of deployment.&lt;br&gt;MSHTML has many, many dependencies on other DLLs, while Html Agility Pack has absolutely none (in either technical terms or standard ISO terms). MSHTML is client-side oriented and has a lot to do with UI, and is therefore not suited (at all) for server-side operations. It is also somewhat strict about HTML code, while Html Agility Pack is really not. This is very useful when you're talking about real-world HTML (read: buggy HTML).&lt;br&gt;&lt;br&gt;Html Agility Pack's purpose is less ambitious: it basically just parses an HTML fragment (file or stream), builds a DOM out of it and allows you to modify it and save it back. It has, however, a killer feature that MSHTML does not have: support for XPath and XSL transforms on plain old buggy malformed HTML code...&lt;br&gt;&lt;br&gt;Hope this clarifies.</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#89671</link><pubDate>Mon, 15 Mar 2004 18:59:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:89671</guid><dc:creator>Sudhir Ramdasi</dc:creator><description>This is just great! It serves my purpose.&lt;br&gt;&lt;br&gt;Thanks a lot.
&lt;br&gt;&lt;br&gt;-Sudhir</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#93375</link><pubDate>Sun, 21 Mar 2004 12:55:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:93375</guid><dc:creator>Yanhao Zhu</dc:creator><description>What a wonderful tool! Thanks a lot!</description></item><item><title>HTML Agility Pack</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#95467</link><pubDate>Wed, 24 Mar 2004 22:41:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:95467</guid><dc:creator>Phil Denoncourt's Blog</dc:creator><description /></item><item><title>HTML Agility Pack</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#95468</link><pubDate>Wed, 24 Mar 2004 22:42:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:95468</guid><dc:creator>Phil Denoncourt's Blog</dc:creator><description /></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#95784</link><pubDate>Thu, 25 Mar 2004 15:15:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:95784</guid><dc:creator>Crumpy</dc:creator><description>I'm kind of new to XHTML and XSLT.  I wonder if this tool can help me with a dilema though.  I don't want to use MSHTML and the COM interop with my already speedy C# app that crawls and extracts information from the web (the performance would be affected too greatly).  I've run into recent problems though trying to follow links that call javascript to build dynamic links or set hidden variables before submitting a form etc..
&lt;br&gt;&lt;br&gt;I want to be able to somehow convert those references into &amp;quot;synthetic hyperlinks&amp;quot; similar to the process described in IBM's description found at &lt;a target="_new" href="http://www10.org/cdrom/papers/102/"&gt;http://www10.org/cdrom/papers/102/&lt;/a&gt; .  They seem to be using XHTML and XSLT to do this somehow.  Can I somehow execute scripts found in the XHTML using XSLT?  By some other means?  I'm really at a loss here.&lt;br&gt;&lt;br&gt;Thanks for the awesome tool by the way!</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#95802</link><pubDate>Thu, 25 Mar 2004 16:01:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:95802</guid><dc:creator>Charlie</dc:creator><description>Wow, hard to believe it took so long for somebody to write and give away such an awesome tool.&lt;br&gt;Thanks!</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#95823</link><pubDate>Thu, 25 Mar 2004 16:49:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:95823</guid><dc:creator>Simon Mourier</dc:creator><description>The Html Agility Pack allows you to use XSLT on HTML document it loads. Note, however, that it does not even relies on XHTML format at all. 
HTML documents do not need to conform to anything but HTML &amp;quot;as we know it in the real world&amp;quot; :-)&lt;br&gt;&lt;br&gt;So, yes, I believe you can use the method described at &lt;a target="_new" href="http://www10.org/cdrom/papers/102/"&gt;http://www10.org/cdrom/papers/102/&lt;/a&gt; to determine dynamic hyperlinks.</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#96642</link><pubDate>Fri, 26 Mar 2004 16:01:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:96642</guid><dc:creator>Crumpy</dc:creator><description>Simon, I can append and prepend new HtmlNodes into the HtmlDocument just as I do using XmlElements in an XmlDocument.  However, when I save the modified HtmlDocument object, there are no line-breaks between the new HtmlNodes I inserted.  If I enter 10 new nodes, they're all on the same line in the saved document.&lt;br&gt;&lt;br&gt;Just thought I'd let you know.  I'll browse the code.</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#96660</link><pubDate>Fri, 26 Mar 2004 16:57:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:96660</guid><dc:creator>Simon Mourier</dc:creator><description>Hi crumpy, this is all by design :-) Nothing is inserted automatically by the Html Agility Pack.&lt;br&gt;&lt;br&gt;Here is a code snippet that shows you how to insert a line break before and after a node (there are many ways to do it actually...)
&lt;br&gt;&lt;br&gt;static void Main(string[] args)&lt;br&gt;{&lt;br&gt; HtmlDocument doc = new HtmlDocument();&lt;br&gt; doc.DocumentNode.AppendChild(HtmlNode.CreateNode(&amp;quot;&amp;lt;html&amp;gt;&amp;lt;/html&amp;gt;&amp;quot;));&lt;br&gt; HtmlNode bodyNode = doc.CreateElement(&amp;quot;body&amp;quot;);&lt;br&gt; doc.DocumentNode.FirstChild.AppendChild(bodyNode);&lt;br&gt; AddOuterLineBreaks(bodyNode);&lt;br&gt; doc.Save(Console.Out);&lt;br&gt;}&lt;br&gt;&lt;br&gt;static void AddOuterLineBreaks(HtmlNode node)&lt;br&gt;{&lt;br&gt; if (node.ParentNode == null)&lt;br&gt;  return;&lt;br&gt;&lt;br&gt; node.ParentNode.InsertBefore(HtmlNode.CreateNode(&amp;quot;\r\n&amp;quot;), node);&lt;br&gt; node.ParentNode.InsertAfter(HtmlNode.CreateNode(&amp;quot;\r\n&amp;quot;), node);&lt;br&gt;}&lt;br&gt;&lt;br&gt;Simon.</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#97035</link><pubDate>Sat, 27 Mar 2004 05:15:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:97035</guid><dc:creator>Crumpy</dc:creator><description>Sorry to keep bothering you Simon, thanks for your help.&lt;br&gt;&lt;br&gt;This is the first time I've used XPath to navigate anything and I'm using it to navigate HtmlNodes using the HtmlNode.SelectNodes() function.&lt;br&gt;&lt;br&gt;I'm having a problem with the current context, for example.  I've created and filled an HtmlDocument which contains forms. 
I then obtain an HtmlNodeCollection of the form nodes; then, for each form node, I attempt to obtain a collection of the input nodes that are descendants of that form node:&lt;br&gt;&lt;br&gt;HtmlNodeCollection forms = doc.DocumentNode.SelectNodes(&amp;quot;//form&amp;quot;);&lt;br&gt;&lt;br&gt;foreach( HtmlNode formNode in forms )&lt;br&gt;{&lt;br&gt;   HtmlNodeCollection inputControls = formNode.SelectNodes(&amp;quot;.//input&amp;quot;);&lt;br&gt;&lt;br&gt;   foreach( HtmlNode inputControl in inputControls )&lt;br&gt;   {&lt;br&gt;      ...&lt;br&gt;   }&lt;br&gt;}&lt;br&gt;&lt;br&gt;The XPath expression &amp;quot;.//input&amp;quot; should return an HtmlNodeCollection containing any input nodes within the form (the '.' specifying the current context, or the current selected node - from what I understand).  But I always get back null.&lt;br&gt;&lt;br&gt;If I change the expression to &amp;quot;//input&amp;quot; (which should return all input nodes, beginning the search from the root node of the document), it returns all of the input nodes found in the document (which is correct).&lt;br&gt;&lt;br&gt;However, I specifically need just the input nodes within the current form node.&lt;br&gt;&lt;br&gt;What am I doing wrong?&lt;br&gt;&lt;br&gt;I've been testing this against &lt;a target="_new" href="https://recruitmax.alltel.com/recruitmax/candidates/jobopps.cfm"&gt;https://recruitmax.alltel.com/recruitmax/candidates/jobopps.cfm&lt;/a&gt; which happens to have 2 forms.
&lt;br&gt;&lt;br&gt;Thanks again!</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#101079</link><pubDate>Mon, 29 Mar 2004 22:12:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:101079</guid><dc:creator>Simon Mourier</dc:creator><description>Hi Crumpy, you really are the &amp;quot;out of luck&amp;quot; guy :-) let me explain why. The &amp;lt;form&amp;gt; element deserves, by default, a special treatment by Html Agility Pack: it can overlap. It means you can have HTML like this: &amp;lt;form&amp;gt;&amp;lt;b&amp;gt;&amp;lt;/form&amp;gt;&amp;lt;/b&amp;gt;, and Html Agility Pack will not report any error and will save it just like that. But it is more a trick than anything else because the &amp;lt;form&amp;gt; node in the DOM does not contain any node, it is declared as empty, and the &amp;lt;/form&amp;gt; is declared as a text node with a value of &amp;quot;&amp;lt;/form&amp;gt;&amp;quot;... This is why you find nothing inside the &amp;lt;form&amp;gt; element.&lt;br&gt;&lt;br&gt;You can change the parsing behavior of the Html Agility Pack, using the HtmlNode static property called ElementFlags: just add the following code before you parse your texte:&lt;br&gt;&lt;br&gt;HtmlNode.ElementFlags.Remove(&amp;quot;form&amp;quot;);&lt;br&gt;&lt;br&gt;and you should see the &amp;lt;input&amp;gt; elements inside the &amp;lt;form&amp;gt; elements, just like you thought. Note, however, that &amp;lt;form&amp;gt; elements will not be able to overlap any more if you do this. Without adding this code, you could also fix a complex xpath to find inputs as children of form siblings.&lt;br&gt;&lt;br&gt;Simon.
&lt;br&gt;&lt;br&gt;</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#104678</link><pubDate>Wed, 31 Mar 2004 19:10:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:104678</guid><dc:creator>m_prog</dc:creator><description>I have started example HtmlToRss and there is a mistake &amp;quot; File was not found at cache path... &amp;quot; In cache there is no necessary file. How to cope with it? What file there should be? Where file should enter the name there?</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#104713</link><pubDate>Wed, 31 Mar 2004 20:49:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:104713</guid><dc:creator>Simon Mourier</dc:creator><description>In html2rss.cs, you find this:&lt;br&gt;&lt;br&gt;// set the following to true, if you don't want to use the Internet at all and if you are sure something is available in the cache (for testing purposes for example).&lt;br&gt;hw.CacheOnly = true;&lt;br&gt;&lt;br&gt;It means we really look for a file in the cache directory. if it's not there, an exception is thrown.&lt;br&gt;&lt;br&gt;Just set CacheOnly to false (at least the 1st time you run html2rss.exe) and recompile.
&lt;br&gt;&lt;br&gt;Simon.</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#108924</link><pubDate>Wed, 07 Apr 2004 17:27:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:108924</guid><dc:creator>Gerard Kappen</dc:creator><description>Hi Simon, there is a slight mistake in the HtmlNode constructore where a &amp;quot;form&amp;quot; tag gets an HtmlElementFlag.Empty. So instead of my inputs being childs of the form they are parsed as being siblings. The fix is easy of course just change &amp;lt;code&amp;gt;ElementsFlags.Add(&amp;quot;form&amp;quot;, HtmlElementFlag.CanOverlap | HtmlElementFlag.Empty);&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ElementsFlags.Add(&amp;quot;form&amp;quot;, HtmlElementFlag.CanOverlap);&amp;lt;/code&amp;gt;&lt;br&gt;&lt;br&gt;</description></item><item><title>Memory optimization</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#110238</link><pubDate>Fri, 09 Apr 2004 13:42:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:110238</guid><dc:creator>Charlie</dc:creator><description>Hey Simon, &lt;br&gt;&lt;br&gt;I'm wondering if you've thought about creating a slimmed down version of this toolkit.  Maybe making the dom forward only, or being able to conditionally turn off some of the internal variables like _line and _lineposition.  &lt;br&gt;&lt;br&gt;Anyway, just a thought for future improvements to this great toolkit.
&lt;br&gt;-Charlie&lt;br&gt;</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#110776</link><pubDate>Sat, 10 Apr 2004 17:23:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:110776</guid><dc:creator>Simon Mourier</dc:creator><description>Duh, I have not thought about that yet... Maybe if it becomes a commercial package :-) I have too much work to do right now.&lt;br&gt;&lt;br&gt;BTW, I am not sure line and lineposition are the ones that really eat memory. Strings (names, values, ...) are probably more important in that area, even though they are only lazily allocated (only when requested).&lt;br&gt;&lt;br&gt;I suppose if I had time to rewrite it, I would probably focus on string handling. For example, use a NameTable and compare references rather than values (just like the Xml parser does with XmlNameTable).&lt;br&gt;&lt;br&gt;Simon.</description></item><item><title>HtmlNode.InnerText Property not settable</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#112958</link><pubDate>Wed, 14 Apr 2004 19:21:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:112958</guid><dc:creator>monosodiumg</dc:creator><description>In the chm, the description is:&lt;br&gt;Gets or Sets the text between the start and end tags of the object. &lt;br&gt;&lt;br&gt;The declaration on that page is:&lt;br&gt;public virtual string InnerText {get;}&lt;br&gt;&lt;br&gt;The observed behaviour is as per declaration.&lt;br&gt;&lt;br&gt;Why is InnerText not settable?
&lt;br&gt;</description></item><item><title>re: .NET Html Agility Pack: How to strip all comments?</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#113417</link><pubDate>Thu, 15 Apr 2004 06:21:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:113417</guid><dc:creator>Sam V</dc:creator><description>Hi there, just a simple question.  I apologize if I'm being a little dense, but how could you strip all the comment nodes? I tried to go over all nodes and look for a node of type 'comment' but I can't seem to be able to do this.&lt;br&gt;&lt;br&gt;Thanks!&lt;br&gt;-Sam</description></item><item><title>re: .NET Html Agility Pack: How to strip all comments</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#113434</link><pubDate>Thu, 15 Apr 2004 06:52:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:113434</guid><dc:creator>Simon Mourier</dc:creator><description>Here is some sample code to remove comments:&lt;br&gt;&lt;br&gt;static void Main(string[] args)&lt;br&gt;{&lt;br&gt; HtmlDocument doc = new HtmlDocument();&lt;br&gt; doc.Load(&amp;quot;filewithcomments.htm&amp;quot;);&lt;br&gt; doc.Save(Console.Out); // show before&lt;br&gt; RemoveComments(doc.DocumentNode);&lt;br&gt; doc.Save(Console.Out); // show after&lt;br&gt;}&lt;br&gt;&lt;br&gt;static void RemoveComments(HtmlNode node)&lt;br&gt;{&lt;br&gt; if (node.NodeType == HtmlNodeType.Comment)&lt;br&gt; {&lt;br&gt;  node.ParentNode.RemoveChild(node);&lt;br&gt;  return;&lt;br&gt; }&lt;br&gt; if (!node.HasChildNodes)&lt;br&gt;  return;&lt;br&gt; foreach(HtmlNode subNode in node.ChildNodes)&lt;br&gt; {&lt;br&gt;  RemoveComments(subNode);&lt;br&gt; }&lt;br&gt;}&lt;br&gt;</description></item><item><title>re: HtmlNode.InnerText Property not settable </title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#113435</link><pubDate>Thu, 15 Apr 2004 06:55:00 GMT</pubDate><guid 
isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:113435</guid><dc:creator>Simon Mourier</dc:creator><description>You cannot set InnerText, by design, because it is computed; the doc is wrong, as you noticed.&lt;br&gt;&lt;br&gt;You can set InnerHtml.&lt;br&gt;&lt;br&gt;Simon.</description></item><item><title>re: .NET HTML Agility Pack: How to strip all comments</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#113442</link><pubDate>Thu, 15 Apr 2004 07:00:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:113442</guid><dc:creator>Sam V</dc:creator><description>I seem to have answered my own question!&lt;br&gt;Here is the source, in case anyone wants it.&lt;br&gt;Thanks&lt;br&gt;-Sam&lt;br&gt;&lt;br&gt;        Dim myNodes As HtmlAgilityPack.HtmlNodeCollection = myDoc.DocumentNode.SelectNodes(&amp;quot;//comment()&amp;quot;)&lt;br&gt;        Dim node As HtmlAgilityPack.HtmlNode&lt;br&gt;&lt;br&gt;        For Each node In myNodes&lt;br&gt;            Console.Write(node.NodeType)&lt;br&gt;            If node.NodeType = HtmlAgilityPack.HtmlNodeType.Comment Then&lt;br&gt;                node.ParentNode.RemoveChild(node)&lt;br&gt;            End If&lt;br&gt;        Next</description></item><item><title>3rd party mod</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#114080</link><pubDate>Fri, 16 Apr 2004 04:03:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:114080</guid><dc:creator>Mark</dc:creator><description>Hey Simon, &lt;br&gt;&lt;br&gt;Just thought I'd pass on a tweak I made in case you or anyone else thought it was a useful mod.&lt;br&gt;&lt;br&gt;I added the following to the HtmlNode class so that, as I'm doing whatever to the nodes I find, I can optionally hang any object off the nodes for re-use later.  
&lt;br&gt;&lt;br&gt;-Mark&lt;br&gt;-------------------&lt;br&gt;&lt;br&gt;internal object _externalobject = null;&lt;br&gt;&lt;br&gt;/// &amp;lt;summary&amp;gt;&lt;br&gt;/// Gets or Sets the external object associated with the node.&lt;br&gt;/// &amp;lt;/summary&amp;gt;&lt;br&gt;public object ExternalObject {&lt;br&gt;	get {&lt;br&gt;		return _externalobject;&lt;br&gt;	}&lt;br&gt;	set {&lt;br&gt;		_externalobject = value;&lt;br&gt;	}&lt;br&gt;}&lt;br&gt;</description></item><item><title>HtmlNode.cs:1602  bug</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#116193</link><pubDate>Tue, 20 Apr 2004 05:28:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:116193</guid><dc:creator>benles@bldigital.com</dc:creator><description>It calls HtmlEncode on Html text, thus encoding twice, producing output like &lt;br&gt;&amp;amp;amp;nbsp; </description></item><item><title>HtmlComponent</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#117675</link><pubDate>Wed, 21 Apr 2004 22:17:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:117675</guid><dc:creator>.NET Blog - Chris Frazier Style</dc:creator><description /></item><item><title>HtmlComponent</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#117676</link><pubDate>Wed, 21 Apr 2004 22:18:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:117676</guid><dc:creator>Blue Phoenix</dc:creator><description /></item><item><title>Html Agility Pack</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#148173</link><pubDate>Fri, 04 Jun 2004 08:19:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:148173</guid><dc:creator>{Arne Janning .NET}</dc:creator><description /></item><item><title>Html Agility Pack</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#148174</link><pubDate>Fri, 04 Jun 2004 08:22:00 
GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:148174</guid><dc:creator>{Arne Janning .NET}</dc:creator><description /></item><item><title>re: XPath selection on attributes?</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#152948</link><pubDate>Fri, 11 Jun 2004 05:30:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:152948</guid><dc:creator>Mark</dc:creator><description>Hi...&lt;br&gt;&lt;br&gt;I just started playing with HtmlAgility today and I noticed a couple of odd things - the most significant was with the results of some XPath queries.&lt;br&gt;&lt;br&gt;I was using the XPath query &amp;quot;//base/@href&amp;quot; (i.e. intending to select an attribute value from the &amp;lt;base&amp;gt; tag if found).  What I got back was an odd HtmlNodeNavigator that had LocalName set to &amp;quot;href&amp;quot; and Name set to &amp;quot;base&amp;quot; (i.e. kind of an odd mashing of the parent node with the attribute node).  When I get .Current, I get the parent &amp;lt;base&amp;gt; node.&lt;br&gt;&lt;br&gt;I don't know how easy it would be, but perhaps HtmlAttribute could be recoded to derive from HtmlNode?  Seems like it would be easier to emulate XML behavior if they were interchangeable...&lt;br&gt;&lt;br&gt;Thanks&lt;br&gt;-mark&lt;br&gt;</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#153316</link><pubDate>Fri, 11 Jun 2004 16:51:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:153316</guid><dc:creator>Simon Mourier</dc:creator><description>Hi Mark. You are absolutely right. This is a design error: you cannot use attributes in path selection. You can still use them in filters though, like //base[@href]. Fixing it would require changing the HtmlNodeNavigator.cs file ... 
and I have no time to fix it right now :-)</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#158954</link><pubDate>Fri, 18 Jun 2004 13:30:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:158954</guid><dc:creator>Ian</dc:creator><description>Simon, thanks for this great utility.&lt;br&gt;&lt;br&gt;I haven't seen a way to POST data to a site and create a document, am I missing something?&lt;br&gt;&lt;br&gt;Also is that a typo in the download link or is it deliberate?&lt;br&gt;&lt;br&gt;Ian</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#159059</link><pubDate>Fri, 18 Jun 2004 16:32:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:159059</guid><dc:creator>Simon Mourier</dc:creator><description>If you talk about the HtmlWeb class, you can pass a method (POST or anything) to the LoadUrl. You can also hook the HttpRequest that will be used if you connect to the PreRequest event. You can tweak the method (or anything else) here as well.&lt;br&gt;Simon.</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#159507</link><pubDate>Sat, 19 Jun 2004 03:42:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:159507</guid><dc:creator>Ian</dc:creator><description>Ahh I see the light! I saw the method arguments but not the event handlers. Thank you. 
Ian.</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#161834</link><pubDate>Tue, 22 Jun 2004 08:26:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:161834</guid><dc:creator>Mark</dc:creator><description>Hey Simon, &lt;br&gt;I ran into an issue with html comments today.  I'm trying to insert an html comment in the document and it is requiring me to put the &amp;quot;&amp;lt;!--&amp;quot; and &amp;quot;--&amp;gt;&amp;quot; wrappers on the value I set for the HtmlCommentNode.  Debugging thru the code it appears that the nodes that are generated by the parse routine incorrectly include those wrapper tags in the value of the node...causing node.OuterHtml to return &lt;br&gt;&amp;quot;&amp;lt;!-- &amp;lt;!-- value --&amp;gt; --&amp;gt;&amp;quot; and node.InnerHtml to return &amp;quot;&amp;lt;!-- value --&amp;gt;&amp;quot;.&lt;br&gt;&lt;br&gt;-Mark</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#164684</link><pubDate>Thu, 24 Jun 2004 22:01:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:164684</guid><dc:creator>Simon Mourier</dc:creator><description>Hi Mark.&lt;br&gt;Can you show a sample of your code?
&lt;br&gt;Simon.</description></item><item><title>remove word tags</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#169095</link><pubDate>Wed, 30 Jun 2004 05:20:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:169095</guid><dc:creator>magnum</dc:creator><description>Based on the examples about removing tags with the Html Agility Pack, how can I remove tags like:&lt;br&gt;&lt;br&gt;&amp;lt;o:p&amp;gt;&lt;br&gt;&lt;br&gt;I tried: RemoveTag(doc, &amp;quot;o\\\\:p&amp;quot;); but it returns &amp;quot;System.Xml.XPath.XPathException:&amp;quot;&lt;br&gt;&lt;br&gt;&lt;br&gt;private static void RemoveTag(HtmlDocument doc, string tags)&lt;br&gt;{&lt;br&gt;  HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(&amp;quot;//&amp;quot; + tags);&lt;br&gt;  if (nodes == null)&lt;br&gt;    return;&lt;br&gt;  foreach(HtmlNode node in nodes){&lt;br&gt;    if (node.ParentNode != null)&lt;br&gt;      node.ParentNode.RemoveChild(node);&lt;br&gt;  }&lt;br&gt;}</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#169568</link><pubDate>Wed, 30 Jun 2004 17:08:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:169568</guid><dc:creator>Simon Mourier</dc:creator><description>Hi.&lt;br&gt;Unfortunately, the support for namespaces is limited in the Html Agility Pack. It does not really know what a namespace is and understands names (prefix ':' localname) as a whole. I agree this is quite confusing :-) but most of the time, you can work around it. In your case, this is how you would do it.&lt;br&gt;&lt;br&gt;HtmlNodeCollection coll = doc.DocumentNode.SelectNodes(&amp;quot;//*[name() ='o:p']&amp;quot;);&lt;br&gt;if (coll != null) // SelectNodes returns null when nothing matches&lt;br&gt;{&lt;br&gt;  foreach(HtmlNode node in coll)&lt;br&gt;  {&lt;br&gt;    node.ParentNode.RemoveChild(node);&lt;br&gt;  }&lt;br&gt;}&lt;br&gt;Simon.
&lt;br&gt;</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#175754</link><pubDate>Thu, 08 Jul 2004 06:38:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:175754</guid><dc:creator>raoul ellias</dc:creator><description>My parsed file ends up having attributes like nowrap set to nowrap=&amp;quot;&amp;quot; or checked set to checked=&amp;quot;&amp;quot;. Is there something I'm missing?&lt;br&gt;&lt;br&gt;Thanks a lot, and this is an awesome tool.</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#180443</link><pubDate>Mon, 12 Jul 2004 17:14:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:180443</guid><dc:creator>Simon Mourier</dc:creator><description>Why is this an issue? (Originally this was due to plans for XHTML compatibility, and it also helps for XML output.)&lt;br&gt;&lt;br&gt;Browsers should not choke on it?&lt;br&gt;Simon.</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#181767</link><pubDate>Tue, 13 Jul 2004 23:36:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:181767</guid><dc:creator>raoul</dc:creator><description>Thanks, yes the browser does not choke, I'm just a little anal. &lt;br&gt;&lt;br&gt;I've been using this on a huge project to insert .NET IDs and validators for input fields based on database types. It's a lot of HTML and I estimate I'm saving at least 1/2 hr a page.
&lt;br&gt;&lt;br&gt;Thanks!!</description></item><item><title>Possible bugs ?</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#186057</link><pubDate>Sat, 17 Jul 2004 20:22:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:186057</guid><dc:creator>Markus</dc:creator><description>Hi,&lt;br&gt;&lt;br&gt;it seems your sample in the beginning doesnt work anymore:&lt;br&gt;&lt;br&gt;HtmlDocument doc = new HtmlDocument();&lt;br&gt;doc.Load(&amp;quot;file.htm&amp;quot;);&lt;br&gt;foreach(HtmlNode link in doc.DocumentElement.SelectNodes(&amp;quot;//a[@href&amp;quot;])&lt;br&gt;{&lt;br&gt;	HtmlAttribute att = link[&amp;quot;href&amp;quot;];&lt;br&gt;	att.Value = FixLink(att);&lt;br&gt;}&lt;br&gt;doc.Save(&amp;quot;file.htm&amp;quot;);&lt;br&gt;&lt;br&gt;1. DocumentElement is Replaced by DocumentNode&lt;br&gt;2. HtmlNode is not indexable anymore (link[&amp;quot;href&amp;quot;] won't work&lt;br&gt;3. I tried HtmlNode.GetAttributeValue() and HtmlNode SetAttributeValue but after saving the Document with doc.Save() there weren't any changes.
&lt;br&gt;&lt;br&gt;Here is my Code i used:&lt;br&gt;&lt;br&gt;HtmlDocument doc = hw.Load (&amp;quot;f1.htm&amp;quot;); &lt;br&gt;HtmlNode hn = doc.DocumentNode.SelectSingleNode (&amp;quot;//body&amp;quot;); &lt;br&gt;hn.SetAttributeValue (&amp;quot;new&amp;quot;,&amp;quot;value&amp;quot;); &lt;br&gt;doc.Save ( &amp;quot;f2.htm&amp;quot;); &lt;br&gt;&lt;br&gt;&lt;br&gt;Please correct me if i'm wrong&lt;br&gt;&lt;br&gt;greetings&lt;br&gt;&lt;br&gt;Markus</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#187124</link><pubDate>Mon, 19 Jul 2004 17:05:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:187124</guid><dc:creator>Simon Mourier</dc:creator><description>Hi Markus, you're absolutely right. The sample (which was meant for illustration purpose only) is wrong, and it has always been. You're the first one to really try it I suppose :-)&lt;br&gt;&lt;br&gt;The samples in the .zip file are hopefully ok, though.&lt;br&gt;&lt;br&gt;Simon.</description></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#207851</link><pubDate>Wed, 04 Aug 2004 21:11:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:207851</guid><dc:creator>Salman</dc:creator><description>Very nifty tool.  
Thanks!</description></item><item><title>re: html parser for WinFS</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#210834</link><pubDate>Sun, 08 Aug 2004 17:27:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:210834</guid><dc:creator>Sean Grimaldi WebLog</dc:creator><description /></item><item><title>re: .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#212310</link><pubDate>Wed, 11 Aug 2004 07:45:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:212310</guid><dc:creator>Bobstar</dc:creator><description>Hello... &lt;br&gt;I'm getting an error when loading the solution. It's missing:&lt;br&gt;..\HtmlDomView\HtmlDomView.csproj&lt;br&gt;and&lt;br&gt;Samples\GetBinaryRemainder\GetBinaryRemainder.csproj&lt;br&gt;&lt;br&gt;It appears they are not in the zip file :O(&lt;br&gt; Please help</description></item><item><title>My Favorite .Net Tools</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#218189</link><pubDate>Sat, 21 Aug 2004 16:16:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:218189</guid><dc:creator>www.ilovedotnet.co.uk</dc:creator><description /></item><item><title>HTML Agility Pack</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#220111</link><pubDate>Wed, 25 Aug 2004 13:15:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:220111</guid><dc:creator>Code/Tea/Etc.</dc:creator><description /></item><item><title>re: Using a Regular Expression to Match HTML</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#247959</link><pubDate>Tue, 26 Oct 2004 20:03:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:247959</guid><dc:creator>you've been HAACKED</dc:creator><description /></item><item><title>Why Not Convert HTML to 
XML?</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#247974</link><pubDate>Tue, 26 Oct 2004 20:29:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:247974</guid><dc:creator>you've been HAACKED</dc:creator><description /></item><item><title>Formating HTML to valid XML..</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#258488</link><pubDate>Wed, 17 Nov 2004 00:25:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:258488</guid><dc:creator>guyS's WebLog</dc:creator><description /></item><item><title>RE: IBlogExtension</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#330176</link><pubDate>Thu, 23 Dec 2004 00:34:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:330176</guid><dc:creator>.NET Blog - Chris Frazier Style</dc:creator><description /></item><item><title>.NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#330265</link><pubDate>Thu, 23 Dec 2004 02:23:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:330265</guid><dc:creator>Wayne Allen's Weblog</dc:creator><description /></item><item><title>HTML Parser Terror</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#361017</link><pubDate>Wed, 26 Jan 2005 23:28:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:361017</guid><dc:creator>blog.codespace.de</dc:creator><description /></item><item><title>HTML Agility Pack: Less Tedium</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#363902</link><pubDate>Mon, 31 Jan 2005 20:46:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:363902</guid><dc:creator>Mike's Blog</dc:creator><description>Processing loosely-defined text must rank as the one of &lt;br&gt;the worst kinds of 
pro</description></item><item><title>Changing a string into well formatted XML</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#369059</link><pubDate>Tue, 08 Feb 2005 16:54:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:369059</guid><dc:creator>Jaco Pretorius</dc:creator><description /></item><item><title>H.I.T Day 51 More About html2text</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#406769</link><pubDate>Sat, 09 Apr 2005 07:44:52 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:406769</guid><dc:creator>Forever Blue Baggio</dc:creator><description /></item><item><title>.NET UI Components, Logging &amp;amp;amp; More...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#406913</link><pubDate>Sun, 10 Apr 2005 19:26:40 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:406913</guid><dc:creator>Venkatarangan's Blog [வெங்கடரங்கன் வலைப்பதிவு]</dc:creator><description /></item><item><title>.NET UI Components, Logging &amp;amp;amp; More...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#406914</link><pubDate>Sun, 10 Apr 2005 19:30:10 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:406914</guid><dc:creator>Venkatarangan's Blog [வெங்கடரங்கன் வலைப்பதிவு]</dc:creator><description /></item><item><title>.NET UI Components, Logging &amp;amp;amp; More...</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#406917</link><pubDate>Sun, 10 Apr 2005 19:51:41 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:406917</guid><dc:creator>Venkatarangan's Blog [வெங்கடரங்கன் வலைப்பதிவு]</dc:creator><description /></item><item><title>HTML Agility Pack : Less Tedium</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#452950</link><pubDate>Thu, 18 Aug 2005 08:01:29 GMT</pubDate><guid 
isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:452950</guid><dc:creator>Mike Bridge's Blog</dc:creator><description>&lt;p&gt;Processing loosely-defined text must rank as one of the worst kinds of programming tasks.  HTML and CSV parsing are about as much fun as cleaning the toilet in a bus station; who knows what you're going to find.&lt;/p&gt;</description></item><item><title>HTML Agility Pack: Less Tedium</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#453213</link><pubDate>Thu, 18 Aug 2005 22:35:18 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:453213</guid><dc:creator>Mike Bridge's Blog</dc:creator><description>&lt;p&gt;Processing loosely-defined text must rank as one of the worst kinds of programming tasks.  HTML and CSV parsing are about as much fun as cleaning the toilet in a bus station; who knows what you're going to find.&lt;/p&gt;</description></item><item><title>HtmlComponent</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#545499</link><pubDate>Tue, 07 Mar 2006 21:56:33 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:545499</guid><dc:creator>Blue Fenix</dc:creator><description /></item><item><title>HtmlComponent</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#545518</link><pubDate>Tue, 07 Mar 2006 22:16:33 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:545518</guid><dc:creator>Christopher</dc:creator><description /></item><item><title>Fun parsing microformats out of html</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#571644</link><pubDate>Sat, 08 Apr 2006 23:15:22 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:571644</guid><dc:creator>VB-tech weblog</dc:creator><description /></item><item><title>Fun 
parsing microformats out of html</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#571652</link><pubDate>Sat, 08 Apr 2006 23:32:08 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:571652</guid><dc:creator>VB-tech weblog</dc:creator><description /></item><item><title>HtmlAgilityPack</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#612986</link><pubDate>Thu, 01 Jun 2006 17:17:47 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:612986</guid><dc:creator>Dan Miser</dc:creator><description /></item><item><title>HTML Agility Pack</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#794246</link><pubDate>Thu, 05 Oct 2006 20:29:34 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:794246</guid><dc:creator>Duncan Mackenzie .Net</dc:creator><description>&lt;p&gt;I've seen this around before, and this post was from June 2003, but it is worth mentioning again!&lt;/p&gt;
</description></item><item><title> &amp;raquo; .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML&amp;#8230; MSDN Blog Feed</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#7085071</link><pubDate>Sat, 12 Jan 2008 08:48:35 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:7085071</guid><dc:creator> » .NET Html Agility Pack: How to use malformed HTML just like it was well-formed XML… MSDN Blog Feed</dc:creator><description>&lt;p&gt;PingBack from &lt;a rel="nofollow" target="_new" href="http://msdn.blogsforu.com/msdn/?p=3654"&gt;http://msdn.blogsforu.com/msdn/?p=3654&lt;/a&gt;&lt;/p&gt;
</description></item><item><title>Blogs and RSS &amp;raquo; Simon Mourier&amp;#8217;s WebLog : .NET Html Agility Pack: How to use malformed &amp;#8230;</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#8299431</link><pubDate>Tue, 18 Mar 2008 03:26:24 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8299431</guid><dc:creator>Blogs and RSS » Simon Mourier’s WebLog : .NET Html Agility Pack: How to use malformed …</dc:creator><description>&lt;p&gt;PingBack from &lt;a rel="nofollow" target="_new" href="http://blogrssblog.info/simon-mouriers-weblog-net-html-agility-pack-how-to-use-malformed/"&gt;http://blogrssblog.info/simon-mouriers-weblog-net-html-agility-pack-how-to-use-malformed/&lt;/a&gt;&lt;/p&gt;
</description></item><item><title>Html Agility Pack &amp;laquo; Alexander The Great</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#9186528</link><pubDate>Tue, 09 Dec 2008 09:42:19 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9186528</guid><dc:creator>Html Agility Pack &amp;laquo; Alexander The Great</dc:creator><description>&lt;p&gt;PingBack from &lt;a rel="nofollow" target="_new" href="http://alexandersarchive.wordpress.com/2008/12/08/html-agility-pack/"&gt;http://alexandersarchive.wordpress.com/2008/12/08/html-agility-pack/&lt;/a&gt;&lt;/p&gt;
</description></item><item><title>Removing HTML tags (regular expression) | hilpers</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#9368119</link><pubDate>Thu, 22 Jan 2009 17:58:06 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9368119</guid><dc:creator>Removing HTML tags (regular expression) | hilpers</dc:creator><description>&lt;p&gt;PingBack from &lt;a rel="nofollow" target="_new" href="http://www.hilpers.fr/931621-supression-des-balise-html-expression"&gt;http://www.hilpers.fr/931621-supression-des-balise-html-expression&lt;/a&gt;&lt;/p&gt;
</description></item><item><title>WebRequest some site url meeting : The remote server returned an error: (403)</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#9589685</link><pubDate>Wed, 06 May 2009 00:02:48 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9589685</guid><dc:creator>zc0000</dc:creator><description>&lt;p&gt;Avoid (403) Forbidden errors when using HttpWebRequest. I had an error when I tried to open the page http&lt;/p&gt;
</description></item><item><title> Simon Mourier s WebLog NET Html Agility Pack How to use malformed | Paid Surveys</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#9658528</link><pubDate>Sat, 30 May 2009 00:30:22 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9658528</guid><dc:creator> Simon Mourier s WebLog NET Html Agility Pack How to use malformed | Paid Surveys</dc:creator><description>&lt;p&gt;PingBack from &lt;a rel="nofollow" target="_new" href="http://paidsurveyshub.info/story.php?title=simon-mourier-s-weblog-net-html-agility-pack-how-to-use-malformed"&gt;http://paidsurveyshub.info/story.php?title=simon-mourier-s-weblog-net-html-agility-pack-how-to-use-malformed&lt;/a&gt;&lt;/p&gt;
</description></item><item><title> Simon Mourier s WebLog NET Html Agility Pack How to use malformed | Cellulite Creams</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#9712274</link><pubDate>Tue, 09 Jun 2009 06:06:15 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9712274</guid><dc:creator> Simon Mourier s WebLog NET Html Agility Pack How to use malformed | Cellulite Creams</dc:creator><description>&lt;p&gt;PingBack from &lt;a rel="nofollow" target="_new" href="http://cellulitecreamsite.info/story.php?id=4212"&gt;http://cellulitecreamsite.info/story.php?id=4212&lt;/a&gt;&lt;/p&gt;
</description></item><item><title>Automatically POSTing data to a web page and extracting the returned information</title><link>http://blogs.msdn.com/smourier/archive/2003/06/04/8265.aspx#9794596</link><pubDate>Sun, 21 Jun 2009 08:10:42 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9794596</guid><dc:creator>过世许久</dc:creator><description>&lt;p&gt;Reposted from: &lt;a rel="nofollow" target="_new" href="http://www.cnblogs.com/dragon/archive/2005/06/15/174946.html"&gt;http://www.cnblogs.com/dragon/archive/2005/06/15/174946.html&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Sample download&lt;/p&gt;
&lt;p&gt;A friend asked about the following problem, which requires implementing this functionality:&lt;/p&gt;
&lt;p&gt;1.&lt;/p&gt;
</description></item></channel></rss>