<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Intro to Word XML Part 1: Simple Word Document</title><link>http://blogs.msdn.com/b/brian_jones/archive/2005/07/05/intro-to-word-xml-part-1-simple-word-document.aspx</link><description>This post is for those of you interested in learning the basics behind WordprocessingML. That’s the schema that we built for Word 2003. You can save any Word document as XML, and we will use this schema to fully represent that document as XML. The new</description><dc:language>en-US</dc:language><generator>Telligent Evolution Platform Developer Build (Build: 5.6.50428.7875)</generator><item><title>Intro to Word XML Part 2: Simple Formatting</title><link>http://blogs.msdn.com/b/brian_jones/archive/2005/07/05/intro-to-word-xml-part-1-simple-word-document.aspx#1866333</link><pubDate>Mon, 12 Mar 2007 20:02:10 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1866333</guid><dc:creator>Brian Jones: Open XML Formats</dc:creator><description>&lt;p&gt;If you read Part 1 of the Word XML Introduction, you saw the basics behind a Word document, as well as&lt;/p&gt;
</description></item><item><title>Intro to Word XML Part 3: Using Your Own Schema</title><link>http://blogs.msdn.com/b/brian_jones/archive/2005/07/05/intro-to-word-xml-part-1-simple-word-document.aspx#1812355</link><pubDate>Tue, 06 Mar 2007 00:38:17 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1812355</guid><dc:creator>Brian Jones: Open XML Formats</dc:creator><description>&lt;p&gt;When we built the support for customer defined schemas into Word 2003 there were a couple scenarios we&lt;/p&gt;
</description></item><item><title>re: Intro to Word XML Part 1: Simple Word Document</title><link>http://blogs.msdn.com/b/brian_jones/archive/2005/07/05/intro-to-word-xml-part-1-simple-word-document.aspx#618921</link><pubDate>Tue, 06 Jun 2006 12:30:28 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:618921</guid><dc:creator>Mahesh</dc:creator><description>Hi Sir,&lt;br&gt;Is it possible in WordprocessingML to insert the content of another Word document into a WordML target file?&lt;br&gt;I will have an XML file that contains the URL of another Word document. Using XSLT I will convert it to WordML format, but I can't work out how to insert the content of the other document.</description></item><item><title>re: Intro to Word XML Part 1: Simple Word Document</title><link>http://blogs.msdn.com/b/brian_jones/archive/2005/07/05/intro-to-word-xml-part-1-simple-word-document.aspx#581995</link><pubDate>Mon, 24 Apr 2006 11:09:48 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:581995</guid><dc:creator>Robin Joseph Varughese</dc:creator><description>Sir,&lt;br&gt;I have one question. I attached a schema to a Word document and saved the document as an XML file. So far so good. But when I tried to open the copied Word document on a new system, the URI and alias name were shown as unavailable. Is there any way to keep the schema with the Word document wherever I open it? Please help me solve this issue.
&lt;br&gt;&lt;br&gt;Thnaks in advance&lt;br&gt;&lt;br&gt;Robin Joseph Varughese&lt;br&gt;robinjoseph.v@arisglobal.co.in&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=581995" width="1" height="1"&gt;</description></item><item><title>re: Intro to Word XML Part 1: Simple Word Document</title><link>http://blogs.msdn.com/b/brian_jones/archive/2005/07/05/intro-to-word-xml-part-1-simple-word-document.aspx#479763</link><pubDate>Tue, 11 Oct 2005 22:57:42 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:479763</guid><dc:creator>I'm Brian Jones</dc:creator><description>Actually, there are a number of reasons it was done this way and it has a number of benefits. The first benefit is that there is no mixed content. There is just a flat list of runs that have properties associated with them. When you get into the mixed content world it actually can become really ugly if there are other complex structures your application supports. For example, when you try to mix in additional structures like custom defined schema you need to do a lot of work to maintain well formedness. You need to know how many different tags you should close out before you write the start tag for the custom structure, and then do the same when you write the closing tag. For us, it's always the same procedure, the single run tag is closed, and the custom schema tag is then inserted.&lt;br&gt;&lt;br&gt;One of the main reasons we do it this way is that it's the way the formatting is stored internally in Word. It makes for a very simple structure, and there is a very short list of areas you need to look to understand what formatting is applied to any given location. There are only the run properties and the paragraph properties. There are no other nested structures that can apply formatting.&lt;br&gt;&lt;br&gt;If you are interested, I can definitely go into more details on this.
&lt;br&gt;&lt;br&gt;-Brian&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=479763" width="1" height="1"&gt;</description></item><item><title>why put the formatting there?</title><link>http://blogs.msdn.com/b/brian_jones/archive/2005/07/05/intro-to-word-xml-part-1-simple-word-document.aspx#479668</link><pubDate>Tue, 11 Oct 2005 20:05:24 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:479668</guid><dc:creator>The Monster</dc:creator><description>Why in the heck is the formatting going into a sister element to the text element, instead of the nested &amp;amp;lt;b&amp;amp;gt; make this bold &amp;amp;lt;/b&amp;amp;gt;  construct we're all familiar with from (x)html (and for old-school types, in Word Perfect codes that make so much more sense)?  This would seem to force nested formatting information to be repeated for each 'run', unnecessarily bloating the file with the same attributes over and over.&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=479668" width="1" height="1"&gt;</description></item><item><title>re: Intro to Word XML Part 1: Simple Word Document</title><link>http://blogs.msdn.com/b/brian_jones/archive/2005/07/05/intro-to-word-xml-part-1-simple-word-document.aspx#451950</link><pubDate>Tue, 16 Aug 2005 03:35:39 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:451950</guid><dc:creator>I'm Brian Jones</dc:creator><description>Hey Marcin, I haven't used Polar's add-in, so I'm not sure why those runs are broken out like that. &lt;br&gt;To clean that up, you could use an XSLT, but I'm not sure how difficult it would be. You would need to continue to look ahead and do a compare for each run and grab all the text from future runs that match. I think it would actually be really easy using XPath 2.0, but you'd have to find a tool that supported that.
&lt;br&gt;&lt;br&gt;For Word 2003, we try to optimize our output so that runs aren't broken out like that. I think we still miss some cases, but for the most part the output should be closer to your second example.&lt;br&gt;&lt;br&gt;-Brian&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=451950" width="1" height="1"&gt;</description></item><item><title>Prepare for Translation</title><link>http://blogs.msdn.com/b/brian_jones/archive/2005/07/05/intro-to-word-xml-part-1-simple-word-document.aspx#451250</link><pubDate>Sat, 13 Aug 2005 16:23:49 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:451250</guid><dc:creator>Marcin Milkowski</dc:creator><description>Hi Brian,&lt;br&gt;&lt;br&gt;I'm playing with the new format and the examples you gave because I want to prepare a filter which would allow me to go from WordML to XLIFF and back, possibly using a relatively small XSLT.&lt;br&gt;&lt;br&gt;The problem is that with the Polar Word2000 AddIn I get a nasty XML, cluttered with duplicate formatting tags:&lt;br&gt;&lt;br&gt;&amp;lt;w:r&amp;gt;&amp;lt;w:rPr&amp;gt;&amp;lt;w:sz w:val=&amp;quot;27&amp;quot;/&amp;gt;&amp;lt;/w:rPr&amp;gt;&lt;br&gt;&amp;lt;w:t&amp;gt;With&amp;lt;/w:t&amp;gt;&amp;lt;/w:r&amp;gt;&lt;br&gt;&amp;lt;w:r&amp;gt;&amp;lt;w:rPr&amp;gt;&amp;lt;w:sz w:val=&amp;quot;27&amp;quot;/&amp;gt;&amp;lt;/w:rPr&amp;gt;&lt;br&gt;&amp;lt;w:t&amp;gt; &amp;lt;/w:t&amp;gt;&amp;lt;/w:r&amp;gt;&lt;br&gt;&amp;lt;w:r&amp;gt;&amp;lt;w:rPr&amp;gt;&amp;lt;w:sz w:val=&amp;quot;27&amp;quot;/&amp;gt;&amp;lt;/w:rPr&amp;gt;&lt;br&gt;&amp;lt;w:t&amp;gt;your&amp;lt;/w:t&amp;gt;&amp;lt;/w:r&amp;gt;&lt;br&gt;&lt;br&gt;I understand that this could be reformatted (without loosing the formatting) to:&lt;br&gt;&lt;br&gt;&amp;lt;w:r&amp;gt;&amp;lt;w:rPr&amp;gt;&amp;lt;w:sz w:val=&amp;quot;27&amp;quot;/&amp;gt;&amp;lt;/w:rPr&amp;gt;&lt;br&gt;&amp;lt;w:t&amp;gt;With 
your&amp;lt;/w:t&amp;gt;&amp;lt;/w:r&amp;gt;&lt;br&gt;&lt;br&gt;This transformation would allow me to simply extract the contents of w:t with the same formatting (same contents of w:r), which would greatly reduce the number of tags inside the XLIFF file.&lt;br&gt;&lt;br&gt;Now, my questions:&lt;br&gt;&lt;br&gt;1. Is it easy to do such cleaning (reformatting) using XSLT? I tried different methods but I couldn't make it work.&lt;br&gt;&lt;br&gt;2. Are Word 2003 and Office ML likely to produce the same kind of output as the Polar WordML add-in? Maybe I'm looking for a solution to an artificial problem...&lt;br&gt;&lt;br&gt;Thanks,&lt;br&gt;Marcin</description></item><item><title>re: Intro to Word XML Part 1: Simple Word Document</title><link>http://blogs.msdn.com/b/brian_jones/archive/2005/07/05/intro-to-word-xml-part-1-simple-word-document.aspx#448981</link><pubDate>Mon, 08 Aug 2005 17:17:39 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:448981</guid><dc:creator>I'm Brian Jones</dc:creator><description>Hi Colin, yes, the snippet is correct. Don't think of it as the &amp;quot;bold tag applying to the text&amp;quot;. Instead, think of the bold tag as a property of the text run.&lt;br&gt;This is how formatting works in Word. The document is a collection of text runs contained within paragraphs. Formatting can either be inherited from the paragraph or be applied directly. When formatting is applied directly to text, the selected text is broken out into its own run.&lt;br&gt;&lt;br&gt;The &amp;lt;w:r&amp;gt; is the actual &amp;quot;object&amp;quot;, where the &amp;lt;w:rPr&amp;gt; stores the properties and the &amp;lt;w:t&amp;gt; stores the text for that run.&lt;br&gt;&lt;br&gt;Does that make sense?
&lt;br&gt;&lt;br&gt;-Brian&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=448981" width="1" height="1"&gt;</description></item><item><title>re: Intro to Word XML Part 1: Simple Word Document</title><link>http://blogs.msdn.com/b/brian_jones/archive/2005/07/05/intro-to-word-xml-part-1-simple-word-document.aspx#448404</link><pubDate>Sat, 06 Aug 2005 04:03:41 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:448404</guid><dc:creator>Colin Fox</dc:creator><description>Is this snippet correct? If so, then I don't see how the bold tag will apply to the hello world text, since it is in a nested rPr scope prior to the Hello World:&lt;br&gt;&lt;br&gt;&amp;lt;w:rPr&amp;gt;&lt;br&gt;   &amp;lt;w:b/&amp;gt;&lt;br&gt;&amp;lt;/w:rPr&amp;gt;&lt;br&gt;&amp;lt;w:t&amp;gt;Hello World&amp;lt;/w:t&amp;gt;&lt;br&gt;&lt;br&gt;&lt;br&gt;SHouldn't it be:&lt;br&gt;&amp;lt;w:rPr&amp;gt;&lt;br&gt;   &amp;lt;w:b&amp;gt;&amp;lt;w:t&amp;gt;Hello World&amp;lt;/w:t&amp;gt;&amp;lt;/w:b&amp;gt;&lt;br&gt;&amp;lt;/w:rPr&amp;gt;&lt;br&gt;?&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=448404" width="1" height="1"&gt;</description></item></channel></rss>