<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>ex Scientia : Publishing Workflow</title><link>http://blogs.msdn.com/exscientia/archive/tags/Publishing+Workflow/default.aspx</link><description>Tags: Publishing Workflow</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Peer Review in the Age of Software as a Service</title><link>http://blogs.msdn.com/exscientia/archive/2008/11/03/peer-review-in-the-age-of-software-as-a-service.aspx</link><pubDate>Mon, 03 Nov 2008 09:55:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9031556</guid><dc:creator>pablofe</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/exscientia/comments/9031556.aspx</comments><wfw:commentRss>http://blogs.msdn.com/exscientia/commentrss.aspx?PostID=9031556</wfw:commentRss><description>&lt;P&gt;The &lt;A href="http://journal.mssandbox.net/" mce_href="http://journal.mssandbox.net"&gt;Microsoft eJournal Service&lt;/A&gt; is a good example of a growing trend towards delivering functionality through the Software as a Service approach. &lt;/P&gt;
&lt;P&gt;This web based service enables scientists and researchers to more easily engage in the collaborative process that is the foundation of Scholarly Publishing. The service aims at lowering the technical and financial barriers involved in getting a publication up and running, by removing the need for purchasing and maintaining servers, as well as installing and updating software packages. While some aspects of publishing remain the same, such as producing good research, capturing the results in an article, finding experts to review the article, and polishing the article for publishing, the goal is that the service will make the publishing process more accessible and available to a larger number of scientists. &lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Roles and Workflows &lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The service is based on three key roles: Editors, Reviewers, and Authors. Through the service, Editors can gather and review submissions from Authors, and coordinate the review process with the Reviewers. At the end of the review process, approved articles can be posted to the journal site and/or submitted to repositories, or even passed to other services. &lt;/P&gt;
&lt;P&gt;Underlying the service is a set of workflows, which guide the different participants through the process and help manage the tasks and deadlines. These workflows support the core interactions which underlie the review process, with some options available to configure the workflow. &lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Format Independence and Browser Neutrality &lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;The service does not impose the use of a particular file format, although the Editor can restrict submissions to certain file types if desired, and it should be accessible through any web browser. In selecting a file format, I would advise migrating to XML-based formats, such as OpenXML, which can more easily capture semantics, metadata, and the relationships between content and data, and are more conducive to computer processing for search and semantic analysis. &lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Repositories &lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;At the end of the process, the Editor can configure the service to deposit the articles to different repositories. One of those repositories is &lt;A href="http://arxiv.org/" mce_href="http://arxiv.org/"&gt;ArXiv&lt;/A&gt;, which is very popular for Physics and Math content, and can now be accessed using the SWORD protocol.&lt;/P&gt;
&lt;P&gt;The service can also be used to deposit to other SWORD based archives. This functionality would also be useful for depositing into institutional repositories, and as such, the service could be used to manage the review process for publications such as theses. &lt;/P&gt;
&lt;P&gt;In order to deposit to a repository, you will need a login name and password on the system. The repository may have requirements as to the file formats supported, and their packaging, which you will need to match before submitting. &lt;/P&gt;
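&lt;P&gt;To make the login and packaging requirements above more concrete, here is a minimal sketch of what a SWORD-style deposit request could look like. This is not the code used by the service. The repository URL, credentials, and packaging identifier are all hypothetical placeholders, and the header names follow my reading of the SWORD 1.x profile, so please check your repository's documentation before relying on them. The sketch only builds the request; it does not send it.&lt;/P&gt;
&lt;PRE&gt;
```python
import base64
import urllib.request


def build_sword_deposit(collection_url, package, filename, user, password, packaging):
    """Build (but do not send) an HTTP POST that deposits one package."""
    # HTTP Basic credentials: the login name and password issued by the repository.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        # Tells the repository how the package is laid out (assumed header name).
        "X-Packaging": packaging,
        "Authorization": f"Basic {token}",
    }
    return urllib.request.Request(
        collection_url, data=package, headers=headers, method="POST"
    )


if __name__ == "__main__":
    request = build_sword_deposit(
        "https://repository.example.org/sword/physics",  # hypothetical collection URL
        b"placeholder bytes, not a real zip archive",
        "article.zip",
        "editor",                                        # hypothetical login name
        "not-a-real-password",
        "http://example.org/packaging/placeholder",      # hypothetical packaging id
    )
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request) would perform the actual deposit.
```
&lt;/PRE&gt;
&lt;P&gt;If the repository rejects the package, the usual suspects are the two items mentioned above: the credentials, or a packaging format that the repository does not accept.&lt;/P&gt;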
&lt;P&gt;For folks in BioMed, you can also select to deposit into PubMed Central, and, as noted before, you need to be approved for deposit ahead of time, and have access to the system. &lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Participate! &lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;As with the Authoring Add-in, we welcome your participation and feedback. We would love to hear what we can offer to make you more productive and, hopefully, make the technology disappear in the background, freeing you to focus on the task at hand and simplifying the process. Give the service a try at &lt;A href="http://journal.mssandbox.net/" mce_href="http://journal.mssandbox.net/"&gt;http://journal.mssandbox.net/&lt;/A&gt;. &lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Other Interesting Services &lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Two very interesting recently introduced services are &lt;A href="http://workspace.officelive.com/" mce_href="http://workspace.officelive.com/"&gt;Office Live Workspaces&lt;/A&gt; and &lt;A href="http://smallbusiness.officelive.com/" mce_href="http://smallbusiness.officelive.com/"&gt;Office Live Small Business&lt;/A&gt;. Office Live Small Business is a great example of making online presence and collaboration more accessible through a subscription model, along the lines of what we are trying to achieve with the eJournal Service. For those that are interested in the technical details, underlying Office Live Small Business is Microsoft SharePoint. &lt;/P&gt;
&lt;P&gt;Squarely on the business front are two new software-as-a-service offerings, &lt;A href="http://www.microsoft.com/online/sharepoint-online.mspx" mce_href="http://www.microsoft.com/online/sharepoint-online.mspx"&gt;hosted SharePoint&lt;/A&gt; and &lt;A href="http://www.microsoft.com/online/exchange-online.mspx" mce_href="http://www.microsoft.com/online/exchange-online.mspx"&gt;hosted Exchange&lt;/A&gt;. Besides being useful to small, medium, and even large businesses, both of these services should be useful to universities, colleges, and research institutions.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9031556" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/exscientia/archive/tags/Publishing+Workflow/default.aspx">Publishing Workflow</category><category domain="http://blogs.msdn.com/exscientia/archive/tags/SWORD/default.aspx">SWORD</category><category domain="http://blogs.msdn.com/exscientia/archive/tags/Microsoft+eJournal+Service/default.aspx">Microsoft eJournal Service</category><category domain="http://blogs.msdn.com/exscientia/archive/tags/ArXiv/default.aspx">ArXiv</category></item><item><title>Release Candidate for Article Authoring Add-in Now Available for Download</title><link>http://blogs.msdn.com/exscientia/archive/2008/07/28/release-candidate-for-article-authoring-add-in.aspx</link><pubDate>Mon, 28 Jul 2008 09:54:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8782963</guid><dc:creator>pablofe</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/exscientia/comments/8782963.aspx</comments><wfw:commentRss>http://blogs.msdn.com/exscientia/commentrss.aspx?PostID=8782963</wfw:commentRss><description>&lt;P&gt;We are happy to announce that this morning we &lt;A class="" title=posted href="http://www.microsoft.com/downloads/details.aspx?FamilyID=09C55527-0759-4D6D-AE02-51E90131997E&amp;amp;displaylang=en" target=_blank 
mce_href="http://www.microsoft.com/downloads/details.aspx?FamilyID=09C55527-0759-4D6D-AE02-51E90131997E&amp;amp;displaylang=en"&gt;posted&lt;/A&gt; the Release Candidate build of the Article Authoring Add-in.&lt;/P&gt;
&lt;P&gt;Over the past couple of months, the community has provided very useful feedback based on the Beta&amp;nbsp;1 release.&amp;nbsp; We feel that we have refined the overall experience and addressed the key elements of the feedback we have received in this Release Candidate of the add-in.&amp;nbsp; &lt;U&gt;Thank you&lt;/U&gt; for your engagement and support.&lt;/P&gt;
&lt;P&gt;Starting with a simplified install experience, this latest release has a number of improvements under the covers, from enhancing the XML that is generated to improvements in the user interaction, especially for the Journal Panel.&amp;nbsp; We encourage you to download this new build and evaluate it as part of your workflow.&amp;nbsp; As you think of using the different&amp;nbsp;functionality provided by the add-in, please send us your comments and requests for future releases.&lt;/P&gt;
&lt;P mce_keep="true"&gt;Let's do a quick recap of what the add-in provides:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;DIV mce_keep="true"&gt;Open/Save files into the National Library of Medicine XML format&lt;/DIV&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;XML documents in the NLM format can be opened from within Word, edited, and saved, both as Word files and back again as XML.&amp;nbsp; The add-in also&amp;nbsp;includes support for the NLM book format.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;DIV mce_keep="true"&gt;Access to Metadata from within the Word user interface&lt;/DIV&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;Author, article, and journal metadata is accessible through the user interface exposed by the add-in, enabling the editing of all information that is part of the NLM format.&amp;nbsp; Software developers can also write tools and applications to create or access this data programmatically, for example connecting the data in a document to a database.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;DIV mce_keep="true"&gt;Incorporating NLM semantic elements within the Word document&lt;/DIV&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;Starting with Sections, semantic elements appear explicitly within the document, and enable authoring in a more structured manner, better preparing the document contents for analysis, validation, and search.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;DIV mce_keep="true"&gt;Ability to create and use templates&lt;/DIV&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;The add-in installs a set of example templates: a blank article template, a blank book chapter template, and a sample article template with keywords and sections.&amp;nbsp; The blank templates are particularly useful for starting new articles, or for providing structure to content pasted in from another document.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
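&lt;P mce_keep="true"&gt;As a rough illustration of the programmatic access mentioned in the recap above, the following sketch creates a tiny NLM-style document and reads its metadata back using only the Python standard library. It does not use the add-in or any of its APIs. The element names are the ones I would expect from the NLM journal article format, only a handful of them are shown, and the sample values are made up.&lt;/P&gt;
&lt;PRE&gt;
```python
import xml.etree.ElementTree as ET


def build_article(title, authors, keywords):
    """Create a tiny NLM-style article tree (only a handful of elements)."""
    article = ET.Element("article")
    meta = ET.SubElement(ET.SubElement(article, "front"), "article-meta")
    title_group = ET.SubElement(meta, "title-group")
    ET.SubElement(title_group, "article-title").text = title
    contribs = ET.SubElement(meta, "contrib-group")
    for surname, given in authors:
        contrib = ET.SubElement(contribs, "contrib", {"contrib-type": "author"})
        name = ET.SubElement(contrib, "name")
        ET.SubElement(name, "surname").text = surname
        ET.SubElement(name, "given-names").text = given
    kwd_group = ET.SubElement(meta, "kwd-group")
    for keyword in keywords:
        ET.SubElement(kwd_group, "kwd").text = keyword
    return article


def read_metadata(article):
    """Pull title, authors, and keywords out of an article tree."""
    meta = article.find("front/article-meta")
    authors = [
        (c.findtext("name/surname"), c.findtext("name/given-names"))
        for c in meta.findall("contrib-group/contrib[@contrib-type='author']")
    ]
    return {
        "title": meta.findtext("title-group/article-title"),
        "authors": authors,
        "keywords": [k.text for k in meta.findall("kwd-group/kwd")],
    }


if __name__ == "__main__":
    tree = build_article("A Sample Article", [("Doe", "Jane")], ["xml", "publishing"])
    # Serialize and parse again, as a tool reading a saved file would.
    reloaded = ET.fromstring(ET.tostring(tree))
    print(read_metadata(reloaded))
```
&lt;/PRE&gt;
&lt;P mce_keep="true"&gt;The same dictionary could then be written to, or compared against, a database record, which is the kind of connection between document and database described above.&lt;/P&gt;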
&lt;P mce_keep="true"&gt;We feel that the add-in supports the evolution to the greater use of XML as the underlying format for archiving articles.&amp;nbsp; Especially as part of the transition to electronic-first or electronic-only publishing, the add-in should prove useful in generating XML content, without having first to take articles through the traditional print-oriented and page-layout-based processes.&amp;nbsp; The resulting XML content can then be transformed for presentation, making use of the semantic information in the document to determine presentation parameters.&lt;/P&gt;
&lt;P mce_keep="true"&gt;In addition, the add-in should be particularly useful to journals/publishers in the biomedical fields, where many articles are now required to be submitted to PubMed Central for archival.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8782963" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/exscientia/archive/tags/NLM/default.aspx">NLM</category><category domain="http://blogs.msdn.com/exscientia/archive/tags/Publishing+Workflow/default.aspx">Publishing Workflow</category><category domain="http://blogs.msdn.com/exscientia/archive/tags/XML/default.aspx">XML</category><category domain="http://blogs.msdn.com/exscientia/archive/tags/Release+Candidate/default.aspx">Release Candidate</category></item><item><title>Feedback on your usage model</title><link>http://blogs.msdn.com/exscientia/archive/2008/07/17/feedback-on-your-usage-model.aspx</link><pubDate>Thu, 17 Jul 2008 23:03:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8744907</guid><dc:creator>pablofe</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/exscientia/comments/8744907.aspx</comments><wfw:commentRss>http://blogs.msdn.com/exscientia/commentrss.aspx?PostID=8744907</wfw:commentRss><description>&lt;P&gt;Participation in the Beta program has been fantastic and we have been able to incorporate most of the feedback and requests that you have sent in so far.&amp;nbsp; It is evident that this community is very hands-on and&amp;nbsp;enthusiastic about the authoring and archiving process.&lt;/P&gt;
&lt;P&gt;We have been able to&amp;nbsp;engage in a good dialog with several community members, with folks submitting their sample documents for testing and scenarios as input.&amp;nbsp; While&amp;nbsp;a lot of people have downloaded the add-in, not everyone has contacted us with comments and feedback yet, so I wanted to open up the dialog and solicit your input.&amp;nbsp; &lt;/P&gt;
&lt;P&gt;As expected, the majority of the early adopters are part of the staff at journals, repositories, libraries, and also companies that support the publishing workflow.&amp;nbsp; There are&amp;nbsp;a few enthusiasts,&amp;nbsp;as well as folks interested in the writing process and in capturing semantics, who have also tried out the add-in and sent their feedback.&amp;nbsp; Many thanks to all who have contacted us!&lt;/P&gt;
&lt;P&gt;To help kickstart the broader dialog, here are some questions as to how you are using, or planning to use, the add-in as part of your workflow:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;DIV class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Do you start using the add-in by importing an xml file, or by pasting in content from another document?&lt;/FONT&gt;&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Do you use the add-in for editing content, metadata, or both?&lt;/FONT&gt;&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Do you introduce new sections into documents?&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Do you tend to use custom sections?&lt;/FONT&gt;&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Have you created templates, with sections that you use often, for re-use?&lt;/FONT&gt;&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Have you used the keyword or subject area panel (on the right side of the window)?&lt;/FONT&gt;&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Are there elements in the NLM format which are not currently supported in the add-in which are essential for your workflow?&lt;/FONT&gt;&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV class=MsoNormal style="MARGIN: 0in 0in 10pt"&gt;&lt;FONT face=Calibri size=3&gt;Do you plan to access the content or metadata in the file directly, without using Word (for example using your own internal tools)?&lt;/FONT&gt;&lt;/DIV&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;If there is anything else you want to comment on in relation to the add-in, feel free to send it in.&amp;nbsp; You can post your answers as comments on the blog or send them over email.&amp;nbsp; This is a great opportunity to engage in the development of the add-in and help shape the experience for authoring scientific and technical articles.&lt;/P&gt;
&lt;P&gt;And, once more, many thanks for your participation and input.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8744907" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/exscientia/archive/tags/Publishing+Workflow/default.aspx">Publishing Workflow</category><category domain="http://blogs.msdn.com/exscientia/archive/tags/Beta+1/default.aspx">Beta 1</category></item><item><title>The Power of Structured Content</title><link>http://blogs.msdn.com/exscientia/archive/2008/05/13/the-power-of-structured-content.aspx</link><pubDate>Wed, 14 May 2008 01:07:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8502157</guid><dc:creator>pablofe</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/exscientia/comments/8502157.aspx</comments><wfw:commentRss>http://blogs.msdn.com/exscientia/commentrss.aspx?PostID=8502157</wfw:commentRss><description>&lt;P&gt;As we get comments and questions from you about the add-in, or have opportunities to discuss the experience and underlying technology face-to-face (as at the HighWire Publisher's Meeting last week), it is interesting how frequently the topic of structured content comes up.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Looking Back and Looking Forward&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;Although legacy content (book scanning) gets more awareness, visibility,&amp;nbsp;and coverage, it is just as important, and perhaps even more so, to focus on how &lt;EM&gt;new&lt;/EM&gt; content is being created.&amp;nbsp; As the consumption of content makes the transition from print to digital, and search becomes one of the key ways we come across new content (journals and conferences being the other traditional ways), it is important that we evolve the process by which new content is created, in order to be able to fully exploit the benefits of the new digital medium.&lt;/P&gt;
&lt;P mce_keep="true"&gt;The way articles are commonly authored today is still largely focused on print as the end point, in that there is still a larger focus on presentation over semantics.&amp;nbsp; The semantics in many workflows are added after the article is approved, somewhere along the publishing pipeline, by people other than the article authors.&amp;nbsp;We need to enable authors to add&amp;nbsp;semantics as part of the authoring process, and have that content preserved through the publishing workflow.&amp;nbsp;&amp;nbsp;The best technology available to preserve semantic content today is to&amp;nbsp;express the article in XML.&amp;nbsp; Archiving articles as plain text (either HTML or PDF) results in a loss of information.&amp;nbsp; And, this loss of information is not only detrimental to effective search today, but will also be detrimental to other types of semantic analysis in the future.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Lossy Workflows&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;In some print-focused workflows, even if semantic elements were present in the original file created by the author, the semantics&amp;nbsp;are lost in the digital version, as a result of the process leading to print.&amp;nbsp; At the end of this type of&amp;nbsp;workflow, a PDF file is generated, reflecting the print layout.&amp;nbsp; Any semantic information&amp;nbsp;in the original article is lost in the resulting PDF.&amp;nbsp; The&amp;nbsp;PDF file,&amp;nbsp;in turn, is&amp;nbsp;used to generate an XML file, which is the basis for the digital content (and usually has to be sent out for tagging).&amp;nbsp; These types of workflows not only result in a loss of semantic information, but are inefficient from the point of view of creating digital content.&amp;nbsp; As journals move to be digital first (or digital only), &lt;U&gt;editors will need to pay close&amp;nbsp;attention to how their workflows are structured and how data is preserved/converted&lt;/U&gt;.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Presentation vs Semantics&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;Ideally, if we do a good job at capturing semantics during authoring, we can ignore the final presentation throughout the publishing workflow.&amp;nbsp; Layout and other presentation elements (margins, font family, color, size, etc.) can be applied to the archive version of the document for viewing (for example, during the generation of the HTML files).&amp;nbsp; Of course, we do not want authors dealing with raw XML editing.&amp;nbsp; Having a&amp;nbsp;nice presentation within the word processor helps with authoring, but it does not need to be the &lt;EM&gt;final presentation&lt;/EM&gt;; instead, it can be&amp;nbsp;the presentation that works best for the author.&lt;/P&gt;
&lt;P mce_keep="true"&gt;Currently some Word article templates rely on Styles to identify semantics.&amp;nbsp; This is the best that could be done with the capabilities of previous versions of Word, but it is fragile as it mixes presentation and semantics.&amp;nbsp; To make matters worse, it is easy for styles to "leak" through copy and paste, or editing, invalidating the semantics.&amp;nbsp; Word 2007's use of XML as its native format, its ability to have custom XML elements, and the extensibility of Word's user interface and file packaging, enable a more robust way of entering semantic elements during authoring, preserving metadata, and enabling conversion to other formats.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Capturing Semantics and Authors' Insights&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;Authors will be the largest class of users of the add-in, so we focus a lot on the experience that is presented to this audience.&amp;nbsp; Authors likely will have no idea of the format used to back their articles (whether OpenXML or NLM), nor &lt;U&gt;should they care&lt;/U&gt;.&amp;nbsp; Also, the richness and complexity of the metadata expressible in the NLM format does not need to be exposed to them in a raw form (but needs to be accessible to the journal/archival staff - I will cover this in a future posting).&lt;/P&gt;
&lt;P mce_keep="true"&gt;What is the benefit to authors from capturing semantics and metadata?&amp;nbsp; As semantic search evolves, articles with more/better semantic data should become more relevant in search results than articles without this information.&amp;nbsp; Thus, as content moves to be consumed primarily in digital form, articles with better semantics stand a better chance of being found, read, and cited.&lt;/P&gt;
&lt;P mce_keep="true"&gt;Along the lines of the Dublin Core (or the core of the Core), we are focusing on enabling journals to capture a small set of data from authors:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;DIV mce_keep="true"&gt;Sections (title, abstract, etc)&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV mce_keep="true"&gt;Author information&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV mce_keep="true"&gt;Keywords and subjects&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV mce_keep="true"&gt;Author notes&lt;/DIV&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P mce_keep="true"&gt;While it would be interesting to capture additional information within the content during authoring, it is important not to overburden authors.&amp;nbsp;&amp;nbsp;At least if we manage to get this small set of data reliably, and reduce entry errors, then we will provide a good baseline for metadata in articles.&amp;nbsp; Over time, as authors become comfortable with the concept, and see benefits, the baseline can be moved up (but again, the user interaction needs to be simple and, as much as possible, unobtrusive).&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Additional Reading&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;Peter Murray-Rust has a couple of threads with related topics, on&amp;nbsp;&lt;A class="" title="semantics and chemistry" href="http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=1065" target=_blank mce_href="http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=1065"&gt;semantics and chemistry&lt;/A&gt; and &lt;A class="" title="structured content and PDF" href="http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=1069" target=_blank mce_href="http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=1069"&gt;structured content and PDF&lt;/A&gt;.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8502157" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/exscientia/archive/tags/NLM/default.aspx">NLM</category><category domain="http://blogs.msdn.com/exscientia/archive/tags/OpenXML/default.aspx">OpenXML</category><category domain="http://blogs.msdn.com/exscientia/archive/tags/Publishing+Workflow/default.aspx">Publishing Workflow</category><category domain="http://blogs.msdn.com/exscientia/archive/tags/XML/default.aspx">XML</category></item><item><title>Publishing Workflow – Math content as paths vs glyphs in generated PDF files</title><link>http://blogs.msdn.com/exscientia/archive/2008/04/02/publishing-workflow-when-generating-a-pdf-file-from-word-math-content-may-result-in-paths-instead-of-fonts.aspx</link><pubDate>Thu, 03 Apr 2008 01:10:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:8352598</guid><dc:creator>pablofe</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/exscientia/comments/8352598.aspx</comments><wfw:commentRss>http://blogs.msdn.com/exscientia/commentrss.aspx?PostID=8352598</wfw:commentRss><description>&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;Recently I was involved in diagnosing an issue where, when&amp;nbsp;a PDF file was generated from Word 2007, the Math content from the Word document was being converted to paths, instead 
of being represented by glyphs from the Cambria Math font.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;Note that you can download a free&amp;nbsp;add-in to generate PDF files from Word 2007 from &lt;A title=here href="http://www.microsoft.com/downloads/details.aspx?FamilyID=4d951911-3e7e-4ae6-b059-a2e79ed87041&amp;amp;DisplayLang=en" target=_blank mce_href="http://www.microsoft.com/downloads/details.aspx?FamilyID=4d951911-3e7e-4ae6-b059-a2e79ed87041&amp;amp;DisplayLang=en"&gt;here&lt;/A&gt;.&amp;nbsp; Also, Word 2007 has quite a bit of new Math &lt;A href="http://blogs.msdn.com/murrays/default.aspx"&gt;functionality&lt;/A&gt;, and a beautiful font to go along with it (Cambria Math).&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;When the goal is to generate high quality content, whether Math content&amp;nbsp;is represented as paths or glyphs makes a difference.&amp;nbsp; Note that this is not something that a casual observer would necessarily notice, as seen in these screenshots at 100% magnification.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;The first screenshot is of the content in Word.&amp;nbsp; The second is of the generated PDF file with the content as paths.&amp;nbsp; The third is of the generated PDF file with the content as glyphs.&amp;nbsp; There is very little difference among the three screenshots below (at least to me).&lt;/SPAN&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;&lt;IMG title="Original content in Word" style="WIDTH: 413px; HEIGHT: 90px" height=90 alt="Original content in Word" src="http://www.fernicola.org/loquitor/uploads/equationinword.jpg" width=413 align=middle mce_src="http://www.fernicola.org/loquitor/uploads/equationinword.jpg"&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;Original content in Word&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;&lt;IMG title="Path based content at 100%" style="WIDTH: 338px; HEIGHT: 66px" height=66 alt="Path based content at 100%" src="http://www.fernicola.org/loquitor/uploads/PDFpaths100percent.jpg" width=338 align=middle mce_src="http://www.fernicola.org/loquitor/uploads/PDFpaths100percent.jpg"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;Path based content in Adobe’s PDF viewer (100% zoom)&lt;/SPAN&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;&lt;SPAN style="mso-spacerun: yes"&gt;&lt;IMG title="Glyph based content at 100%" style="WIDTH: 328px; HEIGHT: 90px" height=90 alt="Glyph based content at 100%" src="http://www.fernicola.org/loquitor/uploads/PDFglyphs100percent.jpg" width=328 align=middle mce_src="http://www.fernicola.org/loquitor/uploads/PDFglyphs100percent.jpg"&gt;&lt;/SPAN&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;Glyph based content in Adobe’s PDF viewer (100% zoom)&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;However, when zooming in at 600%, it is possible to start noticing that, in the case where paths were used, the curves have discrete line segments, whereas the glyph version continues to be smooth.&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;&amp;nbsp;&lt;IMG title="Path based content at 600%" style="WIDTH: 450px; HEIGHT: 154px" height=154 alt="Path based content at 600%" src="http://www.fernicola.org/loquitor/uploads/paths600percent.jpg" width=450 align=middle mce_src="http://www.fernicola.org/loquitor/uploads/paths600percent.jpg"&gt;&lt;SPAN style="mso-no-proof: yes"&gt;&lt;/SPAN&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;Path based content at 600% magnification; note the aliasing on the curved segments.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;&lt;IMG title="Glyph based content at 600%" style="WIDTH: 452px; HEIGHT: 131px" height=131 alt="Glyph based content at 600%" src="http://www.fernicola.org/loquitor/uploads/glyph600percent.jpg" width=452 align=middle mce_src="http://www.fernicola.org/loquitor/uploads/glyph600percent.jpg"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;Glyph based content at 600% magnification, perfect!&lt;/SPAN&gt;&lt;/P&gt;
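&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;One way to see why the path version only shows its line segments when zoomed in is a small back-of-the-envelope model. This is my own simplification, not how Word or the PDF add-in actually flattens curves: assume a circular arc of a given radius is replaced by equal straight segments, and measure the largest gap between the arc and the segments. The radius and segment count below are made-up numbers chosen only for illustration.&lt;/SPAN&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
import math


def max_chord_error(radius, segments):
    """Largest gap between a circle and the regular polygon that replaces it."""
    # Each segment spans an angle of 2*pi/segments; the gap is widest at its midpoint.
    return radius * (1.0 - math.cos(math.pi / segments))


def on_screen_error(radius, segments, zoom_percent):
    """The same gap measured in screen pixels at a given zoom level."""
    return max_chord_error(radius, segments) * zoom_percent / 100.0


if __name__ == "__main__":
    # Hypothetical curve: radius of 10 pixels at 100% zoom, split into 16 segments.
    for zoom in (100, 600):
        gap = on_screen_error(10.0, 16, zoom)
        print(f"zoom {zoom}%: gap of about {gap:.2f} px")
```
&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;With these numbers the gap is roughly a fifth of a pixel at 100%, which is too small to see, and a little over one pixel at 600%, which is enough for the corners to show.&amp;nbsp; Glyphs do not have this problem because the viewer renders the font outlines again at every zoom level.&lt;/SPAN&gt;&lt;/P&gt;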
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;Initially we could not reproduce the problem in our environment over here (let me tell you how much I hate it when we cannot reproduce bugs).&amp;nbsp; In talking to the folks that reported the problem, we verified that the fonts were correctly&amp;nbsp;installed and the font file versions were as expected.&amp;nbsp; Folks in the Word team then tracked down under which conditions paths would get generated, and we also found out that the original problem was being seen on a Windows Server 2003 installation, not on a client configuration.&amp;nbsp; &lt;EM&gt;Note that this is not an issue that one would run into with Windows Vista, because it has a different default configuration.&lt;/EM&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;From there it was straightforward to verify and solve the problem.&amp;nbsp; When exporting content, Word checks whether Complex Scripts or Far East scripts are enabled on the machine, to decide whether to generate paths or glyphs for Math content.&amp;nbsp; In case you run into a similar issue, the solution was to enable both scripts on the server (which may require the installation disk and a reboot), through the Languages tab in the Regional Settings control panel.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'"&gt;It is nice to have a happy ending to problems, and in this case&amp;nbsp;being able to preserve high quality Math content.&lt;/SPAN&gt;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=8352598" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/exscientia/archive/tags/Publishing+Workflow/default.aspx">Publishing Workflow</category><category domain="http://blogs.msdn.com/exscientia/archive/tags/PDF/default.aspx">PDF</category><category domain="http://blogs.msdn.com/exscientia/archive/tags/Math/default.aspx">Math</category></item></channel></rss>