Stephen was telling me about an article from IBM that demonstrated:
"consuming and repurposing MS Office 2007 documents is easy with IBM DB2 pureXML features. There really is not much code involved which is conducive to great performance."
It sounded pretty cool: it provided an example .docx file as well as some PHP examples for doing things like repurposing a .docx into a simple HTML paragraph. The article isn't live anymore (it looks like it's under construction), and I think that's because there was some confusion about what an XPS document is and what an OpenXML document is. The original article (google cache here) describes how to parse an OpenXML .docx file, but it refers to it as an XPS document. Hopefully we'll see it back up there soon!
The article was great in that it touched on some of the other benefits of OpenXML that people haven't been talking about as much. Most folks in the standards discussions are focused on how different Office productivity tools (such as MS Word or OpenOffice) can create and consume the files. The other really valuable piece of an open format, though, is that you can use these documents in new ways you hadn't thought of before. This article from IBM shows how you can consume an OpenXML document and store pieces of it in a database. You could also go the other way around and build documents from the data in a database, which gets into the various document assembly solutions you could build. Very cool stuff.
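
To make the "consume it and store pieces in a database" idea a bit more concrete, here's a rough sketch in Python. To be clear, this is not the code from the IBM article (that used PHP and DB2 pureXML). I'm assuming the standard library's sqlite3 as a stand-in database, and the function names, table name, and sample text are all made up for illustration. It relies on the fact that a .docx is a ZIP package whose main body lives in word/document.xml.

```python
import io
import sqlite3
import zipfile
import xml.etree.ElementTree as ET
from html import escape

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_paragraphs(docx_bytes):
    """Return the plain text of each non-empty paragraph in a .docx package."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread across <w:t> elements inside its runs.
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        if text:
            paragraphs.append(text)
    return paragraphs


def store_paragraphs(conn, doc_name, paragraphs):
    """Store each paragraph as a row (hypothetical schema, SQLite stand-in)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS paragraphs "
        "(doc TEXT, position INTEGER, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO paragraphs VALUES (?, ?, ?)",
        [(doc_name, i, text) for i, text in enumerate(paragraphs)],
    )
    conn.commit()


def paragraphs_as_html(conn, doc_name):
    """Go the other way: assemble simple HTML paragraphs from stored rows."""
    rows = conn.execute(
        "SELECT body FROM paragraphs WHERE doc = ? ORDER BY position",
        (doc_name,),
    )
    return "\n".join(f"<p>{escape(body)}</p>" for (body,) in rows)


def make_sample_docx(texts):
    """Build a tiny in-memory package for the demo.

    It contains only word/document.xml, so it is enough for this sketch
    but is not a complete .docx that Word would open.
    """
    body = "".join(
        f"<w:p><w:r><w:t>{escape(t)}</w:t></w:r></w:p>" for t in texts
    )
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", xml)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = make_sample_docx(["Hello & welcome", "Second paragraph"])
    conn = sqlite3.connect(":memory:")
    store_paragraphs(conn, "sample.docx", extract_paragraphs(sample))
    print(paragraphs_as_html(conn, "sample.docx"))
```

For a real file you'd pass in the bytes of the .docx (for example, `open(path, "rb").read()`) instead of the generated sample. With an XML-aware database like the one in the article, you could presumably store the XML itself and query into it rather than flattening it to text first, but the overall shape is the same: unzip, pull out the parts you care about, and put them somewhere you can query.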