Clinick's Clinic

Ramblings on iHD, HD DVD, gadgets in general

What is the document?

Historically, Office documents have been self-contained and required storage on disk. This worked well in the unconnected Office world and still has a place for personal productivity in the .NET world, but increasingly documents are becoming much more about presenting data in a rich way. The arrival of XML as the storage mechanism for Office .NET makes the data/presentation split even stronger, since any application can inspect an Office document’s data without having to go through the Office object model. The move to XML opens up considerable capabilities for Office .NET, but it does mean that we have to rethink how documents are stored and viewed.

For example, when you go to the internal budget site, http://budget, it provides you with a page that contains all your budgets. To the end user it looks as though the server has a page with their data stored somewhere, but in reality the page is just the result of a database query with some presentation logic applied. Imagine if the budget site provided you with an Excel spreadsheet containing all your budgets. You’d get to it by making an HTTP request that includes your username; the server logic would query the relevant database with your user ID and then return the Excel content. The Excel document would never exist other than in the server logic that knows how to construct the spreadsheet. The document only exists when you ask for it; there’s never a copy on the server. It’s kind of like the Matrix: you think it’s there because you can see it and interact with it, but in reality it’s just a creation of the server to fool you into thinking it has a document for you.
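To make the budget example concrete, here is a minimal Python sketch of a document that exists only as the result of a query. Everything in it is an assumption for illustration: the `budgets` table, the sample rows, the function names, and the simplified `Workbook`/`Worksheet`/`Row`/`Cell` element names, which stand in for the real Office XML schema rather than reproduce it. An in-memory SQLite database plays the part of the budget database.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_demo_db():
    """Create an in-memory database with a hypothetical budgets table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE budgets (owner TEXT, name TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO budgets VALUES (?, ?, ?)",
        [
            ("alice", "Travel", 1200.0),
            ("alice", "Hardware", 3400.0),
            ("bob", "Training", 800.0),
        ],
    )
    return conn


def render_budget_sheet(conn, username):
    """Build spreadsheet-like XML for one user at request time.

    Nothing is written to disk: the returned string is the only place
    the "document" exists. The element names are illustrative only.
    """
    workbook = ET.Element("Workbook")
    sheet = ET.SubElement(workbook, "Worksheet", {"Name": "Budgets"})
    rows = conn.execute(
        "SELECT name, amount FROM budgets WHERE owner = ? ORDER BY name",
        (username,),
    )
    for name, amount in rows:
        row = ET.SubElement(sheet, "Row")
        ET.SubElement(row, "Cell", {"Type": "String"}).text = name
        ET.SubElement(row, "Cell", {"Type": "Number"}).text = str(amount)
    return ET.tostring(workbook, encoding="unicode")
```

A web handler for http://budget would call something like `render_budget_sheet(conn, username)` on each request and return the string as the response body; two users hitting the same URL get two different documents, and neither is ever stored.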

 

What does this all mean?

By freeing Office from having to have physical files on disk, it becomes much more flexible and more attuned to the needs of the solution, rather than the solution having to bend to meet Office’s requirements. For example, imagine a customer tracking system for a life assurance company. The sales rep visits you at home and convinces you that he has the best life assurance policy for you. You agree and sign up with him. The sign-up process captures some information about you (social security number, address, etc.) in the sales force automation application on the sales rep’s laptop. The data gets entered into a local SQL database, ready for replication when the rep connects back to the life assurance extranet that evening. (Of course, in the brave new world the rep would be connected over a 3G cell connection and it would be immediate, but hey, Rome wasn’t built in a day!)

When the data is received by the company’s SQL server, it creates a work item for customer services to make sure that the data is OK and that the rep is following the rules. This is mostly an automatic process done by the business application, but some human input is required to double-check potential anomalies (the data could have been entered incorrectly, the sales rep could be making this stuff up, etc.). Assuming the data is OK, the system needs to print out a welcome letter to the customer and send them an email welcoming them to the life assurance company. To do this, the system creates a new Word document, using the customer XSD and a transformation to get the cool layout features provided by Word, and prints it out.
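The last step, turning customer data into a letter through a transformation, can be sketched roughly as follows. This is only an illustration under stated assumptions: a real system would presumably apply an XSLT that emits Word's XML format, whereas here a plain Python template stands in for the transformation, and the `customer` and `letter` element names, the sample data, and the `transform_customer` function are all made up.

```python
import xml.etree.ElementTree as ET
from string import Template
from xml.sax.saxutils import escape

# Hypothetical customer record, shaped the way a customer XSD might describe it.
CUSTOMER_XML = """<customer>
  <name>Pat Example</name>
  <address>1 Sample Street</address>
  <policy>Term Life 20</policy>
</customer>"""

# Stand-in for the presentation layer: a fixed layout with holes for the data.
LETTER_TEMPLATE = Template(
    "<letter>"
    "<to>$name, $address</to>"
    "<body>Welcome, $name. Your $policy policy is now active.</body>"
    "</letter>"
)


def transform_customer(customer_xml):
    """Turn customer data into letter markup; the data itself is unchanged."""
    customer = ET.fromstring(customer_xml)
    # Escape each value so customer text cannot break the generated markup.
    fields = {child.tag: escape((child.text or "").strip()) for child in customer}
    return LETTER_TEMPLATE.substitute(fields)
```

The point of the split is that the same customer record could be pushed through a different template (an email, a renewal notice) without touching the data.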

 

Everything is going fine, and the customer is happy for the first year. Then, seeing a competitor’s commercial on TV offering better rates, the customer phones the life assurance company and speaks to a customer services rep. The rep manages to deflect the competitor’s advantage and upsells the customer onto a better life assurance policy. As a result, the rep wants to send a letter thanking the customer for their business and providing some more information on the benefits of the new policy. The customer services rep clicks on the generate-letter hyperlink/button in their customer service app. The customer services application creates a Word file using the standard new business template, based on the customer and policy XSDs. The customer services rep gets a nicely formatted Word document on their screen, adds some welcoming text and prints out the document.

Once the document is printed, it can be saved back to the server. The save generates an HTTP request back to the web server, where custom code handles the save by taking the content of the document (which is now structured) and storing the information in the SQL database; the Word document itself is never stored on the server. If a user wants to see the correspondence with the customer, they go to the correspondence page of the customer services application, which provides a list of all the documents that have been sent to the customer. When you click on one of the documents, it is recreated from the database and transformed into Word.
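The save-and-recreate round trip might look something like the sketch below. It is a simplification built on assumptions: the `correspondence` table, the `letter`/`subject`/`body` element names, and all function names are hypothetical, SQLite stands in for the company's SQL database, and the HTTP layer is left out so that only the storage logic is shown.

```python
import sqlite3
import xml.etree.ElementTree as ET


def open_store():
    """Create an in-memory stand-in for the correspondence database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE correspondence ("
        "id INTEGER PRIMARY KEY, customer_id TEXT, subject TEXT, body TEXT)"
    )
    return conn


def save_letter(conn, customer_id, letter_xml):
    """Handle a save: keep the letter's fields, discard the document itself."""
    letter = ET.fromstring(letter_xml)
    cur = conn.execute(
        "INSERT INTO correspondence (customer_id, subject, body) VALUES (?, ?, ?)",
        (customer_id, letter.findtext("subject"), letter.findtext("body")),
    )
    conn.commit()
    return cur.lastrowid


def list_letters(conn, customer_id):
    """Back the correspondence page: one (id, subject) pair per letter sent."""
    return conn.execute(
        "SELECT id, subject FROM correspondence WHERE customer_id = ? ORDER BY id",
        (customer_id,),
    ).fetchall()


def load_letter(conn, letter_id):
    """Recreate the letter markup from the stored fields on demand."""
    row = conn.execute(
        "SELECT subject, body FROM correspondence WHERE id = ?", (letter_id,)
    ).fetchone()
    if row is None:
        raise KeyError(letter_id)
    letter = ET.Element("letter")
    ET.SubElement(letter, "subject").text = row[0]
    ET.SubElement(letter, "body").text = row[1]
    return ET.tostring(letter, encoding="unicode")
```

Because only the fields are stored, the same rows can be queried, reported on, or rendered with a newer template later, which is much harder to do with a folder of saved files.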

 

Published Wednesday, May 26, 2004 10:09 PM by andrewclinick

Comments

 

Charlie said:

I think this is a pretty fascinating concept. I've worked as a developer in the legal industry for years, and have for some time been struck by how their most valuable data (basically the thousands upon thousands of Word documents they create) are stored in a very primitive way (i.e. binary files).

I've seen some movement toward systems that try to get past the file paradigm and store parts of documents (i.e. clauses) - but still the big document management systems are completely based on storing documents at the file level.

From what I've seen in the "clause" efforts to date, the need for the data to end up as a Word file has been a technical problem - formatting issues are really a big deal in a law firm for a lot of reasons. So you end up in a situation where, while the data may be in a database, it somehow has to get into a highly formatted Word document (like a pleading in outline form with line numbers). We considered at one point trying to write a server side component that could transform data into the Word HTML format - but the complexity was very high.

The Yukon SQL XML data type, Office Smart documents, web services - all seem to suggest that the big shift may finally happen, where content is truly separated from presentation and so becomes much more reusable and discoverable.

Another aspect is the whole "knowledge management" thing. The legal business has gone through a whole cycle of search technologies (something the big DMSs also provide). But the general feeling is that this hasn't been too successful - usually because the metadata just isn't rich enough, keyword searches return far too many hits, and the corpus is full of drafts, outdated materials, and just junk.

But - if documents are broken down and stored in more granular parts, and also associated with more sophisticated metadata, and - maybe more importantly - the content authors' understanding of what they're doing when creating a "document" evolves (perhaps they start to think like developers and see document clauses as reusable components), retrieval and repurposing could also become much, much better.

Anyway, I could go on and on, but I'm climbing off my soapbox now :) Very cool concept though. I really think someday network shares containing thousands and thousands of files will be a thing of the past...

charlie


May 27, 2004 9:09 AM
