In this post, I'm going to give out one of my favorite secrets - how to read specifications quickly, with a high degree of retention. I have a particular technique, and if you use this technique, you may read specs more quickly, and you will remember more after you have read them.
Please excuse the length of this posting - however, I'm going to bet that if you will take the 10 minutes or so that it will take to read it, you might save this time many times over the next time you sit down to read a specification.
A number of years ago, when I owned my small software business, one of our customers was a big defense contractor. We sold our products and services to a group that was working on a more complicated project. I became friends with one of the engineers there. He was one of those old-time engineers - wore pocket protectors - wrote in Fortran when he needed to write code, etc. Great guy. In one of our conversations, he told me about the voluminous specs for the project. They didn't measure the specs in thousands of pages; they measured them in hundreds of thousands of pages. They were using our custom control to write a program to manage all of these specs. And this engineer was familiar with a very big chunk of those specs. I'm such a nerd - he and I talked about specs for hours when we went out for a meal after work. Actually, if you want to find out how big of a nerd I am, just ask my wife :-)
Have you ever asked someone who has a PhD or MD about how much material they had to read in the process of getting their degree? One of my close friends is an MD. I watched her go through medical school, and saw the amount of material that she had to assimilate. It is cool to ask her about a particular disease. She has a nearly photographic memory, and she will more or less mentally open the book where she read about the disease, tell you that there are 12 symptoms, list them, and if the patient has seven or more, further tests are recommended. Of course, she doesn't count - she is brilliant, and it is unfair to compare the rest of us mere mortals with her intellectual capacity.
I think that the capacity of the human mind is marvelous.
I learned my spec reading technique by accident. It is a technique relatively new to me - I started using it only about 2 1/2 years ago, shortly after I started at Microsoft. The situation was this - I ride the bus to work every day, and have about a 45 minute ride, both morning and night. I wanted to make maximum use of this time, so I started reading technical books during my bus ride. However, my commute is broken up. I walk a few blocks, and then wait for 5-10 minutes for a bus. Then, I ride that bus for about 10 minutes. Then, I get off that bus, and wait for 5-10 minutes for another bus, which then takes about 20 minutes to get to the Microsoft campus. I kept finding that each time I had to look up from my book, get on or off the bus, etc., that I would lose my place on the page. So I started carrying pens along with my reading material. After reading (and comprehending) each sentence, I would put a slash through the period of the sentence. Then, when I next continued reading, it was simple to find my spot. I would start with the sentence after the last marked sentence. And of course, I would write editorial comments in the margins.
Well, the funniest thing happened. I felt like I was reading the technical material faster, and I was retaining more. So then, I started making different marks in the books, depending on the content of the sentence that I just read. If I read the sentence, and my mental response was simply, "got it", I would make my simple slash. If I read a sentence, and the assertion in the sentence was a more important one, I would write a sort of check mark - underline at the end of the sentence, and then an upwards slash. If I read the sentence, and I remembered that I had read this fact previously, but had forgotten it, I wrote a loop. And while reading a sentence, if there were lots of words in the sentence that were important to the meaning of the sentence, I would underline each important word. And I felt like my comprehension went up another notch.
It isn't that I go back and make use of these "hieroglyphics", as my friend Doug Mahugh calls my marks. I think that my comprehension has gone up simply because I am bringing more consciousness to my technical reading.
Of course, I was destroying the book or printed spec, but my time is more important than a few dollars for a book, so I felt that ruining the book was a small price to pay for getting the material from the book or spec into my brain, while using my bus ride effectively. Then, for certain books, after reading the book, I would buy a clean copy for my bookshelf for future reference.
Of course, for reading specs, I would simply print out the specs, and use one of those staplers that can staple 50 pages together. After reading, I would just pitch the printout into the recycling bin.
I found that I could predict the rate at which I read books and specifications. Certain books were in the ~50 pages per hour range. Some books are in the ~75 pages per hour range. And some dense books on language or compiler theory are in the 20-25 pages per hour range. Whatever. With each type of material, my speed improved, and my comprehension improved.
Well, this brings us to the Open XML Specification, which is in the 75 pages per hour range. It is easy reading, for the most part. There are a few dense sections, but mostly it is material that a competent XML programmer can get through quickly. And there are lots of pictures, illustrations, examples, and non-normative text, which greatly improve the readability. I've had my job as technical evangelist for Open XML for four weeks, and I've already read a significant chunk of the spec. Of course, I tend to be task focused. For instance, I needed to help some developers who need to deal with change tracking, so I read the 150 pages or so that deal with annotations of all types. It took a couple of hours, pleasantly spent in a coffee shop with great south facing windows. (Necessary light therapy when in Seattle in January!)
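If you want to budget your own reading time, the arithmetic is simple enough to sketch in a few lines of Python. This is only an illustration: the rates are the rough figures quoted in this post, the names `RATES`, `hours_to_read`, and `commute_days` are made up for the example, and the 90 minutes of reading per day assumes that an entire 45 minute commute each way is usable, which is optimistic.

```python
import math

# Rough reading rates in pages per hour, taken from the figures in this post.
RATES = {
    "dense_theory": 22.5,   # midpoint of the 20-25 range
    "typical_book": 50,
    "open_xml_spec": 75,
    "dispositions": 100,
}

def hours_to_read(pages, pages_per_hour):
    """Estimated hours needed to read `pages` at a steady rate."""
    if pages_per_hour <= 0:
        raise ValueError("pages_per_hour must be positive")
    return pages / pages_per_hour

def commute_days(pages, pages_per_hour, minutes_per_day=90):
    """Whole commuting days needed, assuming `minutes_per_day` of reading time.

    The default of 90 minutes is an assumption (45 minutes each way).
    """
    total_minutes = hours_to_read(pages, pages_per_hour) * 60
    return math.ceil(total_minutes / minutes_per_day)

# The 150 pages on annotations, at the Open XML spec rate.
print(hours_to_read(150, RATES["open_xml_spec"]))  # 2.0
print(commute_days(150, RATES["open_xml_spec"]))   # 2
```

At 75 pages per hour, the 150 pages on annotations come out to two hours, which matches the couple of hours in the coffee shop.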
There are three types of readers of the Open XML spec:
I think that the complaints about the length of the dispositions are interesting. Most of the dispositions have a fair amount of "white space" in them, such as a table that repeats the comment originally made by the member body, the proposed change by the member body, a section listing similar comments by other member bodies, and the like. And every disposition is started on a new page, so the last page is nearly blank in many cases. The dispositions are more in the 100 pages per hour range, for me. The length of the dispositions is simply an indication of how seriously the Ecma TC45 technical committee took the comments.
So from a common sense point of view, here is what I see:
The Open XML specification is a few thousand pages long. ISO ratification of this specification is important to the productivity of the entire world. Let's face it, there are a huge number of documents that are stored in a binary format that is convertible to Open XML with high fidelity. The passing of this specification means that developers and system designers will be able to rely on the format of documents. They will build innovative solutions (both open source and commercial) that will literally empower the productivity of the world for years to come.
In terms of spec size to benefits, this is a bargain.
"ISO ratification of this specification is important to the productivity of the entire world"
Exaggeration at its best. Up there with Salk and Einstein?
"Leave the assessment of the length of the spec to the people competent to make that assessment."
What is the length of the spec? Page count is a poor measure. Word count is slightly better, but not always indicative.
A close approximation is the Zipped file size - how big is the Zipped spec?
WRT how important the spec is, reasonable people can disagree. It's not possible to get an assessment of the number of binary docs that can be converted to Open XML with high fidelity, but I'm sure that we agree that the number is huge.
About the length of the spec when zipped, I'm sure that you know how to run WinZip ;)
How do a huge number of existing documents that can be converted from one format to another format bear on the importance of the -ratification- of a document that describes one of the formats?
The existence of those documents is a drain on productivity, as they will either a) need time and effort to convert, to manage the conversion, and to prevent documents from being re-converted (to avoid anomalies), or b) require all future applications to maintain the conversion segment for each old format and to each newer one.
Most of the huge number of documents are in the -other- format and the -other- format has no ratified ISO standard. Having the older formats ISO ratified would make a productivity boost, especially if it included a validation suite to ensure applications were generating valid files.
99.999999% of existing MS Binary-format files would have been better off being preserved as PDFs. The need to view and print existing documents far exceeds the need to edit them. Editing documents is often undesirable, such as for legal documents.
Not all Microsoft bloggers know about WinZip or that the zip method used by MS Office 2007 is sub-optimal (~20% larger) WRT WinZip. I'm impressed.
In the Office Open XML Overview, in section 2, entitled, “Purposes for the Standard”, the first sentence says: “OpenXML was designed from the start to be capable of faithfully representing the pre-existing corpus of word-processing documents, presentations, and spreadsheets that are encoded in binary formats defined by Microsoft Corporation.”
You may not agree with the stated purpose of the Ecma 376 standard, however Ecma’s standard is in alignment with its purpose.
What we're trying to do here is to provide the most seamless way forward for all these people who have binary documents. Of course, IBM would rather force people to convert to ODF, and then sell consulting services to aleve their pain.
With regards to the desirability of the binary formats over an XML format, I personally think that it is a no-brainer to allow developers to use any of the really good XML programming APIs, instead of munging around in a binary file. The CTO of ThinkFree (www.thinkfree.com) indicated that it was far easier to work with Open XML than the binary formats, and his company has done extensive work with both.
I think that your assertion regarding binary format files would be better preserved as PDF is not correct. It certainly is not what we hear from our customers.
And finally, regarding your comment about WinZip, sarcasm is not the most effective tool to convince people. I'd like to invite you to move the discussion up a level. Let's argue with supported assertions, politely phrased opinions, etc. :)
"What we're trying to do here is to provide the most seamless way forward for all these people who have binary documents."
Even a plane going in a circle around the planet is going 'forward'. Forward is a direction in the context of a destination.
What is the destination that MSO-XML and ECMA(...) were designed for? Re-representing existing docs on its own is a useless destination.
"IBM would rather force people to convert to ODF, and then sell consulting services to aleve(sic) their pain."
Wouldn't Microsoft be in a much better position to sell consulting services?
Mine - "MS Binary-format files would have been better off being preserved as PDFs" Yours - "it is a no-brainer to allow developers to use any of the really good XML programming APIs, instead of munging around in a binary file."
Another orthogonal answer. I wrote of preserving documents, you reply about creating documents.
ThinkFree supports saving as PDF. Their Flash presentation has a number of errors as well, some spelling, esp slide #5/11. Is ThinkFree "the compatible" alternative to Office? The first three bullets on slide #9 are interesting as well.
A company offering professional support should be able to produce a professional advertisement. Likewise a poor presentation suggests a lack of attention to detail. Not a good reference.
Still no size of the zipped spec?
I asked someone - Gray Knowlton - why the zipped version of a '97 doc was smaller than the same-version of an '07 docx, given that zip is used to compress it and smaller file size was one benefit of MSO-XML.
The answer was something about robustness.
Then I thought - un-zip and re-zip the docx. Ta-da: 20% smaller than the original .docx file and a little smaller than the zipped '97 .doc.
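For anyone who wants to repeat the un-zip and re-zip experiment, here is a minimal sketch using Python's standard `zipfile` module. The function name `rezip` and the file names in the usage comment are hypothetical, and maximum-level DEFLATE is an assumption standing in for whatever settings WinZip used. How much smaller the result is will depend on the document; the 20% figure above is one observation, not something this code guarantees.

```python
import os
import zipfile

def rezip(src_path, dst_path, level=9):
    """Copy every entry of the zip archive at src_path into a new archive
    at dst_path, recompressing each entry with DEFLATE at `level`.

    Returns (original_size, recompressed_size) in bytes.
    """
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(
        dst_path, "w", compression=zipfile.ZIP_DEFLATED, compresslevel=level
    ) as dst:
        # Keep the entries in their original order and under the same names.
        for info in src.infolist():
            dst.writestr(info.filename, src.read(info.filename))
    return os.path.getsize(src_path), os.path.getsize(dst_path)

# Hypothetical usage:
# old, new = rezip("report.docx", "report-rezipped.docx")
# print(f"{100 * (old - new) / old:.1f}% smaller")
```

The entry contents are copied byte for byte, so only the compression changes. Whether a given application opens the recompressed package without complaint is worth checking rather than assuming.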
So I asked what was left out of the .docx file that was in the '97, as compression is limited by the true information content.
No answer from Gray.
Weren't you sarcastic to begin with - "About the length of the spec when zipped, I'm sure that you know how to run WinZip ;)"? Complete with emoticon and all. It looks intended to be demeaning.
The ratification question is still up for answer also. Java has no ISO spec, but it seems to do as well as MSO-XML would have done by simply publishing MSO-XML via the Web.
Again, why was ratification so important?
Dave S., I hate to butt in, but Eric pretty clearly answered your concern about whether it's better to store documents as PDF or in an editable format: `I think that your assertion regarding binary format files would be better preserved as PDF is not correct. It certainly is not what we hear from our customers.'
Obviously these are just opinions. If Eric were to provide a reference to a plausible sample survey done among customers, we could see for ourselves whether they prefer to edit documents or keep them archived forever. But then if it's the latter, where's the need for a format like ODF?