The easiest way to have a look at the Longhorn SDK is to check it out here on MSDN. The entire site you see there is built using the tools developed internally by the Longhorn SDK team.

The 'big picture' process is pretty simple: a tool called “Olympia” uses .NET reflection to query the metadata in all the Longhorn assemblies, then generates a set of individual XML documents representing the entire Longhorn API set. These are skeleton XML documents, essentially devoid of authored content. Each document contains some metadata about the API syntax and the document itself, a <content> section where a programmer/writer can add authored content, and a section holding status data that writing managers use to track the progress of the documentation set. At this point the status of the document is 'created by Olympia'. Olympia also contains code to integrate content from external XML sources, such as the developer XML comment files that Visual Studio generates with the /doc compiler switch, but I'll discuss that later.
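To make the skeleton-generation step a bit more concrete, here is a rough sketch in Python. It is not Olympia's code: Olympia uses .NET reflection over the Longhorn assemblies, while this analogue uses Python's `inspect` module over the standard `json` module. The element names `<topic>`, `<metadata>`, `<syntax>` and `<status>`, and the function names, are assumptions made up for illustration; only the `<content>` section and the 'created by Olympia' status come from the description above.

```python
import inspect
import json
import xml.etree.ElementTree as ET


def build_skeleton(qualified_name, obj):
    """Return a skeleton <topic> element for one API member (hypothetical schema)."""
    topic = ET.Element("topic", {"api": qualified_name})

    # Metadata derived from reflection: here, just the call signature.
    metadata = ET.SubElement(topic, "metadata")
    try:
        signature = str(inspect.signature(obj))
    except (TypeError, ValueError):
        signature = ""  # some callables expose no introspectable signature
    ET.SubElement(metadata, "syntax").text = qualified_name + signature

    # Empty placeholder that a programmer/writer fills in later.
    ET.SubElement(topic, "content")

    # Tracking data for writing managers.
    ET.SubElement(topic, "status").text = "created by Olympia"
    return topic


def generate_skeletons(module):
    """Map each public function in `module` to its skeleton document."""
    docs = {}
    for name, obj in inspect.getmembers(module, inspect.isfunction):
        if name.startswith("_"):
            continue
        qualified = f"{module.__name__}.{name}"
        docs[qualified] = build_skeleton(qualified, obj)
    return docs


# Stand-in for "all the Longhorn assemblies": one standard-library module.
skeletons = generate_skeletons(json)
print(ET.tostring(skeletons["json.dumps"], encoding="unicode"))
```

The point of the sketch is the shape of the output: one document per API member, with generated syntax metadata, an empty authoring slot, and a status marker.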

So, how does a programmer/writer add content to these newly generated XML documents and have that content persist from day to day? We maintain a source control server that contains a static set of XML documents. Every day after Olympia runs, a process called “OlympiaDiff” follows behind: it examines each document in source control and compares it, signature by signature, with the newly generated document produced by Olympia from the latest Longhorn build. OlympiaDiff queries the relevant sections in each document to determine whether the API has changed in Longhorn. If it discovers a difference, it generates a log entry. Each day the programmer/writers check the OlympiaDiff log to see whether there's been a change in the APIs they maintain, and some automation helps them move their authored content to the correct document. The assumption here is that the authored XML content in source control is always the most accurate representation of the Longhorn API.
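The signature-by-signature comparison could look something like the following Python sketch. This is only a guess at the general idea, not OlympiaDiff itself: the document schema, the `Sample.*` API names, and the ADDED/REMOVED/CHANGED log format are all invented for the example.

```python
import xml.etree.ElementTree as ET


def read_signature(xml_text):
    """Pull the API signature out of a document (hypothetical <syntax> element)."""
    return ET.fromstring(xml_text).findtext("metadata/syntax")


def diff_documents(authored, generated):
    """Compare documents keyed by API name and return a list of log entries."""
    log = []
    for api in sorted(authored.keys() | generated.keys()):
        old = read_signature(authored[api]) if api in authored else None
        new = read_signature(generated[api]) if api in generated else None
        if old is None:
            log.append(f"ADDED {api}: {new}")
        elif new is None:
            log.append(f"REMOVED {api}: {old}")
        elif old != new:
            log.append(f"CHANGED {api}: {old} -> {new}")
    return log


def make_doc(syntax, content=""):
    """Build a tiny sample document; real documents would carry far more data."""
    return (f"<topic><metadata><syntax>{syntax}</syntax></metadata>"
            f"<content>{content}</content></topic>")


# Documents in source control, with authored content (made-up sample data).
authored = {
    "Sample.Show": make_doc("void Show()", "Displays the window."),
    "Sample.Close": make_doc("void Close()", "Closes the window."),
}

# Fresh skeletons from the latest build (made-up sample data).
generated = {
    "Sample.Show": make_doc("void Show(bool activate)"),
    "Sample.Open": make_doc("void Open()"),
}

log = diff_documents(authored, generated)
for entry in log:
    print(entry)
```

Note that only the signature is compared, never the `<content>` section, which fits the assumption that the authored content in source control is the part worth preserving.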

Each day the latest set of XML documents is copied down from source control, then the build and transform tool named “Red October” does some slicing and dicing on the XML data and performs an XSLT transform to build the documents you see on the Longhorn SDK site. There's a lot of data manipulation going on at build time: dereferencing links, building tables of information, building the dynamically generated content on the main portal pages, and so on.
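As an illustration of one of those build-time steps, here is a small Python sketch of link dereferencing. It is not how Red October works internally: the real tool performs an XSLT transform, whereas this sketch uses plain `xml.etree.ElementTree` because Python's standard library has no XSLT processor. The `<see cref="..."/>` link element, the `.html` naming scheme, and the titles table are assumptions for the example.

```python
import xml.etree.ElementTree as ET


def dereference_links(topic, titles):
    """Rewrite each <see cref="..."/> in place as an <a href="...">title</a> link."""
    # Materialize the list first, since the loop changes element tags.
    for link in list(topic.iter("see")):
        target = link.get("cref")
        link.tag = "a"
        link.attrib.clear()
        link.set("href", target + ".html")
        # Fall back to the raw API name when no title is known.
        link.text = titles.get(target, target)
    return topic


# Authored content containing a symbolic link to another API (made-up sample).
topic = ET.fromstring('<content>Call <see cref="Sample.Open"/> first.</content>')

# Lookup table built from the whole document set at build time (made-up sample).
titles = {"Sample.Open": "Open method"}

html = ET.tostring(dereference_links(topic, titles), encoding="unicode")
print(html)
```

The useful property is that writers only ever type a symbolic reference; the build resolves it to a real page and a display title, so links stay correct when pages are regenerated.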

The process is obviously a little more complicated than this in practice, but that's pretty much it.