Channel 9 has two new videos of Scott Guthrie demonstrating features in the upcoming releases of Visual Studio 2010 and ASP.NET MVC 2. These two long videos are from a recent session in the Netherlands and basically provide a half-day overview and demonstration of the products. Although there is some continuity between the two, they're both also easily watchable on their own.
Now that I've covered the essentials of the binary format, those interested might want to try their hand at translating an encoded message. This message uses many of the constructs you've seen plus a few more I'll outline here.
You should be able to find everything else you might need in past parts of the series:
Now, here's the message to work on.
0x56, 0x02, 0x0B, 0x01, 0x73, 0x06, 0x0B, 0x01, 0x61, 0x04, 0x56, 0x08, 0x44, 0x0A, 0x1E, 0x00, 0x98, 0x01, 0x31, 0x98, 0x01, 0x61, 0x01, 0x01, 0x56, 0x0E, 0x98, 0x07, 0x4D, 0x65, 0x73, 0x73, 0x61, 0x67, 0x65, 0x01, 0x01
There are 14 records in this message and you'll first probably want to find all of the record boundaries. For each record, you need to determine its type and then based on the record type, the number of trailing bytes that are part of the record.
Past parts in the series:
We looked last time at some of the patterns used in the binary format for reducing the size of a document. So far we'd managed to trim 2 bytes off of the Envelope element record that I'd been using as an example (from 12 bytes down to 10). However, we can still do better given that we'll likely be sending that same element again and again.
String tables are a way of compressing repetitive text. The table associates a token with each string. The writer of the message replaces a portion of the message as it is being output with the token. The reader of the message receives both the encoded text as well as the string table, allowing them to reverse the process.
There are many ways that the string table might be communicated. For example, the table might be communicated along with the text in a header, as is typical for compression programs. The simplest mechanism, though, is to assume that the reader and writer have exchanged the string table ahead of time through some out-of-band exchange. Then, there's no need to represent the string table or any information about it during the message exchange.
We sometimes use the binary format together with a signal for such an implied use of a string table. This is called the static string table. Our static string table is populated with a variety of strings related to SOAP messages. For example, here are the first eight strings in the table (out of many dozens):
These strings in the static string table are given tokens following the even non-negative numbers. Therefore, the list I gave you represents strings for the tokens 0, 2, 4, 6, and so on.
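Here's a minimal sketch of the even-number token scheme. It is not the real static string table: the entries are placeholders, except that "Envelope" sits in the second slot so that it gets the token 0x02 used in this series. The names SAMPLE_TABLE, token_for, and string_for are made up for the illustration.

```python
# Hypothetical stand-in for a static string table. Only the position of
# "Envelope" (second entry, token 0x02) comes from the text; the other
# entries are placeholders.
SAMPLE_TABLE = ["placeholder-0", "Envelope", "placeholder-2", "placeholder-3"]


def token_for(text):
    """Return the even token for a string that is in the table."""
    return 2 * SAMPLE_TABLE.index(text)


def string_for(token):
    """Return the string that an even token stands for."""
    if token % 2 != 0:
        raise ValueError("static table tokens are even in this sketch")
    return SAMPLE_TABLE[token // 2]
```

The writer would call something like token_for while producing the message and the reader would call string_for while consuming it. Both sides need the same table, which is why it has to be agreed on ahead of time.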
Now, I can introduce a third (and fourth) kind of record for declaring an element. A dictionary element record is similar to an element record except that the length of the element name and the bytes for the element name are replaced by the value of the token. Similarly, there are prefix dictionary elements like I introduced last time that allow us to skip both the prefix string as well as the element name string.
The record types 0x44 through 0x5D represent prefixed elements for the prefixes that are the lowercase letters "a" through "z". We can now shrink the element record I've been using as an example even further. The string Envelope has the token 0x02 so we can write the s:Envelope element record now in merely two bytes:
0x56, 0x02
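As a rough sketch, building that two-byte record could look like the following. The function name is invented for the example, and it assumes the token is small enough to fit in a single byte; larger tokens would need the variable-sized integer encoding from earlier in the series.

```python
# Record type for a prefix dictionary element with the prefix "a". The
# range 0x44 through 0x5D covers "a" through "z", as described above.
PREFIX_DICTIONARY_ELEMENT_A = 0x44


def prefix_dictionary_element(prefix, token):
    """Encode a prefix dictionary element record for a one-letter prefix."""
    if len(prefix) != 1 or not "a" <= prefix <= "z":
        raise ValueError("prefix must be a single lowercase letter a-z")
    if not 0 <= token < 0x80:
        raise ValueError("this sketch only handles single-byte tokens")
    record_type = PREFIX_DICTIONARY_ELEMENT_A + (ord(prefix) - ord("a"))
    return bytes([record_type, token])
```

Calling prefix_dictionary_element("s", 0x02) gives the bytes 0x56, 0x02 because "s" is the nineteenth letter, so its record type is 0x44 plus 18.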
Tomasz Janczuk has posted another example using the HTTP polling duplex channel in Silverlight to build pub-sub style applications. This sample addresses the use of polling in a scaled-out configuration.
The solution employed is to move the server queue of messages from in-memory state to a shared store. The service also has to take more of the polling logic on directly to handle the fragmentation of data across sessions, introducing some additional limitations into the contract of the service.
This solution uses a variation of the WS-MakeConnection protocol to handle the service-level responsibilities of polling. This type of bootstrapping for connection establishment is actually very similar to how we came to the original design of the polling protocol itself. I suspect that if the server-side queue were moved to a remote store, server processes could come and go using the existing connection reestablishment of the protocol to resume both what the client and server were doing. That is also similar to (but much simpler than) the messaging services of BizTalk and what we're doing with long-running workflow services in .Net 4.0. Those two approaches both remote the storage of messages to a durable, commonly-located place apart from the application logic. I'll be interested to see how people take to using polling duplex, particularly in solutions that are transparent to the original service.
We've got a new article up on MSDN by Aaron Skonnard that covers the most recent preview release of the WCF REST Starter Kit. The download for REST expands on the capabilities we added in Orcas for building web services with REST protocols with features such as project templates, extensions to WCF, and extensions to HTTP. Aaron has been writing quite a lot of material for us and this isn't his first article on the REST capabilities of WCF. If you need more of an introduction to building REST services with WCF, then there are past guides on that topic as well.
The problem we saw last time was that a structural reduction for message fragments does not create a significant savings when the message is small. Although we are shaving a few bytes off of each element (the savings on closing an element is 2 bytes plus the length of the element name), the number of elements is small. On the other hand, there is a lot of boilerplate content in a SOAP message, which ends up dominating the size of the message. We can use that knowledge about the structure of the message to make the encoding more efficient.
Let's take another look at the first record we saw last time, the s:Envelope element record.
0x41, 0x01, 0x73, 0x08, 0x45, 0x6E, 0x76, 0x65, 0x6C, 0x6F, 0x70, 0x65
This element record has three common patterns in it that we can try to exploit.
The first common pattern is something that we're already doing without me having explained it. You'll notice that we've represented the lengths of each string in this example with a single byte. We'll frequently have very short strings like these but we will occasionally need to have longer strings as well. We don't want to pay for the maximum size of a value in every record though as we'll have a lot of these size fields throughout the message.
We are using the common encoding trick of a variable-sized integer here. We start without assuming how long the integer is. If the value is between 0 and 127, then we'll store it as a single byte as you'd expect. If the value is between 128 and 16383, then we'll set the high bit of the first byte to 1 and fill its other seven bits from the value. The second byte will have the remainder of the bits from the value. We can keep doing this expansion for more and more bytes by always using the high bit of each byte to say whether more bytes are coming or whether this is the last byte.
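Here's a small sketch of that kind of variable-sized integer. One detail is an assumption on my part: the text above doesn't say which seven bits go into the first byte, so this version writes the low-order seven bits first, which is the usual convention for this trick. The function names are made up for the example.

```python
def encode_varint(value):
    """Encode a non-negative integer seven bits per byte, low bits first."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        low_seven = value & 0x7F
        value >>= 7
        if value:
            out.append(low_seven | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_seven)         # high bit clear: this is the last byte
            return bytes(out)


def decode_varint(data, offset=0):
    """Return (value, next_offset) for the integer starting at offset."""
    value = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, offset
```

With this sketch the value 8 is the single byte 0x08, 127 is still the single byte 0x7F, and 128 becomes the two bytes 0x80, 0x01.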
The second common pattern is that it will be very common to have a short prefix for the element. Instead of storing that prefix as even a short string, we can use some of our record types to precompose the element record with the prefix. We don't have an unlimited number of record types so we can't do this for every prefix but we have done it for the lowercase letters "a" through "z" since it's very common to use these single-letter prefixes. These are the record types 0x5E through 0x77. For example, the same s:Envelope element record as above can shave off another two bytes by writing it as:
0x70, 0x08, 0x45, 0x6E, 0x76, 0x65, 0x6C, 0x6F, 0x70, 0x65
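To make the arithmetic concrete, here is a minimal sketch that builds this shorter record. The function name is invented, and it assumes the element name is short enough for its length to fit in one byte.

```python
# Record type for a prefix element with the prefix "a". The range 0x5E
# through 0x77 covers "a" through "z", as described above.
PREFIX_ELEMENT_A = 0x5E


def prefix_element(prefix, name):
    """Encode an element record whose one-letter prefix is in the record type."""
    if len(prefix) != 1 or not "a" <= prefix <= "z":
        raise ValueError("prefix must be a single lowercase letter a-z")
    encoded_name = name.encode("utf-8")
    if len(encoded_name) >= 0x80:
        raise ValueError("this sketch only handles single-byte lengths")
    record_type = PREFIX_ELEMENT_A + (ord(prefix) - ord("a"))
    return bytes([record_type, len(encoded_name)]) + encoded_name
```

For the prefix "s" the record type is 0x5E plus 18, which is 0x70. The two bytes saved are the prefix length and the prefix character that the plain element record had to carry.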
The third and final common pattern is that almost every SOAP message will have an element called Envelope in it. We'll spend the remainder of the series looking at static and dynamic string tables to intern these common names.
Now that you’ve gotten an introduction to the principles and capabilities of the binary encoding format, let’s jump into looking at some examples of messages to see how it works. Here’s a very short but inefficiently encoded binary message. We’ll see later how to make it quite a lot shorter.
0x41, 0x01, 0x73, 0x08, 0x45, 0x6E, 0x76, 0x65, 0x6C, 0x6F, 0x70, 0x65, 0x09, 0x01, 0x73, 0x27, 0x68, 0x74, 0x74, 0x70, 0x3A, 0x2F, 0x2F, 0x77, 0x77, 0x77, 0x2E, 0x77, 0x33, 0x2E, 0x6F, 0x72, 0x67, 0x2F, 0x32, 0x30, 0x30, 0x33, 0x2F, 0x30, 0x35, 0x2F, 0x73, 0x6F, 0x61, 0x70, 0x2D, 0x65, 0x6E, 0x76, 0x65, 0x6C, 0x6F, 0x70, 0x65, 0x01
This is admittedly unenlightening so let’s start parsing out the pieces of the message. Here’s the procedure I described earlier in part 2:
Each record starts with a one byte record type value. The record type byte is then followed by binary content of variable format and size based on the type. Each record in the stream of records translates into a document fragment. By concatenating all of the fragments produced from the record stream together we can obtain a document based on the original XML infoset.
The first byte in the message, 0x41, is the record type for an XML element. The format of an element record is the length of the prefix string, the bytes for the prefix, the length of the element name, and the bytes for the element name.
The next byte in the message, 0x01, is the length of the prefix, which means that 0x73 are the bytes for the prefix. If you go to your standard UTF-8 or ASCII character table, you’ll see that 0x73 is the byte sequence for the letter "s".
Similarly, the next byte in the message, 0x08, is the length of the element name, which means that 0x45, 0x6E, 0x76, 0x65, 0x6C, 0x6F, 0x70, 0x65 are the bytes for the element name. Translating that byte sequence into characters we get the letters "Envelope".
That means we’ve just processed a record that gives us the XML fragment "<s:Envelope". We don't know yet whether we've seen the entire XML element. For example, the following records could be attributes or just as easily could be the content within the element. We'll find out the context when we get there.
The next byte in the message, 0x09, is the record type for an XML namespace declaration. Recall that we used the prefix "s" with the envelope element so we do owe the reader a definition of that namespace. The format of a namespace record is the length of the prefix string, the bytes for the prefix, the length of the namespace string, and the bytes for the namespace.
The next byte in the message, 0x01, is the length of the prefix, which means that 0x73 are the bytes for the prefix and we indeed are now defining that prefix we saw earlier. Then, the next byte in the message is 0x27 indicating that we have 39 bytes of character data coming. That character data forms the string "http://www.w3.org/2003/05/soap-envelope".
The XML fragment for this record is "xmlns:s="http://www.w3.org/2003/05/soap-envelope"".
Finally, the next and final byte in the message, 0x01, is the record type for closing an element. There's no ambiguity about the element to close because the binary encoding only supports well-formed XML and we always know the most recently started element. Therefore, the end element record doesn't have any data.
The XML fragment for this record is "</s:Envelope>".
That was the last record in the message so we can put the fragments together to see that the message spells out a very common portion of a SOAP message:
<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope"></s:Envelope>
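The walkthrough above can be sketched as a tiny decoder. This is only an illustration that handles the three record types seen in this message. It assumes every length fits in a single byte and it does no escaping of the text it emits. The function names are made up for the example.

```python
ELEMENT = 0x41      # prefix length, prefix, name length, name
XMLNS = 0x09        # prefix length, prefix, namespace length, namespace
END_ELEMENT = 0x01  # no data

# The same bytes as the sample message, written with the strings spelled out.
MESSAGE = (bytes([0x41, 0x01, 0x73, 0x08]) + b"Envelope"
           + bytes([0x09, 0x01, 0x73, 0x27])
           + b"http://www.w3.org/2003/05/soap-envelope"
           + bytes([0x01]))


def read_string(data, pos):
    """Read a one-byte length and that many UTF-8 bytes; return (text, next_pos)."""
    length = data[pos]
    if length >= 0x80:
        raise ValueError("this sketch only handles single-byte lengths")
    start = pos + 1
    return data[start:start + length].decode("utf-8"), start + length


def decode(data):
    """Turn the record stream into the text of the XML document."""
    out = []
    open_elements = []      # names of elements still waiting for an end record
    start_tag_open = False  # True while attributes could still follow
    pos = 0
    while pos < len(data):
        record_type = data[pos]
        pos += 1
        if record_type == ELEMENT:
            if start_tag_open:
                out.append(">")
            prefix, pos = read_string(data, pos)
            name, pos = read_string(data, pos)
            qualified_name = prefix + ":" + name
            open_elements.append(qualified_name)
            out.append("<" + qualified_name)
            start_tag_open = True
        elif record_type == XMLNS:
            prefix, pos = read_string(data, pos)
            namespace, pos = read_string(data, pos)
            out.append(' xmlns:%s="%s"' % (prefix, namespace))
        elif record_type == END_ELEMENT:
            if start_tag_open:
                out.append(">")
                start_tag_open = False
            out.append("</" + open_elements.pop() + ">")
        else:
            raise ValueError("record type 0x%02X is not handled here" % record_type)
    return "".join(out)
```

Calling decode(MESSAGE) returns the same document text shown above. The stack of open element names is what lets the end element record get away with carrying no data.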
It would be nice to not have to carry all of that character data because the same strings are going to be in virtually every SOAP message that is sent. There was virtually no size advantage to using the binary format in this case as compared to a text encoding. Next time we'll look at how to squeeze down the size of the message.
Another CTP release is coming for the .Net Services cloud offering in October and again there are some changes to their existing web service offerings. The team is changing their focus from primarily SOAP-based services to REST-based services and simplifying the product and protocols.
It’s not specifically mentioned but I’m going to assume that they’ll continue to be based on WCF 3.5 until after we release .Net 4.
Here’s a look at what’s being added as well as what’s being removed from their existing service offerings:
The complete announcement for the October 2009 .Net Services CTP is on their team blog.
Today I’ll talk about the XML features that are and aren’t supported by the binary encoding format we use in WCF.
The binary format was designed for a specific purpose: round-tripping the XML infoset being manipulated in memory, as opposed to round-tripping rendered XML documents. As a result, several features that are only relevant at the level of a rendered document are omitted. Similarly, features that differ significantly from other features only in rendered documents are omitted to canonicalize the representation.
Here’s the general list of XML features that are not supported:
That leaves almost every other XML feature you might think of as supported by one record type or another. The list includes structural features, such as elements, attributes, namespace declarations, and comments. The list also includes content features, such as booleans, integers, floating-point numbers, fixed-point numbers, strings, dates, time spans, byte arrays, guids, unique identifiers, and qualified names.
The encoding tricks of the binary format come primarily from three things: the choice of supported record types, variable-sized integers that reflect that most of the needed values are small, and numerical references to interned strings in place of repeating the contents of the string each time it is used. Going over some examples of records next time should illustrate these common features.
The fifth preview of the new ASP.NET AJAX features for future versions of the framework is now available. The AJAX release consists of server-side controls and a programming model for building ASP.NET applications that offer functionality through AJAX requests, plus client-side scripting libraries for using AJAX applications in a variety of web browsers. This release includes past preview features, such as integration with web services, as well as new preview features such as templating support.
The single download includes samples, reference documentation, and release notes.
The binary format we developed is based on a tokenized stream of records and a few Huffman-like coding strategies. Each record starts with a one byte record type value. The record type byte is then followed by binary content of variable format and size based on the type. Each record in the stream of records translates into a document fragment. By concatenating all of the fragments produced from the record stream together we can obtain a document based on the original XML infoset. It’s relatively simple compared to many binary XML formats while still being highly expressive.
Here are the main properties of interest:
The Windows Management Framework has put out a release candidate porting some of the management features in Windows 7 to versions of Windows from Windows XP to Server 2008. The management framework includes PowerShell 2.0 and the Windows implementation of WS-Management, which is a SOAP-based protocol for accessing and exchanging management information.
This release candidate was previously only available for Windows Vista and Server 2008, but is now available for Windows XP and Server 2003 as well.
This series on the .Net Binary Encoding protocol is going to be similar to the earlier series I did on .Net Message Framing. The two are also somewhat related as they’re used together frequently and the message framing protocol has direct knowledge of some options for binary encodings. The two are also used apart regularly though.
I’ll first go over some background of what the binary encoding protocol does and why we created it. Then, I’ll have a few parts covering the details of the protocol. The binary encoding protocol series is a bit shorter than the one on message framing because it has fewer concepts, but there are still a number of challenging sections and I’ll have a bit more background to get through.
One of the originally intended properties of XML was to have a human-readable and self-describing format. A substitute for human-readable might also be to allow for programs that consume data without using the originally intended decoding program or even perhaps long after all of the software originally written for the data has been lost. It’s debatable how well XML achieves those properties in practice, but it’s certainly true that well-formatted XML is quite a lot more readable than many forms of unstructured data or data whose content is obscured by its encoding.
In some cases though, an XML document is exchanged, processed, and then thrown away with no intention of ever being read by a person or preserved for the future. In these cases, the typical text-based encoding of XML becomes more of a liability than a benefit as the space and processing power required to encode the document greatly exceed those of traditional binary or record-oriented protocols.
There’s no requirement that an XML document have a text-based encoding though, and a variety of binary XML protocols exist as a compromise between the two extremes. No one binary XML protocol appears to be winning in the market and there does not appear to be a convergence of the proposed options over the last ten years of discussion (the usually recognized standards bodies have each selected their own format).
Partially, this market fragmentation exists because it’s hard to design a binary XML protocol that optimizes for all of the large variety of needs for such a protocol. For example, a bandwidth-constrained application might require the smallest encoded document size while a mobile device application might require low power processing. While the mobile application might also appreciate a small document size, the computational power required to compress a document increases greatly as you approach higher levels of compression.
We invented our own binary XML protocol with the intention of servicing a variety of needs in WCF. I doubt that everyone or even most people would want this to be the only binary XML protocol used but I do think some will find it useful even outside of WCF. Next time I’ll go over the basic idea of the protocol and what our needs were.
Over on Channel 9 George Moore has a video talking about the commercialization and billing aspects of Azure cloud services that is aimed at developers. In particular, the video covers the background and basics of all the things that developers don’t need to know about because of the way the system is structured to avoid putting billing into the face of developers. This model is used across all of the Azure cloud services, including for Windows, SQL, and .Net services.
If you’ve tried using the HttpListener API to build a web server, then you may have noticed that many runtime errors come back as wrapped Win32 errors rather than different exception types. Since HttpListener doesn’t say what specific Win32 errors might occur and the underlying HTTP Server API mostly points to the list of 15,999 kinds of errors defined by WinError.h, it can be a bit of a detective job figuring out what errors might occur and can safely be handled.
I happen to have come across some notes I made a few years ago while working on our HttpListener based web server for WCF. I don’t know how many of these have changed since then or how many were even correct in the first place, but here you go for entertainment purposes only. Included is the WinError.h definition of the error code and some additional notes where applicable.
Out of memory errors- you probably shouldn’t try to recover after these as something is seriously wrong.
Less fatal, but possibly still bad, errors.
Here are the past articles in the series to get up to date:
The last part in this series is to bring the history of the named pipe up to the named pipe implementation in WCF.
Although the .Net framework now has a named pipe implementation in System.IO.Pipes, the WCF named pipe implementation actually precedes that by a few years (this is why there is a type System.IO.PipeException that comes in System.ServiceModel.dll). That means we directly sit on top of the native Windows named pipe implementation I talked about last time, such as the CreateNamedPipe function in kernel32.dll. Named pipes are probably our second largest source for using native methods after the queuing functions in mqrt.dll used by the MSMQ channel.
The basic building blocks that we use for a named pipe implementation are:
The named pipes in WCF are intended for use as a fast, local machine communication mechanism. Although a Windows named pipe can be used to connect to other machines, we use the security permissions of the pipe to deny network connections. This is done by denying the special well-known Network SID S-1-5-2 (there was previously an option on CreateNamedPipe to do this directly but that was deprecated in favor of the current approach). From time to time you might see this cause a problem with a service that bridges external communication from one service to talk to a second service over a named pipe. The Network SID is added to the process token when a user logs on across a network, preventing that logon from later talking across one of our named pipes.
As you might have guessed from the supported channel shapes and the information above, we use bidirectional pipes with overlapped IO operations. Both reads and writes are done using the pipe stream in message mode. These messages don’t equate to the messages you think of in your application (that’s what the higher-level message framing is for) but instead exist to keep exchanges along the pipe coherent.
Today is a US national holiday, which generally means I don’t put up a post due to the drastically reduced number of readers. Here’s a look ahead at some of the topics coming this week though:
And, for those of you not in the US or Canada, here’s a short post to tide you over today.
Is it possible for an HTTP channel to fault?
The HTTP transport channel functions as a datagram channel. That means it effectively is a singleton implementation that should almost never fail. There are a few things that could go wrong to make an HTTP channel fault, but you still might not necessarily do anything about that. For example, a faulting HTTP channel could be caused by a bug in your code or a bug in our code (typically manifested as an InvalidOperationException). It’s probably best for your program to die so that the bug can be found and fixed.
There are also a few edge cases where even a singleton HTTP channel can run into permanently fatal trouble. There are universally fatal conditions, such as running out of memory or the app domain blowing up, but there are also a few fatal incidents unique to HTTP. One example on the client side is that the channel might fail to open when authentication is being used because the source of security tokens was unresponsive. An example on the server side is that the underlying HTTP listening service might have fallen down (in IIS hosting this probably would have killed the process before getting to us because the application process and HTTP listener are part of the same system but in self hosting the two are separate and can live independently).
We’ve got a new round of talks uploaded to the PDC site if you’ll be attending the conference.
Ed Pinto is signed up to do a talk on the new features in WCF 4, including the topics of discovery and routing that were asked about in response to my PDC survey earlier.
What’s New for Windows Communication Foundation 4 by Ed Pinto
Learn about the investments made in Windows Communication Foundation 4 that add new capabilities for service composition and reduced configuration and deployment complexity. Discover how improvements to configuration, monitoring, and deployment are enhanced by Microsoft project code name "Dublin". See how the Routing Service makes it easier to build sophisticated intermediaries and how support for WS-Discovery adds flexibility to your services infrastructure. Gain insight into the improved authoring experience for REST services applications including new support for caching, multiple formats, and fault handling.
We’ll keep workflow services separate in their own talk with Mark Fussell.
Workflow Services and “Dublin” by Mark Fussell
Learn how to use Windows Workflow Foundation (WF) 4, Windows Communication Foundation (WCF) 4, and “Dublin” to build and manage scalable, reliable, and highly-available applications. Discover the power of WF to build and coordinate WCF services and implement logic on the middle tier. Enable sophisticated messaging patterns with correlation, enhanced transaction support, durable services, and config-based activation. Learn how "Dublin" makes it easier to deploy, manage, and monitor WCF and WF applications.
There’s also a talk planned on WCF services as seen in Silverlight.
Networking and Web Services in Silverlight
This session will present an overview of how to expose data to a Silverlight application by accessing SOAP WCF services and REST services. In the WCF space, we will cover Silverlight 3 approaches for securing services and improving their performance and maintainability. We will also cover a specific message pattern called server push, which allows you to implement scenarios such as email clients and real-time chat. In the REST space, we will walk through the Silverlight 3 client HTTP stack and new functionality it offers around HTTP verbs, headers, responses, and cross-domain access and talk about future plans for networking and web services in Silverlight.
Finally, WCF will be appearing in a few applied talks by some groups that integrate with it for communication.
Windows Identity Foundation Overview by Vittorio Bertocci
Hear how Windows Identity Foundation makes advanced identity capabilities and open standards first class citizens in the Microsoft .NET Framework. Learn how the Claims Based access model integrates seamlessly with the traditional .NET identity object model while also giving developers complete control over every aspect of authentication, authorization, and identity-driven application behavior. See examples of the point and click tooling with tight Microsoft Visual Studio integration, advanced STS capabilities, and much more that Windows Identity Foundation consistently provides across on-premise, service-based, ASP.NET and Windows Communication Foundation (WCF) applications.
Accelerating Applications Using Windows HPC Server 2008
Learn how to accelerate your applications by multiple orders of magnitude using Windows Communication Foundation (WCF), Microsoft Excel, and Windows HPC Server 2008. See how easy it is to offload the calculations from a desktop application to an HPC Server Cluster using the HPC SOA programming model, with emphasis on performance tuning best practices.
For those of you that were interested in talks on WCF internals or making better use of WCF, have you seen my PDC talk from last year on WCF performance? What are other topics that you’d be interested in for existing features and versions of WCF? If there’s a lot of interest in a specific topic I’ll try to figure out a way for us to get you content on it.
Silverlight 3 has been updated with a small set of bug fixes and support for the GB18030 code page for developing Chinese localized applications. A GDR (general distribution release), unlike a hotfix update that you might get from product support, is intended for general use but is much more narrowly scoped and often much smaller in scale than the more widely known service pack releases.
As you might guess from the title, this is actually the second GDR update for Silverlight 3. I didn’t mention the first GDR because it was a runtime only update (it primarily was to fix a problem with media buffering). I am mentioning the second GDR because there are some small updates to the Silverlight 3 SDK that go along with it. The Visual Studio tools are not updated in this release.
If you need to check the version numbers of the various releases to see what you have, they are:
The other day I noticed Brad Abrams plugging the book Silverlight 3 Jumpstart (I’ve only flipped through it so I don’t have a personal recommendation on whether new Silverlight developers should get it). The thing that struck me about the Jumpstart book though is that it is very short for a book, just over 200 pages, while still quite a lot longer and more organized than what you’ll find from most online resources.
The basic flow of the text is like this:
That’s pretty much the whole book. You could read through it in one sitting if you wanted to. The average WCF book is 600 pages but covers far more topics at far greater depth, even at the introductory level.
Are books such as the Jumpstart book a missing niche for learning WCF?
Do you think that other resources, such as documentation, blogs, online whitepapers, and long format books, have helped you learn WCF more easily than a short format book would?
In the .Net framework there are a number of notions of versioning that we worry about for a release. Some of the traditional notions of versioning are backwards compatibility (the ability for your old programs and data to continue working in the same way on newer versions) and forwards compatibility (the ability for your old programs to handle features and data from newer versions).
There are also some other notions of versioning, including some new notions of versioning that we’ve been focusing on for .Net 4.
Side by side is the ability to install and run two different versions at the same time. The .Net framework has previously supported side by side within a machine. It was possible to install and use .Net 1 applications at the same time as .Net 2 applications. In .Net 4 I’ve mentioned in the past additional support for side by side within a process. There were some surprising complications here for WCF, where you might think decoupled services makes side by side execution simple, because we handle a lot of data serialization that subtly, or sometimes not subtly, differs between versions.
Multitargeting is the ability to use the most recent version of the tools to build applications for either the newest version or older versions of the framework. Visual Studio 2008 had limited support for multitargeting and this is much improved in Visual Studio 2010 (this again turned out to be more work than you might expect for a relatively tool-light framework). Scott Guthrie has a recent article on Visual Studio 2010 multitargeting that you might find interesting if you want to see plenty of examples.