The importance of Content-Type
One of the amazing things about the web is that it's a vast interconnected graph of heterogeneous data that can be traversed in a generic way.
Half of this story is enabled through a uniform interface for moving the bits around -- URIs and HTTP methods. The other half is about uniform formats for entity bodies so systems can process data once it gets to them. One without the other is like a yin without a yang.
One of the nice things about the web is that the base mechanism for moving bits around (URIs and HTTP methods) doesn't much care about the format those bits conform to. Rather, there's a general notion of metadata (MIME type) and a well-known location for finding that metadata (the HTTP Content-Type header). And that turns out to be enough to at least bootstrap the whole system.
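To make the "well-known location" idea concrete, here is a minimal sketch in Python of pulling the MIME type and its parameters out of a Content-Type header value. The helper name `parse_content_type` is made up for illustration; it leans on the standard library's `email.message.Message`, which already understands MIME-style headers.

```python
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type header value into (media_type, params).

    Hypothetical helper for illustration only.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() returns the lowercased "type/subtype" part.
    media_type = msg.get_content_type()
    # get_params() returns [(media_type, ""), (name, value), ...];
    # skip the first entry, which is the media type itself.
    params = dict(msg.get_params()[1:])
    return media_type, params


media_type, params = parse_content_type(
    "application/atom+xml; type=feed; charset=utf-8"
)
print(media_type, params)
```

Nothing here depends on what the body actually contains, which is the point: the receiver learns the format from the header before it touches a single byte of the payload.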
If you look at the MIME types in use today, some of them are pretty concrete:
- text/html - the standard format for describing how to render a user experience
- application/xhtml+xml - another vocabulary for talking about UX
- image/* - a bunch of concrete formats for getting pixels to appear on the screen
While there are many concrete formats, there are also many abstract meta-formats:
- text/xml and application/xml - Good old POX. If you get one of these content types, you know that the abstract data model is fixed but the concrete token space (the set of element qnames) is unspecified.
- application/json - similar to XML, except the data model is atom/record/sequence instead of element/attribute
- application/atom+xml - constrains POX to list structures and guarantees some well-known metadata about items in the list.
- application/soap+xml - constrains POX by adding a notion of "headers" and "body"
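As a rough illustration of what "fixed data model, unspecified vocabulary" means, the Python sketch below parses the same made-up order as XML and as JSON. The element and field names (`order`, `item`, `id`, `items`) are invented for this example; the generic parsers know nothing about them and hand back only the shapes each meta-format defines.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical data in two meta-formats.
xml_doc = '<order id="42"><item>tea</item><item>milk</item></order>'
json_doc = '{"id": 42, "items": ["tea", "milk"]}'


def describe_xml(text):
    """Return the element/attribute view: (tag, attributes, children)."""
    root = ET.fromstring(text)
    children = [(child.tag, child.text) for child in root]
    return root.tag, dict(root.attrib), children


def describe_json(text):
    """Return the atom/record/sequence view: plain dicts, lists, scalars."""
    return json.loads(text)


print(describe_xml(xml_doc))
print(describe_json(json_doc))
```

Note that the XML attribute comes back as the string "42" while the JSON value is the number 42: the data models differ even when the information is the same.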
There's also the granddaddy of them all - application/octet-stream, which says you're allowed to interpret the following stream of bytes as just that -- a stream of bytes.
MIME types (and support for pluggable handlers for specific MIME types) are what enable the web to deal with heterogeneous data without constraining the set of possible data types that might be exchanged.
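One way to picture pluggable handlers is a small registry keyed by MIME type. The Python sketch below is an assumption-laden illustration, not how any particular stack does it: the names `register`, `candidates`, and `dispatch` are made up, and the fallback order (exact type, then the "+suffix" meta-format, then a wildcard such as image/*, then application/octet-stream) is just one plausible policy.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical registry mapping a MIME type to a handler function.
HANDLERS = {}


def register(media_type):
    """Decorator that plugs a handler in for one MIME type."""
    def decorator(func):
        HANDLERS[media_type] = func
        return func
    return decorator


@register("application/json")
def handle_json(body):
    return json.loads(body.decode("utf-8"))


@register("text/xml")
@register("application/xml")
def handle_xml(body):
    return ET.fromstring(body)


@register("application/octet-stream")
def handle_bytes(body):
    # The catch-all: a stream of bytes is just a stream of bytes.
    return body


def candidates(media_type):
    """Yield MIME types to try, from most to least specific."""
    media_type = media_type.lower()
    yield media_type
    maintype, _, subtype = media_type.partition("/")
    if "+" in subtype:
        # e.g. application/atom+xml falls back to application/xml
        yield "application/" + subtype.rsplit("+", 1)[1]
    yield maintype + "/*"
    yield "application/octet-stream"


def dispatch(media_type, body):
    """Hand the body to the first registered handler that matches."""
    for candidate in candidates(media_type):
        handler = HANDLERS.get(candidate)
        if handler is not None:
            return handler(body)
    raise LookupError("no handler registered for " + media_type)
```

With this in place, an application/atom+xml body is parsed by the generic XML handler even though nobody registered Atom specifically, and an image/png body with no image handler comes back as raw bytes. New formats are supported by registering a handler, not by changing the dispatcher.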
Formats and the Web Programming Model
WCF interacts with data formats at a couple of different layers in the stack -- both at the encoder level and serializer level (I wrote a long time ago about the difference between the two). The Web Programming Model has new features at both levels to deal with all sorts of different data formats.
At the binding layer, the WebHttpBinding can read and write three different kinds of data:
- XML
- JSON (requires Orcas -- not in the CTP)
- Opaque binary streams
Practically speaking, this means we can handle any type of data you throw at us. Worst case, you end up programming against System.IO.Stream. The PhotoFeeds sample in the BizTalk Services SDK has a nice example of this at work.
At the programming model layer, one of the big new features is the managed object model for Atom + RSS -- this makes it easy to produce and consume syndicated data from managed code. You can use the object model stand-alone without WCF, or compose it with [WebGet]/[WebInvoke] for networking.
There are lots of other, more "behind the scenes" features that I'll cover in due time.
