Zen of the Web Programming Model (Part 1)

In our Mix talk, Don joked about how we were taking off the 'philosopher's robes' when it came to designing this feature set. If you're waiting for the "Four Tenets of REST Orientation" to come out of this work, I think you'll be waiting for quite a while. There are no tenets or orientations here -- just some practical stuff based on what people are doing on the web today. We've made some observations about what people are struggling with on the platform as it is now, identified their pain points, and hopefully given them some aspirin to make their coding lives simpler and more productive.

That said, I would get thrown out of CSD if we didn't have a few guiding technical principles backing up what we are doing. Also, I would hate to deprive you of the quasi-mystical experience you've come to expect from the Indigo team, so I'll posit that all of the theory you need to know about our Web features is contained succinctly in the following paragraph (encoded in haiku for your pleasure (well, probably more mine than yours)):

with GET and INVOKE,
it's the URI, stupid.
yes, formats matter

The center of the haiku is about URI's, which also happen to be the center of the model (convenient, that) so that seems like a reasonable starting point.

It's the URI, Stupid

One thing is pretty clear: if you take URI's out of the web, the whole thing falls over.

URI's have some nice properties -- transcribability (the ability to be easily written down) being chief among them. The URI RFC goes into some depth on this feature and how it enables even the low end 'cocktail napkin' style of URI transmission between two humans. This has led to a couple of things -- number one, URI support is pretty ubiquitous throughout the application stack. URI's are supported directly in browsers and file system explorers, and URI-based mechanisms are the de facto app-level metaphor for referencing one piece of information from within another (I know I definitely use Word's Add Hyperlink feature a lot more than I use OLE these days). URI's have become so ubiquitous that folks are starting to optimize away the need for transmission-by-cocktail-napkin by designing systems around URI's that are easy for people to reason about and remember directly. They leverage the hierarchical nature of URI paths to create logical URI layouts where the hierarchy of the URI path reflects the organizational structure of the underlying information space.

Although the URI RFC and W3C Web Arch are fairly prescriptive about the intended usage of the URI, in the wild people end up using them in a lot of different ways. The usage spectrum runs the gamut from "Pure Opaque Identifier" at one end to "Eval-able structured expression" at the other. A URI like http://www.microsoft.com/downloads/details.aspx?FamilyID=5d9c6b2d-439c-4ec2-8e24-b7d9ff6a2ab2&DisplayLang=en is a good example of the URI-as-GUID school; although it may have some internal structure, you're much better off if you just treat it as an opaque sequence of bytes and don't spend too much time trying to understand where it came from or what it means. Contrast that with a URI like http://flickr.com/search/?q=Bill+Gates&page=2 which, if you squint at it hard enough, could almost be seen as

       (skip-to-page (search "Bill Gates") 2)

The similarity is much deeper than simple syntactic transformation -- there's a whole remote evaluation motif embedded here.

Given that I know http://flickr.com/search/?q=Bill+Gates&page=2 points to the second page, I'm pretty confident I can come up with a URI that gets me page 3 of a search on 'Steve Ballmer' without having to read any docs or asking the server to construct it for me.
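To make that guess concrete, here is a minimal sketch of the client-side reasoning. The FlickrSearchUri class and its Build method are hypothetical names made up for illustration; the only thing taken from the text is the shape of the URI quoted above, and the sketch assumes that shape still holds for other queries and pages.

```csharp
using System;

static class FlickrSearchUri
{
    // Hypothetical helper: rebuilds the search URI shape quoted in the text,
    // i.e. (skip-to-page (search query) page).
    public static string Build(string query, int page)
    {
        // Percent-encode the query, then write spaces as '+' to match
        // the style of the quoted URI (q=Bill+Gates).
        string q = Uri.EscapeDataString(query).Replace("%20", "+");
        return "http://flickr.com/search/?q=" + q + "&page=" + page;
    }

    static void Main()
    {
        // prints http://flickr.com/search/?q=Steve+Ballmer&page=3
        Console.WriteLine(Build("Steve Ballmer", 3));
    }
}
```

Nothing here talks to the server; the point is only that the URI's structure is regular enough for a client to extrapolate from one example.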

Of course, there are URI systems that fall in between the two extremes -- take http://www.microsoft.com/silverlight/. Can you treat that as an opaque identifier? Of course. But then again, I bet you could figure out what the URI for 'Windows' was without resorting to a search engine.

No matter where they fall on the GUID <-> sexpr spectrum, we've heard our customers tell us that they want their URI's to be more than just something they bolt onto the system at deployment time. Rather, they want to think about the URI structure of their system as being a primary thing and are finding that building server systems that handle a particular URI space in the way that they want is rather hard.

One of the challenges in building such systems is that the URI space for a given service is practically unbounded. The set of all URI's a server might process may be data dependent (consider the case where you give every customer in your system a unique URI), which poses something of a notation problem. How do we write down an infinite set of URI's?

Common practice on the web today is to use some sort of templating syntax to punch holes into the URI at strategic places (much the way String.Format lets you punch holes into an arbitrary string at strategic places). Our URI template syntax looks like this:


The bits in {curlyBraces} are variables, which means that a URI Template is an efficient notation for an (unbounded) set of structurally equivalent URI's.

As part of the BizTalk Services SDK (and ultimately part of .NET 3.5) we're providing a new API for dealing with URI templates called (intuitively enough) System.UriTemplate. There are a couple of interesting things you can do with a UriTemplate once you have one:

  • You can call Bind() with a set of parameters to produce a fully closed URI that matches the template.
  • You can call Match() with a candidate URI, which extrudes the candidate URI through the template to give you back a lexical environment (a.k.a. a dictionary) with all data that poked through the holes in the template.

Bind() and Match() have this nice property of being inverses, so that you can call Match( Bind( x ) ) and come back with the same environment you started with[1]. The code sample below gives the basic gist of this API:

       Uri baseAddress = new Uri( "http://localhost:81" );
       string artist = "Led Zeppelin";
       string album = "Four";

       UriTemplate template =
           new UriTemplate( "music/{artist}/{album}?format={format}" );
       Uri boundUri = template.BindByPosition( baseAddress, artist, album, "rss" );

       // boundUri:
       // http://localhost:81/music/Led%20Zeppelin/Four?format=rss

       UriTemplateMatch match = template.Match( baseAddress, boundUri );
       Debug.Assert( match.BoundVariables["artist"] == artist );
       Debug.Assert( match.BoundVariables["album"] == album );
       Debug.Assert( match.BoundVariables["format"] == "rss" );
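The sample above binds by position. As a complementary sketch, the same template can be bound from a named set of parameters. This assumes the API surface as it shipped in .NET 3.5 (UriTemplate.BindByName taking a NameValueCollection, with a reference to System.ServiceModel.Web); the SDK preview may differ. The BindByNameSketch class and BindAlbum method are hypothetical names for illustration.

```csharp
using System;
using System.Collections.Specialized;

static class BindByNameSketch
{
    // Same template as the sample above.
    static readonly UriTemplate Template =
        new UriTemplate("music/{artist}/{album}?format={format}");

    // Hypothetical helper: fills the holes from a name -> value collection.
    public static Uri BindAlbum(Uri baseAddress, string artist, string album, string format)
    {
        NameValueCollection parameters = new NameValueCollection();
        parameters.Add("artist", artist);
        parameters.Add("album", album);
        parameters.Add("format", format);
        return Template.BindByName(baseAddress, parameters);
    }

    static void Main()
    {
        Uri baseAddress = new Uri("http://localhost:81");
        Uri bound = BindAlbum(baseAddress, "Led Zeppelin", "Four", "rss");
        Console.WriteLine(bound.AbsoluteUri);

        // Round trip: Match() should hand back the values that went into the bind.
        UriTemplateMatch match = Template.Match(baseAddress, bound);
        foreach (string name in match.BoundVariables.AllKeys)
        {
            Console.WriteLine(name + " = " + match.BoundVariables[name]);
        }
    }
}
```

Binding by name keeps the call site independent of the order in which the holes appear in the template, which may matter if the URI layout changes later.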

There are many times (especially on the server, where you need a reasonable way of doing URI-centric dispatch) that you want to keep track of a set of UriTemplates in a data structure that has dictionary-like semantics. In our world, that's what System.UriTemplateTable is for. It lets you select the best match given a set of templates and a candidate URI. This is basically URI-dispatcher-in-a-box -- and it's not coupled to any particular networking stack (WCF included) so you can use it wherever you want.
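As a rough sketch of that dispatch idea, the snippet below registers two templates and asks the table which one a candidate URI selects. It assumes the UriTemplateTable members as they shipped in .NET 3.5 (KeyValuePairs, MakeReadOnly, MatchSingle, and UriTemplateMatch.Data); the MusicDispatcher class and the handler names "ShowArtist", "ShowAlbum", and "NotFound" are hypothetical placeholders.

```csharp
using System;
using System.Collections.Generic;

static class MusicDispatcher
{
    static readonly Uri BaseAddress = new Uri("http://localhost:81");

    // Build a table that associates each template with a piece of data,
    // here just a handler name.
    public static UriTemplateTable BuildTable()
    {
        UriTemplateTable table = new UriTemplateTable(BaseAddress);
        table.KeyValuePairs.Add(new KeyValuePair<UriTemplate, object>(
            new UriTemplate("music/{artist}"), "ShowArtist"));
        table.KeyValuePairs.Add(new KeyValuePair<UriTemplate, object>(
            new UriTemplate("music/{artist}/{album}"), "ShowAlbum"));

        // Freeze the table; 'false' rejects structurally equivalent templates.
        table.MakeReadOnly(false);
        return table;
    }

    // Return the handler name for a candidate URI, or "NotFound" if no template matches.
    public static string Dispatch(UriTemplateTable table, Uri candidate)
    {
        UriTemplateMatch match = table.MatchSingle(candidate);
        return match == null ? "NotFound" : (string)match.Data;
    }

    static void Main()
    {
        UriTemplateTable table = BuildTable();
        Console.WriteLine(Dispatch(table, new Uri("http://localhost:81/music/Led%20Zeppelin/Four")));
        Console.WriteLine(Dispatch(table, new Uri("http://localhost:81/music/Led%20Zeppelin")));
        Console.WriteLine(Dispatch(table, new Uri("http://localhost:81/videos/1")));
    }
}
```

In a real server the data attached to each template would more likely be a delegate or a handler object than a string; a string keeps the sketch short.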

I'll give UriTemplate and UriTemplateTable a much deeper treatment in the next couple posts. For now:

uri templates
are just URI's with holes.
behold, .Bind() and .Match()!

Tomorrow is all about verbs.

[1] That's not always true -- there are evil characters that you can put in the variables you pass to Bind() (e.g. slashes) that System.Uri will special-case the escaping for, which will cause this invariant to fail. But unless you're pathological, this will hold in all the cases you care about.

#1 Haacked on 5.08.2007 at 1:00 AM

Hi Steve. Is there going to be a way to "type" the holes? For example, I might have the following two templates:

"/archives/{year}/{month}/{id}/"
"/archives/{year}/{month}/{friendly-name}/"

When a URL comes in like so:

/archives/2007/01/123/

how will the template matching know that 123 should match the first template and not the second? The reason I might want this is that I might have another dictionary lookup given the matched template which sends the URL to the right place.

Obviously I could just do this in my application code. I was just wondering how far these templates go.

#2 Steve Maine on 5.08.2007 at 11:03 AM

There's a much easier way to do what Jef's trying to do: hyperthink.net/.../A+Brief+Aside+W

#3 Steve Maine on 5.08.2007 at 5:07 PM

Phil -- I'll cover this more in my post on UriTemplates, but the short answer is that if both templates are present in a template table, both templates will match.

#4 Marc Brooks on 6.21.2007 at 4:45 PM

For those who wonder about an implementation of UriTemplate and UriPattern that you can use right now, I've written a simple open-source implementation for .Net (1.1 or 2.0) that you can read about here: musingmarc.blogspot.com/.../uritemplate-pro