This word has been tickling my head as I read through several of the recent web services threads -- the 'where's the real contract' thing, the 'SOAP is just an RPC protocol with angle brackets' thing, the proper role of tools in this whole equation, and even the entire foundation of the SOAP/REST debate. Bear with me a little bit here, because I promise I will tie this back to web services eventually.
A few definitions to start things off
Isomorphism is a $10 mathematical word that has been on my mind lately. Webster's defines it thusly:
1 : the quality or state of being isomorphic : as a : similarity in organisms of different ancestry resulting from convergence b : similarity of crystalline form between chemical compounds
2 : a one-to-one correspondence between two mathematical sets; especially : a homomorphism that is one-to-one -- compare ENDOMORPHISM
However, the less technical definition on Wikipedia is better for what I have in mind:
If there exists an isomorphism between two structures, we call the two structures isomorphic. Isomorphic structures are "the same" at some level of abstraction; ignoring the specific identities of the elements in the underlying sets, and focusing just on the structures themselves, the two structures are identical. Here are some everyday examples of isomorphic structures.
- A solid cube made of wood and a solid cube made of lead are both solid cubes; although their matter differs, their geometric structures are isomorphic.
- A standard deck of 52 playing cards with green backs and a standard deck of 52 playing cards with brown backs; although the colours on the backs of each deck differ, the decks are structurally isomorphic — if we wish to play cards, it doesn't matter which deck we choose to use.
- The Clock Tower in London (that contains Big Ben) and a wristwatch; although the clocks vary greatly in size, their mechanisms of reckoning time are isomorphic
The Canonical Form
The undercurrent of isomorphism is that we have many different ways of talking about the same thing, all of which are basically equivalent on the level that we're interested in. When discussing sets of isomorphically related things, sometimes it's useful to bring in the idea of canonical form -- sort of a default representation that makes thinking about these sets of things a lot easier. There's nothing intrinsically superior about the canonical form (it's no more equal than any of the things it's isomorphic to); it's just an agreed-upon way of talking about a whole class of essentially equal things without introducing a lot of excessive verbiage into the discussion. One of the nice things about canonical forms is that you're free to think in whatever isomorphic system is to your own liking. Only when discussing things with others does the canonical form become an aid; it's a whole lot easier to use the standard language, and because everyone else also understands the canonical form, there's a much better chance they'll understand what you're talking about.
Distributed computation is a domain rich with isomorphism. There are many different ways we can think about distributed interactions between computational systems. For example, RPC and Messaging have already been shown to be isomorphic models of the same thing. There are similar dualities between "messages sent to a stateful service" and "methods called on a stateful object". All of these ideas are just attempts to build a conceptual model around the interactions between distributed systems. Unfortunately, each of these thought-models carries with it a certain amount of implicit semantic baggage, and that fact has really hampered the development of scalable, widely interoperable distributed systems to date.
The big idea being thrown out right now is that we use (SOAP) message exchange as the canonical form for modeling distributed system interactions. This idea now has a name - MEST (as well as a somewhat unfortunate homonym). There are many other isomorphic ways of thinking about distributed computation, and implementers are free to think of the world in whatever way seems best to them. But when it comes time to talk with others, it helps to fall back to the canonical form because there's a shared understanding of what the canonical form entails [1].
In search of the 'real' contract
The idea of canonical form comes into play when you think about contracts and what those contracts describe. For example, let's say that I've built an e-commerce service that you want to consume. I give you some WSDL/XSD/Policy files that describe my service in various ways. However, what I'm really trying to communicate to you is that when you send me this SOAP message:
<S:Envelope xmlns:ht="http://schemas.hyperthink.net/" ... >
  <S:Header>
    ...
    <wsa:To>http://hyperthink.net/xml</wsa:To>
    <wsa:Action>urn:GetCart</wsa:Action>
    <wsa:MessageId>uuid:123...</wsa:MessageId>
    <wsa:ReplyTo> ... </wsa:ReplyTo>
    <ht:Service>CommerceService</ht:Service>
    <ht:SubscriptionId>12ABC</ht:SubscriptionId>
    ...
  </S:Header>
  <S:Body>
    <ht:CartId>47</ht:CartId>
    <ht:HMACToken>[blah]</ht:HMACToken>
  </S:Body>
</S:Envelope>
Some time later, you'll get back a message that looks like this:
<S:Envelope xmlns:ht="http://schemas.hyperthink.net/" ... >
  <S:Header>
    ...
    <wsa:To>http://you.com/</wsa:To>
    <wsa:Action>urn:CartGetResponse</wsa:Action>
    <wsa:MessageId>uuid:974...</wsa:MessageId>
    <wsa:RelatesTo>uuid:123...</wsa:RelatesTo>
    <ht:RequestProcessingTime>1</ht:RequestProcessingTime>
    ...
  </S:Header>
  <S:Body>
    ...
    <ht:Cart>
      ...
      <CartItem>
        <CartItemId>U13XQPNTY5WOEJ</CartItemId>
        <MerchantId>ATVPDKIKX0DER</MerchantId>
        <Quantity>1</Quantity>
        <Title>Nonzero : The Logic of Human Destiny</Title>
        <ProductGroup>Book</ProductGroup>
        <Price>
          <Amount>10.5</Amount>
          <CurrencyCode>USD</CurrencyCode>
          <FormattedPrice>USD10.5</FormattedPrice>
        </Price>
      </CartItem>
      ...
    </ht:Cart>
  </S:Body>
</S:Envelope>
If you want to talk about what the 'real' contract is, that's it right there. Considering that this is but one possible instance of many similar message exchanges, describing all of them explicitly would be arduous. It's good to have description languages, but they're still just abstractions over the SOAP messages they describe. It's also good to have a canonical form for description languages (i.e. WSDL/XSD/Policy). However, I could describe the exact same message exchange in terms of some other description language, say a Relax-NG schema, and it wouldn't affect the outcome so long as the messages themselves did not change. It's a fine point, but the real contract is formed by the messages themselves.
Isomorphism and the local programming model
So, an open question to the audience: are the SOAP messages I described above more "RPC style" or "document-oriented?" Are these services talking to each other, or stateful objects? Are you a good witch or a bad witch? To all these questions and others I answer: mu.
Looking just at the RPC issue for a moment, would your answer to this question change if your local programming model initiated this message exchange using this construct:
Cart GetCartContents( int cartId, string token );
What if you did this instead:
CartResponse GetCartContents( CartRequest request );
Or take your pick of any of these:
XmlReader GetCartContents( XmlReader r );
SoapEnvelope SendRequestResponse( SoapEnvelope e );
byte[] SocketWriteRead( byte[] r, Socket s );
Or even these two paired together:
void SendGetCartRequest( CartRequest request );
CartResponse ReceiveGetCartResponse( IAsyncResult r );
There are six different programming models right there, and I didn't even take into account the ones you get when you start mixing and matching styles. My point is that the local programming model is just that - local. As long as it talks to others using the canonical form of SOAP messages on the wire, it's completely irrelevant to the outside world. The preference for one style over another is one of personal taste and aesthetics; as long as the local programming model remains isomorphic to the agreed-upon default, it makes no difference if you use a different local programming model from everyone else. As long as the wire-level expectations are met, how you treat those messages before you send them and after you receive them has no effect on your ability to interoperate with others.
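To make that concrete, here's a minimal sketch in Python (picked only to keep the example short and runnable; the signatures above are C#-flavored). Three local styles, plain parameters, a request object, and hand-written XML, all end up with the same message. Treat the details as assumptions: the envelopes above elide the SOAP namespace, so the SOAP 1.2 one is used here; the function names are hypothetical; the headers are left out for brevity; and nothing is actually sent over a network.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Assumption: the source elides the SOAP namespace; SOAP 1.2's is used here.
SOAP = "http://www.w3.org/2003/05/soap-envelope"
HT = "http://schemas.hyperthink.net/"


def _envelope(cart_id: str, token: str) -> str:
    """Build the (header-less) request envelope as text."""
    env = ET.Element(f"{{{SOAP}}}Envelope")
    body = ET.SubElement(env, f"{{{SOAP}}}Body")
    ET.SubElement(body, f"{{{HT}}}CartId").text = cart_id
    ET.SubElement(body, f"{{{HT}}}HMACToken").text = token
    return ET.tostring(env, encoding="unicode")


# Style 1: "RPC" -- plain parameters in.
def get_cart_contents(cart_id: int, token: str) -> str:
    return _envelope(str(cart_id), token)


# Style 2: "document" -- a single request object in.
@dataclass
class CartRequest:
    cart_id: int
    token: str


def get_cart_contents_doc(request: CartRequest) -> str:
    return _envelope(str(request.cart_id), request.token)


# Style 3: raw -- the caller writes the angle brackets by hand.
def send_request_response(envelope_xml: str) -> str:
    return envelope_xml.strip()


def infoset(xml_text: str):
    """Reduce a message to nested (tag, text, children), ignoring
    namespace prefixes and insignificant whitespace."""
    def walk(elem):
        return (elem.tag, (elem.text or "").strip(), [walk(c) for c in elem])
    return walk(ET.fromstring(xml_text))


hand_written = f"""
<S:Envelope xmlns:S="{SOAP}" xmlns:ht="{HT}">
  <S:Body>
    <ht:CartId>47</ht:CartId>
    <ht:HMACToken>abc</ht:HMACToken>
  </S:Body>
</S:Envelope>
"""

wire = [
    get_cart_contents(47, "abc"),
    get_cart_contents_doc(CartRequest(47, "abc")),
    send_request_response(hand_written),
]

# Every local style yields the same message structure.
same = all(infoset(m) == infoset(wire[0]) for m in wire)
print(same)
```

The receiver only ever sees what `infoset` sees. Which of the three functions produced the message is invisible from the other side of the wire.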
It's interesting to look at the "Web Services != Distributed Objects" issue in the context of isomorphism. Based upon the idea that 'all local programming models are created equal', it would seem that distributed objects are a perfectly reasonable way of looking at web services. I think that's true to some degree; as long as you recognize that those 'distributed objects' are an illusion of your model and don't exist in any real way beyond your network port, distributed objects are as good a model as any other isomorphism. In a sense, this holds true for all local programming models, because they're all illusory to the same degree that they're all 'real' in your own world.
However, this is not to say that web services are just CORBA-with-angle-brackets. Previous distributed object systems focused on establishing a common local programming model and treated the mechanism of inter-object communication as an implementation detail. The late arrival of a standardized inter-ORB wire protocol to the CORBA scene is, I think, evidence that wire-level interoperability was a secondary concern for them. The current web services efforts flip this idea on its head, essentially saying wire-level interoperability is the only thing that matters. In this respect, web services are drastically different from previous distributed object stacks.
To the future...
The implication of isomorphism is that there's a difference between internal representation and external representation. It's the idea that I can take the same thing and see a SOAP message if I look at it one way and an object if I look at it differently. Keeping two different conceptualizations of the same underlying thing in mind at the same time can take a while to get used to. I think this is the single biggest leap one has to make in order to adapt to thinking about web services.
Helping people to bridge this conceptual gap between internal and external views is a job for the tools. While an ideal world would delegate the job of managing the external view completely to the toolset, the reality is that there will be a need to consciously affect your external presence on the wire. A good stack should handle the bulk of this work for you, but still allow you to diddle with the angle brackets if you need to. In this respect, I really like how Indigo has approached the problem. In Indigo, the internal programming model is defined in terms of CLR types, the external metadata is defined via custom attributes, and the two are allowed to vary independently. Others may disagree, but I find this technique of explicit declaration to be pretty clean in the end. I'll say this -- I have yet to see a better approach.
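As a rough analogy for that separation (this is not Indigo's API, just the same idea sketched in Python), the external names can be declared as metadata on the fields of an ordinary local type. The `wire` helper, the `CartRequest` class and the `to_body` function are all hypothetical names invented for this illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field, fields

HT = "http://schemas.hyperthink.net/"


def wire(name: str):
    """Hypothetical helper: declare the external element name for a field."""
    return field(metadata={"wire_name": name})


@dataclass
class CartRequest:
    # Internal names (left) and external names (right) are declared
    # side by side but can be changed independently of each other.
    cart_id: int = wire("CartId")
    token: str = wire("HMACToken")


def to_body(obj) -> ET.Element:
    """Project a local object onto its external XML form.
    The container is a bare 'Body' element; its namespace is omitted
    in this sketch."""
    body = ET.Element("Body")
    for f in fields(obj):
        child = ET.SubElement(body, f"{{{HT}}}{f.metadata['wire_name']}")
        child.text = str(getattr(obj, f.name))
    return body


body = to_body(CartRequest(cart_id=47, token="abc"))
print(ET.tostring(body, encoding="unicode"))
```

Renaming `cart_id` to something else changes nothing on the wire, and changing `"CartId"` changes nothing in the code that uses the class. That independence is the part I care about, whatever the mechanism used to declare it.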
A final word
I wanted to conclude this piece with one last isomorphism that colors quite a bit of the REST/SOAP debate, at least in terms of REST-in-practice. I have a confession to make; the SOAP examples I use above were carefully crafted to illustrate a point -- they're actually isomorphic representations of requests to Amazon.com's RESTful web service API. I simply took the elements out of the query string, stuck them into the appropriate places in the SOAP envelope and added a few WS-Addressing headers. For reference, the URL Amazon gives as a sample for this API looks like this:
http://webservices.amazon.com/onca/xml?Service=AWSECommerceService&SubscriptionId=[Your Subscription ID Here]&Operation=CartGet&CartId=[A Cart ID]&HMAC=[An HMAC Shopping Cart Token]
And the API reference docs for the CartGet operation are located here.
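The move from query string to envelope is mechanical enough to sketch in a few lines of Python. This is my own illustration under stated assumptions, not anything Amazon publishes: the SOAP 1.2 and WS-Addressing 1.0 namespaces stand in for the ones elided in the envelopes above, the choice of which parameters become headers mirrors my example, parameter names are carried over unchanged (so `HMAC` stays `HMAC`), and the sample values are placeholders.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl, urlencode, urlsplit

# Assumptions: namespaces elided in the envelopes above.
SOAP = "http://www.w3.org/2003/05/soap-envelope"
WSA = "http://www.w3.org/2005/08/addressing"
HT = "http://schemas.hyperthink.net/"
HEADER_PARAMS = ("Service", "SubscriptionId")  # everything else goes in the body


def url_to_envelope(url: str) -> ET.Element:
    """Query string in, SOAP envelope out."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    env = ET.Element(f"{{{SOAP}}}Envelope")
    header = ET.SubElement(env, f"{{{SOAP}}}Header")
    body = ET.SubElement(env, f"{{{SOAP}}}Body")
    endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"
    ET.SubElement(header, f"{{{WSA}}}To").text = endpoint
    ET.SubElement(header, f"{{{WSA}}}Action").text = "urn:" + params.pop("Operation")
    for name, value in params.items():
        parent = header if name in HEADER_PARAMS else body
        ET.SubElement(parent, f"{{{HT}}}{name}").text = value
    return env


def envelope_to_url(env: ET.Element) -> str:
    """SOAP envelope in, query string out (the inverse mapping)."""
    header = env.find(f"{{{SOAP}}}Header")
    body = env.find(f"{{{SOAP}}}Body")
    endpoint = header.findtext(f"{{{WSA}}}To")
    action = header.findtext(f"{{{WSA}}}Action")
    params = {"Operation": action[len("urn:"):]}
    for elem in list(header) + list(body):
        if elem.tag.startswith(f"{{{HT}}}"):
            params[elem.tag.split("}", 1)[1]] = elem.text
    return endpoint + "?" + urlencode(params)


# Placeholder values; a real request would carry real credentials.
url = ("http://webservices.amazon.com/onca/xml?Service=AWSECommerceService"
       "&SubscriptionId=12ABC&Operation=CartGet&CartId=47&HMAC=blah")

round_trip = envelope_to_url(url_to_envelope(url))

# Parameter order may differ, so compare parsed parameters, not raw strings.
print(dict(parse_qsl(urlsplit(round_trip).query)) == dict(parse_qsl(urlsplit(url).query)))
```

Nothing is lost going in either direction, which is all that "isomorphic" is claiming here.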
Amazon.com's e-commerce implementation is one of the most successful and highly touted REST APIs in existence today. The implications of the fact that it's also 100% structurally isomorphic to SOAP + WS-Addressing are left as an exercise to the reader.
--------------
[1]
(A brief philosophical interlude)
For a while now, I've had this gut feeling that the move toward service-orientation represented the postmodern evolution of software development, but could never really put my finger on why this was the case until a few minutes ago. It has to do with the idea that the canonical form is just another equivalent representation of an isomorphic system. The fact that the canonical form is canonical doesn't mean it's somehow "better" or "more accurate" than any of its isomorphic brethren - it's not special just because it's canonical. For example, the fact that Big Ben is not the canonical form of a 'timepiece' does not mean that Big Ben is any less of a timepiece than, say, a wristwatch. Specifically (and this is where philosophy comes in), the canonical form is not some neo-Platonic idealized representation of a 'thing' that all the other 'things' are aspiring to be. Its canonical nature arises by convention, not from some inner philosophical superiority. The recognition that the canonical form is on equal footing ontologically with all of its isomorphisms is a rejection of the modern neo-Platonic ideal; hence, the postmodern substitution of the much fuzzier idea of 'constructed convention' for 'truth'. I may be stretching this but it seems to make sense to me :)
(We now attempt to catch the original train of thought, already in progress)
