I meant to continue the discussion on Software Factories with a response to Darrell’s post, but I didn’t get to it before I left on vacation.

Darrell’s argument is that Software Factories seem like they would work only for narrow vertical market applications, thereby limiting their appeal to the mass market. I both agree and disagree with him; yes, the idea is most compelling when applied to narrow domains but no, I don’t think that’s a bad thing. In fact, I think constraining Software Factories to a specific domain is a critical factor in the success of such an endeavour.

The problem solved by Software Factories (and programming in general) is essentially one of modeling. A programming language is effectively a modeling language, and I think that attempts to differentiate between the two are ultimately unsuccessful. One thing that we’ve learned about modeling languages is that there is a tradeoff triangle between efficiency of representation, generality, and precision. That is, it seems possible to build a modeling language that can efficiently describe a large number of disparate concepts imprecisely. It’s also possible to construct a language that can efficiently model a small set of concepts with a high degree of fidelity. Constructing an uber-language that efficiently achieves both generality and precision seems to be beyond our reach at this time.

A model is a system in one formal domain that describes a roughly equivalent system in another formal domain. I say “roughly equivalent” because depending on the systems in question there may be significant differences between the two. However, these differences are usually unknowable (e.g. in physics, we really have no idea what’s “really” going on at the subatomic level, but we’re satisfied enough with the mathematical description of quantum behavior to equate mathematics with reality) or irrelevant to the task at hand (e.g.
relativity does not fully describe the world because it does not include quantum effects that we know to exist, but it seems to be “good enough” so long as we only think about it when we happen to be going very, very fast).

In software, we build models with the specific intent of disregarding certain facets of the underlying system as being irrelevant – we use models to reduce complexity. A good software model is both efficient and precise. An efficient model expresses equivalent concepts from the underlying system, but does so using a compressed representation. Efficient models provide productivity gains because they allow you to say the same thing, but with fewer words. The precision of a software model is defined by the ability to convert between the model and its equivalent system via a set of automated deterministic transformations without losing semantic content (this is usually a directed process – going from “most efficient” to “least efficient” is easy, but going the other direction is impossible to do with full fidelity).

The software world is already full of models that are both efficient and precise. Consider a C++ program, for example. A program written in C++ is an efficient and precise model for an equivalent system implemented in C. Similarly, a C program is an efficient and precise model of an equivalent system in assembly, which is itself a model of a system in machine code (and, as Ian Griffiths once pointed out to me, it’s possible to continue this line of reasoning all the way down to quarks and beyond). You could also extend this in the other direction and consider a C# program as a model for an equivalent system in C++ (the theoretical compiler that implemented this transformation would be required to emit the entire CLR during its code generation phase – the fact that we can precompile the CLR is simply an optimization and does not alter things from a modeling perspective).
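To make “efficient and precise” a little more concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: `RecordModel`, `generate_source`, and `build` are hypothetical names, not part of Software Factories or any real tool. The model is a compact description of a record type, and a deterministic transformation expands it into longer, ordinary source code for the less efficient target domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordModel:
    """Hypothetical compact model: a record name plus its field names."""
    name: str
    fields: tuple


def generate_source(model: RecordModel) -> str:
    """Deterministically expand the model into plain Python source text."""
    args = ", ".join(("self",) + tuple(model.fields))
    # Fall back to "True" so a record with no fields still yields valid code.
    comparisons = " and ".join(
        f"self.{f} == other.{f}" for f in model.fields
    ) or "True"
    lines = [f"class {model.name}:", f"    def __init__({args}):"]
    if model.fields:
        lines += [f"        self.{f} = {f}" for f in model.fields]
    else:
        lines.append("        pass")
    lines.append("    def __eq__(self, other):")
    lines.append(
        f"        return isinstance(other, {model.name}) and {comparisons}"
    )
    return "\n".join(lines) + "\n"


def build(model: RecordModel) -> type:
    """Compile the generated source and return the resulting class."""
    namespace = {}
    exec(generate_source(model), namespace)
    return namespace[model.name]


# One short line of model stands in for several lines of target code.
customer_model = RecordModel("Customer", ("name", "email"))
source = generate_source(customer_model)
Customer = build(customer_model)
```

The transformation is directed in the sense described above: running `generate_source` twice on the same model yields identical text, but nothing in the generated class records that it came from a `RecordModel`, so going back from code to model would require guessing.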
The big question is, from a modeling perspective, what does the layer of abstraction on top of C# look like? I think it’s this question that the Software Factories guys are trying to solve. I don’t think anybody knows exactly what this type of model will look like, but I think one major feature is efficiency and high precision with respect to an equivalent system implemented in C# (or any other language targeting the CLR). That is, a model in our theoretical modeling language should be compilable down to an equivalent representation in our CLR language of choice. And herein lies the challenge, because the inherent tension between efficiency, generality, and precision comes into play. There are a few approaches that probably lead to failures:
And a couple that might take us down the road to success:
Taking the idea of Software Factories out of the theoretical world and putting it into play in the enterprise application space is a nontrivial problem. Given that we have to choose a subset of all possible problems to model, what should this subset look like? Certainly, dividing applications by vertical market is one way of partitioning this space – you could conceivably build a modeling language for financial services apps, for example. However, I think it would be more valuable to partition according to structure, not necessarily function. By identifying structural similarities within enterprise apps we can begin to distill patterns, and from these patterns we can create a language that models them efficiently without compromising precision unless absolutely necessary. When such a compromise is unavoidable, we must look for ways to factor it out of the model completely, so that developers need to operate only in the system domain or only in the model domain, never in both at the same time. If we do that, we’ll have circumvented a lot of the problems encountered by previous attempts to raise the level of abstraction in software development.
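As a rough illustration of partitioning by structure rather than function, here is a small Python sketch. The `WorkflowModel` class and both example workflows are hypothetical, invented for this post. The structural pattern being modeled is “a thing moves through states in response to events”, and the same tiny modeling language describes apps from two unrelated vertical markets. For brevity the model is interpreted directly rather than compiled down to generated code.

```python
from dataclasses import dataclass


@dataclass
class WorkflowModel:
    """Hypothetical structural model: an initial state plus allowed transitions."""
    initial: str
    transitions: dict  # maps (state, event) -> next state

    def run(self, events):
        """Apply events in order and return the final state."""
        state = self.initial
        for event in events:
            key = (state, event)
            if key not in self.transitions:
                raise ValueError(
                    f"event {event!r} not allowed in state {state!r}"
                )
            state = self.transitions[key]
        return state


# A financial services workflow...
loan_application = WorkflowModel(
    initial="draft",
    transitions={
        ("draft", "submit"): "under_review",
        ("under_review", "approve"): "approved",
        ("under_review", "reject"): "rejected",
    },
)

# ...and a healthcare one, expressed in exactly the same language.
lab_order = WorkflowModel(
    initial="ordered",
    transitions={
        ("ordered", "collect_sample"): "in_lab",
        ("in_lab", "report"): "completed",
    },
)

loan_state = loan_application.run(["submit", "approve"])
lab_state = lab_order.run(["collect_sample", "report"])
```

Nothing in `WorkflowModel` knows about loans or lab orders; only the data does. A developer describing either workflow stays entirely in the model domain and never touches the loop that drives it.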