The Universal Design Pattern

Author: Steve Yegge. Link to original: http://steve-yegge.blogspot.com/2008/10/universal-design-pattern.html (English).
Tags: patterns, Programming Submitted by can3p 20.10.2008. Public material.

Translations of this material:

into Ukrainian: Ніби й не було. 0% translated in draft.
Submitted for translation by 050986 05.02.2010
into Russian: Универсальный шаблон проектирования. 14% translated in draft.
Submitted for translation by can3p 20.10.2008

Text

The Universal Design Pattern

This idea that there is generality in the specific is of far-reaching importance.

— Douglas Hofstadter, Gödel, Escher, Bach

Note: Today's entry is a technical article: it isn't funny. At least not intentionally.

Contents

* Introduction

* Three Great Schools of Software Modeling

o Class Modeling

o Relational Modeling

o XML Modeling

o Other schools

o Finding the sweet spot

o Property Modeling

* Brains and Thoughts

* Who uses the Properties Pattern?

o Eclipse

o JavaScript

+ Pushing it even further

+ The pattern takes shape...

o Wyvern

o Lisp

o XML revisited

o Bigtable

* Properties Pattern high-level overview

* Representations

o Keys

+ Quoting

+ Missing keys

o Data structures

* Inheritance

o The deletion problem

o Read/write asymmetry

o Read-only plists

* Performance

o Interning strings

o Perfect hashing

o Copy-on-read caching

o Refactoring to fields

o Refrigerator

o REDACTED

o Rolling your own

* Transient properties

o The deletion problem (remix)

* Persistence

o Query strategies

o Backfills

* Type systems

* Toolkits

* Problems

* Further reading

* Final thoughts

Introduction

Today I thought I'd talk about a neat design pattern that doesn't seem to get much love: the Properties Pattern. In its fullest form it's also sometimes called the Prototype Pattern.

People use this pattern all over the place, and I'll give you a nice set of real-life examples in a little bit. It's a design pattern that's useful in every programming language, and as we'll see shortly, it's also pretty darn useful as a general-purpose persistence strategy.

But even though this pattern is near-universal, people don't talk about it very often. I think this is because while it's remarkably flexible and adaptable, the Properties Pattern has a reputation for not being "real" design or "real" modeling. In fact it's often viewed as a something of a shameful cheat, particularly by overly-zealous proponents of object-oriented design (in language domains) or relational design (in database domains.) These well-meaning folks tend to write off the Properties Pattern as "just name/value pairs" – a quick hack that will just get you into trouble down the road.

I hope to offer a different and richer perspective here. With luck, this article might even help begin the process making the Properties Pattern somewhat fashionable again. Time will tell.

Three Great Schools of Software Modeling

Before I tell you anything else about the Properties Pattern, let's review some of the most popular techniques we programmers have for modeling problems.

I should point out that none of these techniques is tied to "static typing" or "dynamic typing" per se. Each of these modeling techniques can be used with or without static checking. The modeling problem is orthogonal to static typing, so regardless of your feelings about static checking, you should recognize the intrinsic value in each of these techniques.

Class Modeling

You know all about this one. Class-based OO design is the 800-pound gorilla of domain modeling these days. Its appeal is that it's a natural match for the way we already model things in everyday life. It can take a little practice at first, but for most people class modeling quickly becomes second nature.

Although the industry loves OO design, it's not especially well liked as an academic topic. This is because OO design has no real mathematical foundation to support it — at least, not until someone comes along and creates a formal model for side effects. The concepts of OOP stem not from mathematics but from fuzzy intuition.

This in some sense explains its popularity, and it also explains why OOP has so many subtly different flavors in practice: whether (and how) to support multiple inheritance, static members, method overloading vs. rich signatures, and so on. Industry folks can never quite agree on what OOP is, but we love it all the same.

Relational Modeling

Relational database modeling is a bit harder and takes more practice, because its strength stems from its mathematical foundation. Relational modeling can be intuitive, depending on the problem domain, but most people would agree that it is not necessarily so: it takes some real skill to learn how to model arbitrary problems as relational schemas.

Object modeling and relational modeling produce very different designs, each with its strengths and weaknesses, and one of the trickiest problems we face in our industry has always been the object-relational mapping (ORM) problem. It's a big deal. Some people may have let you to believe that it's simple, or that it's automatically handled by frameworks such as Rails or Hibernate. Those who know better know just how hard ORM is in real-world production schemas and systems.

XML Modeling

XML provides yet another technique for modeling problems. Usually XML is used to model data, but it can also be used to model code. For instance, XML-based frameworks such as Apache Ant and XSLT offer computational facilities: loops or recursion, conditional expressions and setting variables.

In many domains, programmers will decide on an XML representation before they've thought much about the class model, because for those domains XML actually offers the most convenient way of thinking about the problem.

The kinds of data that work well with XML modeling tend to be poorly suited for relational modeling, and vice-versa, with the practical result that XML/relational mapping is almost as infamously thorny as O/R mapping.

And as for XML/OO mapping, most of us tend to treat it as a more or less solved problem. However, in practice there are several competing ways of doing XML/OO mapping. The W3C DOM and SAX enjoy the broadest use, but they are both sufficiently cumbersome that alternatives such as JDom and REXML (among others) have gained significant followings.

I mention this not to start a fight, but only to illustrate that XML is a third modeling technique in its own right. It has both natural resonances and surfaces of friction with both relational design and OO design, as one might expect.

Other schools

I'm not claiming that these three modeling schools are the only schools out there – far from it! Two other obvious candidates are Functional modeling (in the sense of Functional Programming, with roots in the lambda calculus) and Prolog-style logical modeling. Both are mature problem-modeling strategies, each with its pros and cons, and each having varying degrees of overlap with other strategies. And there are still other schools, perhaps dozens of them.

The important takeaway is that none of these modeling schools is "better" than its peers. Each one can model essentially any problem.

There are tradeoffs involved with each school, by definition — otherwise all but one would have disappeared by now.

Finding the sweet spot

Sometimes it makes sense to use multiple modeling techniques in the same problem space. You might do a mixed XML/relational data design, or a class-based OO design with Functional aspects, or embed a rules engine in a larger system.

Choosing the right technique comes down to convenience. For any given real-world problem, one or two modeling schools are likely to be the most convenient approaches. Exactly which one or two depends entirely on the particulars of the problem.

By convenient, I mean something different from what you might be thinking. To me, a convenient design is one that is convenient for the users of the design. And it should also be convenient to express, in the sense of minimalism: all else being equal, a smaller design beats a big one. One way of looking at this is that the design should be convenient for itself!

Unfortunately, most programmers (myself included) tend to use exactly the wrong definition of convenience: they choose a modeling technique that is convenient for themselves. If they only have experience in one or two schools, guess which techniques they'll jump to for every problem they face?

This problem rears its head throughout computing. There's always a "best" tool for any job, but if programmers don't know how to use it, they'll choose an inferior tool because they think their schedule doesn't permit a learning curve. In the long run they're hurting their schedules, but it's hard to see that when you're down in the trenches.

Modeling schools are just like programming languages, web frameworks, editing environments and many other tools: you won't know how to pick the right one unless you have a reasonably good understanding of all of them, and preferably some practice with each.

The important thing to remember is that all modeling schools are "first class" in the sense of being able to represent any problem, and no modeling school is ideal for every situation. Just because you are most comfortable solving a problem using a particular strategy does not mean that it is the ideal solution to the problem. The best programmers aim to master all available techniques, giving them a better chance at making the right choices.

Property Modeling

With this context in mind, I claim that the Properties Pattern is yet another kind of domain modeling, with its own unique strengths and tradeoffs, distinct from all the other modeling schools I've mentioned. It is their first-class peer, inasmuch as it is capable of modeling the same broad set of problem domains.

After we've finished talking about the Properties Pattern in exhausting detail, I think I'll have convinced you of the pattern's status as a major school of modeling. Hopefully you'll also start to have a feel for the kinds of problems it's well-suited to solve – sometimes more so than other schools, even your current favorite.

But before we dive into technical details, let's take a brief peek at a fascinating comparison of Property-based modeling to class-based OO design. It's a non-technical argument that I think has some real force behind it.

Brains and Thoughts

Douglas Hofstadter has spent a lifetime thinking about the way we think. He's written about it perhaps more than anyone else in the past century. Even if someone out there has beaten him in sheer quantity of words on the subject, nobody has come close to rivaling his style or his impact on programmers everywhere.

All of his books are wonderfully imaginative and are loads of fun to read, but if you're a programmer and you haven't yet read Gödel, Escher, Bach: An Eternal Golden Braid (usually known as "GEB"), then I envy you: you're in for a real treat. Get yourself a copy and settle in for one of the most interesting, maddening, awe-inspiring and just plain fun books ever written. The Pulitzer Prize it won doesn't nearly do it justice. It's one of the greatest and most unique works of imagination of all time.

Hofstadter made a compelling argument in GEB (thirty years ago!) that property-based modeling is fundamental to the way our brains work. In Chapter XI ("Brains and Thoughts"), there are three little sections titled Classes and Instances, The Prototype Principle, and The Splitting-off of Instances from Classes that together form the conceptual underpinnings of the Properties Pattern. In these little discussions Hofstadter explains how the Prototype Principle relates to classic class-based modeling.

I wish I could reproduce his discussion in full here — it's only three pages — but I'll have to just encourage you to go read it instead. His thesis is this:

The most specific event can serve as a general example of a class of events.

Hofstadter offers several supporting examples for this thesis, but I'll paraphrase one of my all-time favorites. It goes more or less as follows.

Imagine you're listening to announcers commenting on an NFL (American football) game. They're talking about a new rookie player that you don't know anything about. At this point, the rookie – let's say his name is L.T. – is just an instance of the class "football player" with no differentiation.

The announcers mention that L.T. is a running back: a bit like Emmitt Smith in that he has great speed and balance, and he's great at finding holes in the defense.

At this point, L.T. is basically an "instance" of (or a clone of) Emmitt Smith: he just inherited all of Emmmitt's properties, at least the ones that you're familiar with.

Then the announcers add that L.T. is also great at catching the ball, so he's sometimes used as a wide receiver. Oh, and he wears a visor. And he runs like Walter Payton. And so on.

As the announcers add distinguishing attributes, L.T. the Rookie gradually takes shape as a particular entity that relies less and less on the parent class of "football player". He's become a very, very specific football player.

Pages: ← previous Ctrl next
1 2 3 4