Monday, 21 May 2007

Of Size, and Governance


If you set out from Langholm, in Eskdale, and drive in a car to Drummore in the Rhinns of Galloway, you will drive 119 miles, and - according to Google's mapping system - it will take you four hours and eight minutes. If you didn't fancy Drummore, you could get to Stafford, in Staffordshire, in one minute less; or Dunkeld, in Perthshire, in five minutes less.
 
From Drummore, driving by road (and taking ferries where appropriate), you could get to Dunoon in Argyll or Dunblane in Perthshire quicker than you could get to Langholm. Even with the ferry, getting to Dundalk in the Republic of Ireland would only take 21 minutes longer.
 
So what's amazing or shocking about that?
 
Well, to get from Langholm to Stafford you pass through Dumfries and Galloway, Cumbria, Lancashire, Manchester, Cheshire and finally Staffordshire. Six separate local government units. To get to Dunkeld you pass through Dumfries and Galloway, South Lanarkshire, Glasgow, East Dunbartonshire, Stirling, and Perth and Kinross; again, six local government areas.
 
To get from Drummore to Dunoon, you pass through five separate local government areas. Drummore to Dunblane is eight...
 
But Drummore to Langholm is only one: Dumfries and Galloway all the way. It's simply a perversion of language that a councillor from Langholm overseeing decisions which affect Drummore (or vice versa) is in any sense 'local' government. Dumfries and Galloway, if it were a nation, would be by no means the world's smallest. At 6500 square kilometres it's larger than Palestine; larger than Brunei; larger than Trinidad and Tobago; more than twice as large as Samoa or Luxembourg; more than six times as large as Hong Kong; more than ten times as large as Singapore or Bahrain; more than 40 times as large as Liechtenstein; more than four thousand times the size of Monaco. In fact a quarter of all the nations and self-governing territories in the world have a land area smaller than Dumfries and Galloway.
 
Ah, you might say, but we have a sparse population. That's true, of course. Only 43 nations and self-governing territories are less populous than Dumfries and Galloway.
 
But that's talking about nations, about independence. We don't have independence and we don't aspire to it. Let's look at how other small northern European countries organise local democracy. Take Iceland, for example. Iceland has a population twice the size of Dumfries and Galloway. It is divided into 'municipalities' which have responsibility for kindergartens, elementary schools, waste management, social services, public housing, public transportation, services to senior citizens and handicapped people and so on. Not so very different, in fact, from the responsibilities of our local government. So, with a population twice that of Dumfries and Galloway, how many of these municipalities does Iceland have?
 
Two?
 
No.
 
The answer is seventy nine.
 
Iceland is an extreme case, of course; a nation of proud and independent people with an ancient history of democratic organisation, and strong civil society. But Denmark, with a population roughly equal to Scotland's, has three times as many local authorities. Norway, with three quarters of our population, has 400 more local authorities than Scotland has.
 
Put it differently: Dumfries and Galloway has three times the population of the average Danish local authority; four times that of the average Swedish or Dutch; twelve times the average for Norway; thirty six times the average for Iceland. I said Iceland was an extreme case, didn't I? Get this. Dumfries and Galloway has eighty four times the population of the average - the average - French commune.



Monday, 20 February 2006

Post-scarcity Software

From http://aturingmachine.com/
For years we've said that our computers were Turing equivalent, equivalent to Turing's machine U. That they could compute any function which could be computed. They aren't, of course, and they can't, for one very important reason. U had infinite store, and our machines don't. We have always been store-poor. We've been mill-poor, too: our processors have been slow, running at hundreds, then a few thousands, of cycles per second. We haven't been able to afford the cycles to do any sophisticated munging of our data. What we stored - in the most store intensive format we had - was what we got, and what we delivered to our users. It was a compromise, but a compromise forced on us by the inadequacy of our machines.

The thing is, we've been programming for sixty years now. When I was learning my trade, I worked with a few people who'd worked on Baby - the Manchester Mark One - and even with two people who remembered Turing personally. They were old then, approaching retirement; great software people with great skills to pass on, the last of the first generation programmers. I'm a second generation programmer, and I'm fifty. Most people in software would reckon me too old now to cut code. The people cutting code in the front line now know the name Turing, of course, because they learned about U in their first year classes; but Turing as a person - as someone with a personality, quirks, foibles - is no more real to them than Christopher Columbus or Noah, and, indeed, much less real than Aragorn of the Dunedain.

In the passing generations we've forgotten things. We've forgotten the compromises we've made; we've forgotten the reasons we've made them. We're no longer poor. The machine on which I'm typing this - my personal machine, on my desk, used by no-one but me - has the processor power of slightly over six thousand DEC VAXes; it has one hundred and sixty two thousand times as much core store as the ICL 1900 mainframe on which I learned Pascal. Yet both the VAX and the 1900 were powerful machines, capable of supporting dozens of users at the same time. Compared to each individual user of the VAX, of the 1900, I am now incalculably rich. Vastly. Incomprehensibly.

And it's not just me. With the exception of those poor souls writing embedded code for micro-controllers, every programmer now working has processor and store available to him which the designers of the languages and operating systems we still use could not even have dreamed of. UNIX was designed for sixteen-bit minicomputers, when 16,384 bytes was a lot of memory and very expensive. VMS - what we now call 'Windows XP' - is only a few years younger.

The compromises of poverty are built into these operating systems, into our programming languages, into our brains as programmers; so deeply ingrained that we've forgotten that they are compromises, we've forgotten why we chose them. Like misers counting grains on the granary floor while outside the new crop is falling from the stalks for want of harvesting, we sit in the middle of great riches and behave as though we were destitute.

One of the things which has made this worse in recent years is the rise of Java, and, following slavishly after it, C#. Java is a language which was designed to write programs for precisely those embedded micro-controllers which are still both store and mill poor. It is a language in which the mind-set of poverty is consciously ingrained. And yet we have adopted it as a general purpose programming language, something for which it is not at all suitable, and in doing so have taught another generation of programmers the mind-set of poverty. Java was at least designed; decisions were made for reasons, and, from the point of view of embedded micro-controllers, those reasons were good. C# is just a fit of pique as software. Not able to 'embrace and extend' Java, Microsoft aped it as closely as was possible without breaching Sun's copyright. Every mistake, every compromise to poverty ingrained in Java is there in C# for all the world to see.

It's time to stop this. Of course we're not as wealthy as Turing. Of course our machines still do not have infinite store. But we now have so much store - and so many processor cycles - that we should stop treating them as finite. We should program as if we were programming for U.

Store, Name and Value

So let's start with what we store, what we compute on: values. For any given column within a table, for every given instance variable in a class, every record, every object is constrained to have a value with a certain format.

This is, of course, historical. Historically, when storage was expensive we stored textual values in fields of fixed width to economise on storage; we still do so largely because that's what we've always done rather than because there's any longer any rational reason to. Historically, when storage and computation were expensive, we stored numbers in twos-complement binary strings in a fixed number of bytes. That's efficient, both of store and of mill.

But it is no longer necessary, nor is it desirable, and good computer languages such as LISP transparently ignore the difference between the storage formats of different numbers. For example:

(defun factorial (n)
  (cond
    ((<= n 1) 1)
    (t (* n (factorial (- n 1))))))

;; a quick way to generate very big numbers...

We can add the value of factorial 100 to an integer, say 2, in just
the same way that we can add any other two numbers:

(+ (factorial 100) 2)
93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000002

We can multiply the value of factorial 100 by a real number, say pi, in just the same way as we can multiply any other two numbers:

(* (factorial 100) pi)
2.931929528260332*10^158

The important point to note here is that there's no explicit call to a bignum library or any other special coding. LISP's arithmetic operators don't care what the underlying storage format of a number is, or rather, are able transparently to handle any of the number storage formats - including bignums - known to the system. There's nothing new about this. LISP has been doing this since the late 1960s. Which is as it should be, and, indeed, as it should be in storage as well as in computation.
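For contrast, a language without a transparent numeric tower makes the seams visible. Here is a minimal Java sketch (the class and method names are my own): the BigInteger version computes 100! happily, while the version using a primitive int wraps around silently.

```java
import java.math.BigInteger;

public class Factorial {
    // Arbitrary-precision factorial: BigInteger never overflows, but
    // unlike LISP the programmer must choose the representation up front.
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // (+ (factorial 100) 2), as in the LISP example above
        System.out.println(factorial(100).add(BigInteger.valueOf(2)));

        // The primitive version keeps only the low 32 bits of the result.
        // 100! is divisible by 2^97, so those low bits are all zero:
        // this prints 0, with no warning that anything went wrong.
        int broken = 1;
        for (int i = 2; i <= 100; i++) {
            broken *= i;
        }
        System.out.println(broken); // 0
    }
}
```

The point is not that Java cannot do bignum arithmetic; it is that the programmer, not the runtime, is made responsible for knowing when to use it.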

A variable or a database field (I'll treat the two as interchangeable, because, as you will see, they are) may reasonably have a validation rule which says that a value which represents the longitude of a point on the Earth in degrees should not contain a value which is greater than 360. That validation rule is domain knowledge, which is a good thing; it allows the system to have some vestige of common sense. The system can then throw an exception when it is asked to store 764 as the longitude of a point, and this is a good thing.

Why then should a database not throw an exception when, for example, a number is too big to fit in the internal representation of a field? To answer, here's a story I heard recently, which seems to be apocryphal, but which neatly illustrates the issue just the same.
The US Internal Revenue Service have to use a non-Microsoft computer to process Bill Gates' income tax, because Microsoft computers have too small an integer representation to represent his annual income.
Twos complement binary integers stored in 32 bits can represent values between minus 2,147,483,648 and plus 2,147,483,647, slightly over two US billion. So it's easily possible that Bill Gates' income exceeds this. Until recently, Microsoft operating systems ran only on computers with a register size of 32 bits. Worryingly, the default integer size of my favourite database, Postgres, is also 32 bits.

This is just wrong. Nothing in the domain of income places any fixed upper bound on the income a person may receive. Indeed, with inflation, the upper limit on incomes, as a quantity, is likely to continue to rise. Should we patch the present problem by upping the size of the integer to eight bytes?
In Hungary after the end of World War II inflation ran at 4.19 × 10^16 percent per month - prices doubled every 15 hours. Supposing Gates' income in US dollars currently exceeds the size of a thirty two bit integer, it would take at most 465 hours - less than twenty days - to exceed US$9,223,372,036,854,775,808. What's scary is how quickly you'd follow him. If your present annual salary is just thirty three thousand of your local currency units, then given that rate of inflation, you would overflow a sixty-four bit integer in just 720 hours, or less than a month.
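The arithmetic is easy to check: doubling every 15 hours means 720 hours is 48 doublings, and 33,000 shifted left by 48 bits overflows a signed 64-bit integer. A quick sketch:

```java
import java.math.BigInteger;

public class Overflow {
    public static void main(String[] args) {
        // Prices double every 15 hours, so 720 hours = 48 doublings.
        int doublings = 720 / 15;

        BigInteger salary = BigInteger.valueOf(33_000);
        BigInteger after = salary.shiftLeft(doublings); // 33,000 * 2^48

        // Largest signed 64-bit integer: 9,223,372,036,854,775,807
        BigInteger longMax = BigInteger.valueOf(Long.MAX_VALUE);

        System.out.println(after);                        // 9288674231451648000
        System.out.println(after.compareTo(longMax) > 0); // true: overflowed
    }
}
```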

Lots of things in perfectly ordinary domains are essentially unbounded. They aren't shorts. They aren't longs. They aren't doubles. They're numbers. And a system asked to store a number should store a number. Failure to store a number because its size violates some constraint derived from domain knowledge is desirable behaviour; failure to store a number because its size violates the internal storage representation of the system is just bad, outdated, obsolete system design. Yes, it's efficient of compute power on thirty-two bit processors to store values in thirty-two bit representations. Equally, it's efficient of disk space for a database to know in advance just how much disk it has to reserve for each record in a table, so that to skip to the Nth record it merely has to skip forward (N * record-size) bytes.
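That economy is real: with fixed-width records, random access is a single seek. A sketch, assuming a file of 64-byte records (the record size and file layout here are invented for illustration):

```java
import java.io.File;
import java.io.RandomAccessFile;

public class Records {
    static final int RECORD_SIZE = 64;

    // Skip straight to the Nth record: the payoff of fixed-width storage.
    static byte[] readRecord(RandomAccessFile file, long n) throws Exception {
        byte[] record = new byte[RECORD_SIZE];
        file.seek(n * RECORD_SIZE);
        file.readFully(record);
        return record;
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("records", ".dat");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            for (int i = 0; i < 4; i++) {   // write four 64-byte records
                byte[] rec = new byte[RECORD_SIZE];
                rec[0] = (byte) i;          // tag each record with its index
                raf.write(rec);
            }
            System.out.println(readRecord(raf, 2)[0]); // 2
        }
    }
}
```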

But we're no longer short of either processor cycles or disk space. For a database to reject a value because it cannot be stored in a particular internal representation is industrial archaeology. It is a primitive and antiquated workaround from days of hardware scarcity. In these days of post-scarcity computing, it's something we should long have forgotten, long have cast aside.

This isn't to say that integers should never be stored in thirty-two bit twos complement binary strings. Of course they should, when it's convenient to do so. It's a very efficient storage representation. Of course, when a number overflows a thirty two bit cell, the runtime system has got to throw an exception, has got to deal with it, and consequently the programmer who writes the runtime system has still got to know about and understand the murky aspects of internal storage formats.

Perhaps the language designer, and the programmer who writes the language compiler should, too, but personally I don't think so. I think that at the layer in the system - the level of abstraction - at which the compiler writer works, the operator 'plus' should just be a primitive. It takes two numbers, and returns a number. That's all. The details of whether that's a float, a double, a rational or a bignum should not be in the least relevant at the level of language. There is a difference which is important between a real number and an integer. The old statistical joke about the average family having 2.4 children is funny precisely because it violates our domain knowledge. No family has 2.4 children. Some things, including children, are discrete, however indiscreet you may think them. They come in integral quantities. But they don't come in short quantities or long quantities. Shorts and longs, floats and doubles are artefacts of scarcity of store. They're obsolete.

From the point of view of the runtime designer, the difference between a quantity that can be stored in two bytes, or four, or eight must matter. From the point of view of the application designer, the language designer, even the operating system designer, they should disappear. An integer should be an integer, whether it represents the number of toes on your left foot (about 5), the number of stars in the galaxy (about 1 × 10^11) or the number of atoms in the universe (about 1 × 10^79). Similarly, a real number should be just a real number.

This isn't to say we can't do data validation. It isn't to say we can't throw a soft exception - or even a hard one - when a value stored in a variable or field violates some expectation, which may be an expectation about size. But that should be an expectation based on domain knowledge, and domain knowledge alone; it should not be an expectation based on implementation knowledge.

Having ranted now for some time about numbers, do you think I'm finished? I'm not. We store character values in databases in fields of fixed size. How big a field do we allocate for someone's name? Twenty four characters? Thirty-two? We've all done it. And then we've all found a person who violates our previous expectation of the size of a name, and next time we've made the field a little bigger. But by the time we've made a field big enough to store Charles Philip Arthur George Windsor or Sirimavo Ratwatte Dias Bandaranaike we've negated the point of fixed width fields in the first place, which was economy. There is no natural upper bound to the length of a personal name. There is no natural upper bound to the length of a street address. Almost all character data is a representation at some level of things people say, and the human mind doesn't work like that.

Of course, over the past fifty years, we've tried to make the human mind work like that. We've given addresses standardised 'zip codes' and 'postcodes', we've given people standardised 'social security numbers' and 'identity codes'. We've tried to fit natural things into fixed width fields; we've tried to back-port the inadequacies of our technology onto the world. It's stupid, and it's time we stopped.

So how long is a piece of string? How long is a string of characters? It's unbounded. Most names are short, because short names are convenient and memorable. But that does not mean that for any given number of characters, it's impossible that there should be something with a normal name of that length. And names are not the only things we store in character strings. In character strings we store things people say, and people talk a lot.

At this point the C programmers, the Java programmers are looking smug. Our strings, they say, are unbounded. Sorry lads. A C string is a null terminated sequence of bytes. It can in principle be any length. Except that it lives in a malloced lump of heap (how quaint, manually allocating store) and the maximum size of a lump of heap you can malloc is size_t, which may be 2^31, 2^32, 2^63 or 2^64 depending on the system. Minus one, of course, for the null byte. In Java, similarly, the size of a String is an int, and an int, in Java, means 2^31.

Interestingly, Paul Graham, in his essay 'The Hundred-Year Language', suggests doing away with strings altogether, and representing them as lists of characters. This is powerful because strings become S-expressions and can be handled as S-expressions; but strings are inherently one-dimensional and S-expressions are not. So unless you have some definite collating sequence for a branching 'string' its meaning may be ambiguous. Nevertheless, in principle and depending on the internal representation of a CONS cell, a list of characters can be of indefinite extent, and, while it isn't efficient of storage, it is efficient of allocation and deallocation; to store a list of N characters does not require us to have a contiguous lump of N bytes available on the heap; nor does it require us to shuffle the heap to make a contiguous lump of that size available.

So; to reprise, briefly.

A value is just a value. The internal representation of a value is uninteresting, except to the designer and author of the runtime system - the virtual machine. For programmers at every other level the internal representation of every value is DKDC: don't know, don't care. This is just as true of things which are fundamentally things people say, things which are lists and things which are pools, as it is of numbers. The representation that the user - including the programmer - deals with is the representation which is convenient and comfortable. It does not necessarily have anything to do with the storage representation; the storage representation is something the runtime system deals with, and that the runtime system effectively hides. Operators exposed by the virtual machine are operators on values. It is a fundamental error, a failure of the runtime designer's most basic skill and craft, for a program ever to fail because a value could not be represented in internal representation - unless the store available to the system is utterly exhausted.

Excalibur and the Pool

A variable is a handle in a namespace; it gives a name to a value, so that we can recall it. Storing a value in a variable never causes an exception to be thrown because the value cannot be stored. But it may, reasonably, justifiably, throw an exception because the value violates domain expectations. Furthermore, this exception can be either soft or hard. We might throw a soft exception if someone stored, in a variable representing the age of a person in years, the value 122. We don't expect people to reach one hundred and twenty two years of age. It's reasonable to flag back to whatever tried to set this value that it is out of the expected range. But we should store it, because it's not impossible. If, however, someone tries to store 372 in a variable representing longitude in degrees, we should throw a hard exception and not store it, because that violates not merely a domain expectation but a domain rule.

So a variable is more than just a name. It is a slot: a name with some optional knowledge about what may reasonably be associated with itself. It has some sort of setter method, and possibly a getter method as well.
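A minimal sketch of such a slot, with my own invented names for the two cases: an expectation is checked but the value is stored anyway; a rule refuses the value outright.

```java
import java.util.function.Predicate;

public class Slot {
    static class HardViolation extends RuntimeException {
        HardViolation(String message) { super(message); }
    }

    private final String name;
    private final Predicate<Double> expectation; // soft: warn, but store
    private final Predicate<Double> rule;        // hard: refuse to store
    private Double value;

    Slot(String name, Predicate<Double> expectation, Predicate<Double> rule) {
        this.name = name;
        this.expectation = expectation;
        this.rule = rule;
    }

    void set(double v) {
        if (!rule.test(v)) {
            throw new HardViolation(name + ": " + v + " violates a domain rule");
        }
        if (!expectation.test(v)) {
            System.err.println(name + ": " + v + " is outside the expected range");
        }
        value = v; // stored despite any soft warning
    }

    Double get() { return value; }

    public static void main(String[] args) {
        Slot age = new Slot("age", v -> v >= 0 && v <= 120, v -> v >= 0);
        age.set(122);                  // soft: warned, but stored
        System.out.println(age.get()); // 122.0

        Slot longitude = new Slot("longitude", v -> true,
                                  v -> v >= 0 && v <= 360);
        try {
            longitude.set(372);        // hard: refused, nothing stored
        } catch (HardViolation e) {
            System.out.println(e.getMessage());
        }
        System.out.println(longitude.get()); // null
    }
}
```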

I've talked about variables, about names and values. Now I'll talk about the most powerful abstraction I use - possibly the most powerful abstraction in software - the namespace. A namespace is a sort of pool into which we can throw arbitrary things, tagging each with a distinct name. When we return to the pool and invoke a name, the thing in the pool to which we gave that name appears.

Regularities: tables, classes, patterns

Database tables, considered as sets of namespaces, have a special property: they are regular. Every namespace which is a record in the same table has the same names. A class in a conventional object oriented language is similar: each object in the class has the same set of named instance variables. They match a pattern: they are in fact constrained to match it, simply by being created in that table or class.
Records in a table, and instance variables in a class, also have another property in common. For any given name of a field or instance variable, the value which each record or object will store under that name is of the same type. If 'Age' is an integer in the definition of the table or class, the Age of every member will be an integer. This property is different from regularity, and, lacking a better word for it, I'll call it homogeneity. A set of spaces which are regular (i.e. share the same names) need not be homogeneous (i.e. share the same value types for those names), but a set which is homogeneous must be regular.

But records in a table, in a view, in a result set are normally in themselves values whose names are the values of the key field. And the tables and views, too, are values in a namespace whose names are the table names, and so on up. Namespaces, like Russian dolls, can be nested indefinitely. By applying names to the nested spaces at each level, we can form a path of names to every space in the meta-space and to each value in each space, provided that the meta-space forms an acyclic directed graph (this is, after all, the basis of the XPath language). Indeed, we can form paths even if the graph has cycles, provided every cycle in the graph has some link back to the root.
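Nested namespaces, and paths through them, can be sketched with nothing more than maps of maps (the data here is invented):

```java
import java.util.Map;

public class DataSpace {
    // Walk a path of names down through nested namespaces.
    @SuppressWarnings("unchecked")
    static Object resolve(Map<String, Object> space, String... path) {
        Object current = space;
        for (String name : path) {
            current = ((Map<String, Object>) current).get(name);
        }
        return current;
    }

    public static void main(String[] args) {
        // A miniature dataspace: a table of records of fields,
        // each level a namespace keyed by name.
        Map<String, Object> root = Map.of(
            "people", Map.of(
                "norman", Map.of("age", 57),
                "alice", Map.of("age", 31)));

        System.out.println(resolve(root, "people", "norman", "age")); // 57
    }
}
```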

Social mobility

It's pretty useful to gather together all objects in the data space which match the same pattern; it's pretty useful for them all to have distinct names. So the general concept of a regularity which is itself a namespace is a useful one, even if the names have to be gensymed.

To be in a class (or table), must a space be created in that class (or table)? I don't see why. One of my earlier projects was an inference engine called Wildwood, in which objects inferred their own class by exploring the taxonomy of classes until they found the one in which they felt most comfortable. I think this is a good model. You ought to be able to give your dataspace a good shake and then pull out of it as a collection all the objects which match any given pattern, and this collection ought to be a namespace. It ought to be so even if the pattern did not previously exist in the data space as the definition of a table or class or regularity or whatever you care to call it.

A consequence of this concept is that objects which acquire new name-value pairs may move out of the regularity in which they were created either to exist as stateless persons in the no-man's land of the dataspace, or into a new regularity; or may form the seed around which a new regularity can grow. An object which acquires a value for one of its names which violates the validation constraints of one homogeneity may similarly move out into no-man's land or into another. In some domains, in some regularities, it may be a hard error to do this (i.e. the system will prevent it). In some domains, in some regularities, it may be a soft error (i.e. the system allows it under protest). In some domains, in some regularities, it may be normal; social mobility of objects will be allowed.

Permeability

There's another feature of namespaces which gets hard wired into lots of software structures without very often being generalised, and that is permeability, semi-translucency. In my toolkit Jacquard, for example, values are first searched for in the namespace of http parameters; if not found there, in the namespace of cookies; next, in the namespace of session variables, then in local configuration parameters, finally in global configuration parameters. There is in effect a layering of semi-translucent namespaces like the veils of a dancer.

It's not a pattern that's novel or unique to Jacquard, of course. But in Jacquard it's hard wired, and in all the other contexts in which I've seen this pattern it's hard wired. I'd like to be able to manipulate the veils; to add, remove, or alter the layering. I'd like this to be a normal thing to be able to do.
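Generalised, the veils are just an ordered list of namespaces searched in turn; making the layering itself a value lets you add, remove or reorder veils at will. A sketch (the layer names follow Jacquard's search order, but the code and data are mine):

```java
import java.util.List;
import java.util.Map;

public class Veils {
    // Search the veils in order; the first one holding the name wins.
    static Object lookup(List<Map<String, Object>> veils, String name) {
        for (Map<String, Object> veil : veils) {
            if (veil.containsKey(name)) {
                return veil.get(name);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Object> params  = Map.of("user", "norman");
        Map<String, Object> cookies = Map.of("theme", "dark");
        Map<String, Object> config  = Map.of("user", "default", "lang", "en");

        // The layering is just a list, so it can be rearranged freely.
        List<Map<String, Object>> veils = List.of(params, cookies, config);

        System.out.println(lookup(veils, "user"));  // norman (params shadows config)
        System.out.println(lookup(veils, "theme")); // dark
        System.out.println(lookup(veils, "lang"));  // en
    }
}
```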

The Name of the Rose: normativeness and hegemony

I have a friend called Big Nasty. Not everyone, of course, calls him Big Nasty. His sons call him 'Dad'. His wife calls him 'Norman'. People who don't know him very well call him 'Mr Maxwell'. He does not have one true name.

The concept of a true name is a seductive one. In many of the traditions of magic - and I have always seen software as a technological descendant or even a technological implementation of magic - a being invoked by its true name must obey. In most modern programming languages, things tend to have true names. There is a protocol for naming Java packages which is intended to guarantee that every package written anywhere in the world has a globally unique true name. Globally unique true names do then have utility. It's often important when invoking something to be certain you know exactly what it is you're invoking.

But it does not seem to me that this hegemonistic view of the dataspace is required by my messy conception. Certainly it cannot be true that an object has only one true name, since it may be the value of several names within several spaces (and of course this is true of Java; a class may well have One True Name, but I can still create an instance variable within an object whose name is anythingILike, and whose value is that class).

The dataspace I conceive is a soup. The relationships between regularities are not fixed, and so paths will inevitably shift. And in the dataspace, one sword can be in many pools - or even many times in the same pool, under different names - at the same time. We can shake the dataspace in different ways to see different views on the data. There should be no One True hegemonistic view.

This does raise the question, 'what is a name?' In many modern relational databases, all primary keys are abstract and are numbers, even if natural primary keys exist in the data - simply because it is so easy to create a table with an auto-incrementer on the key field. Easy, quick, convenient, lazy, not always a good thing. In terms of implementation details, namespaces are implemented on top of hash tables, and any data object can be hashed. So can anything be a name?

In principle yes. However, my preference would be to purely arbitrarily say no. My preference would be to say that a name must be a 'thing people say', a pronounceable sequence of characters; and also, with no specific upper bound, reasonably short.

The Problem with Syntax

Let me start by saying that I really don't understand the problem with syntax. Programming language designers spend a lot of time worrying about it, but I believe they're simply missing the point. People say 'I can't learn LISP because I couldn't cope with all the brackets'. People - the Dylan team, for one - have developed systems which put a skin of 'normal' (i.e., ALGOL-like) syntax on top of LISP. I personally won't learn Python because I don't trust a language where white space is significant. But in admitting that prejudice I'm admitting to a mistake which most software people make.

We treat code as if it wasn't data. We treat code as if it were different, special. This is the mistake made by the LISP2 brigade, when they gave their LISPs (ultimately including Common LISP) separate namespaces, one for 'code' and one for 'data'. It's a fundamental mistake, a mistake which fundamentally limits our ability to even think about software.

What do I mean by this?

Suppose I ask my computer to store pi, 3.14159265358979. Do I imagine that somewhere deep within the machine there is a bitmap representation of the characters? No, of course I don't. Do I imagine there's a vector starting with the bytes 51 46 49 52 49 53 57 ...? Well, of course, there might be, but I hope there isn't because it would be horribly inefficient. No, I hope and expect there's an IEEE 754 binary encoding of the form 0100000000001001001000011111...10. But actually, frankly, I don't know, and I don't care, provided that it is stored and that it can be computed with.
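You can in fact ask the machine to show you exactly what it stores; in Java, for instance, Double.doubleToLongBits exposes the IEEE 754 encoding of a double:

```java
public class PiBits {
    public static void main(String[] args) {
        double pi = 3.14159265358979;

        // The 64-bit IEEE 754 pattern the runtime actually stores
        // (toBinaryString drops the leading zero sign bit):
        long bits = Double.doubleToLongBits(pi);
        System.out.println(Long.toBinaryString(bits));
        System.out.println(Long.toHexString(bits));

        // ...which is not what a human wants to read.
        System.out.println(pi); // 3.14159265358979
    }
}
```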

However, as to what happens if I then ask my computer to show me the value it has stored, I do know and I do care. I expect it to show me the character string '3.14159265358979' (although I will accept a small amount of rounding error, and I might want it to be truncated to a certain number of significant figures). The point is, I expect the computer to reflect the value I have stored back to me in a form which it is convenient for me to read, and, of course, it can.

We don't, however, expect the computer to be able to reflect back an executable for us in a convenient form, and that is in itself a curious thing. If we load, for example, the UNIX command 'ls' into a text editor, we don't see the source code. We see instead, the raw internal format. And the amazing thing is that we tolerate this.

It isn't even that hard to write a 'decompiler' which can take a binary and reflect back source code in a usable form. Here, for example, is a method I wrote:

    /**
     * Return my action: a method, to allow for specialisation. Note: this
     * method was formerly 'getAction()'; it has been renamed to disambiguate
     * it from 'action' in the sense of ActionWidgets, etc.
     */
    public String getNextActionURL( Context context ) throws Exception
    {
        String nextaction = null;

        HttpServletRequest request =
            (HttpServletRequest) context.get( REQUESTMAGICTOKEN );

        if ( request != null )
        {
            StringBuffer myURL = request.getRequestURL(  );

            if ( action == null )
            {
                nextaction = myURL.toString(  );

                // If I have no action, default my action
                // to recall myself
            }
            else
            {
                nextaction =
                    new URL( new URL( myURL.toString(  ) ), action ).toString(  );

                // convert my action into a fully
                // qualified URL in the context of my
                // own
            }
        }
        else
        { // should not happen!
            throw new ServletException( "No request?" );
        }

        return nextaction;
    }

and here is the result of 'decompiling' that method with an
open-source Java decompiler, jreversepro:

    public String getNextActionURL(Context context)
                throws Exception
    {
         Object object = null;
         HttpServletRequest httpservletrequest = 
              (HttpServletRequest)context.get( "servlet_request");
         String string;
         if (httpservletrequest != null) {
              StringBuffer stringbuffer = httpservletrequest.getRequestURL();
              if (action == null)
                   string = stringbuffer.toString();
              else
                   string = new URL(new URL(stringbuffer.toString()) ,
                                    action).toString();
         }
         else
              throw new ServletException("No request?");

         return (string);
    }

As you can see, the comments have been lost and some variable names
have changed, but the code is essentially the same and is perfectly
readable. And this is with an internal form which was not designed
with decompilation in mind. Had decompilation been designed for from
the start, the binary could have contained pointers to the variable
names and comments. Historically we haven't done this, both for
'intellectual property' reasons and because of store poverty. In
future, we can and will.

Again, like so much in software, this isn't actually new. The microcomputer BASICs of the seventies and eighties 'tokenised' the source input by the user. This tokenisation was not, of course, compilation, but it was analogous to it. The internal form of the program that was stored was much terser than the representation the user typed. But when the user asked to list the program, it was expanded into its original form.
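The idea can be sketched in a few lines of Java (a toy illustration; the token values here are made up, not those of any real BASIC): keywords are stored as single bytes, and expanded again when the program is listed:

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;

public class BasicTokens {
    // Toy token table: keyword -> single byte (values are illustrative only)
    static final Map<String, Integer> ENCODE = Map.of("PRINT", 0x99, "GOTO", 0x89);

    // Store each keyword as one byte; pass everything else through as ASCII
    static byte[] tokenise(String line) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String word : line.split(" ")) {
            if (out.size() > 0) out.write(' ');
            Integer token = ENCODE.get(word);
            if (token != null) out.write(token);
            else for (char c : word.toCharArray()) out.write(c);
        }
        return out.toByteArray();
    }

    // 'LIST': expand the terse internal form back to what the user typed
    static String list(byte[] stored) {
        StringBuilder out = new StringBuilder();
        for (byte b : stored) {
            int v = b & 0xFF;
            String keyword = null;
            for (var e : ENCODE.entrySet())
                if (e.getValue() == v) keyword = e.getKey();
            out.append(keyword != null ? keyword : String.valueOf((char) v));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] stored = tokenise("PRINT \"HELLO\"");
        System.out.println(stored.length);  // 9 bytes for a 13-character line
        System.out.println(list(stored));   // PRINT "HELLO"
    }
}
```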

Compilation - even compilation into the language of a virtual machine - is much more sophisticated than tokenising, of course. Optimisation means that many source constructs may map onto one object construct, and even that one source construct may in different circumstances map onto many object constructs. Nevertheless it is not impossible - nor even hugely difficult - to decompile object code back into readable, understandable and editable source.

But Java syntax is merely a format. When I type a date into a computer, say '05-02-2006', and ask it to reflect that date back to me, I expect it to reflect back '05-02-2006'. But I expect it to be able to reflect back to an American '02-05-2006', and to either of us 'Sunday 5th February 2006' as well. I don't expect the input format to dictate the output format. I expect the output format to reflect the needs and expectations of the person to whom it is displayed.
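In Java, for instance, one stored value can be reflected back in all of these surface forms; the input pattern places no constraint whatever on the output pattern:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class DateReflection {
    public static void main(String[] args) {
        // One internal value, parsed from one input format...
        LocalDate d = LocalDate.parse("05-02-2006",
                DateTimeFormatter.ofPattern("dd-MM-yyyy"));

        // ...reflected back in whatever form suits the reader
        System.out.println(d.format(DateTimeFormatter.ofPattern("dd-MM-yyyy"))); // 05-02-2006
        System.out.println(d.format(DateTimeFormatter.ofPattern("MM-dd-yyyy"))); // 02-05-2006
        System.out.println(d.format(
                DateTimeFormatter.ofPattern("EEEE d MMMM yyyy", Locale.UK)));    // Sunday 5 February 2006
    }
}
```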

To summarise, again.

Code is data. The internal representation of data is Don't Know, Don't Care. The output format of data is not constrained by the input format; it should suit the use to which it is to be put, the person to whom it is to be displayed.

Thus if the person to whom my Java code is reflected back is a LISP programmer, it should be reflected back in idiomatic LISP syntax; if a Python programmer, in idiomatic Python syntax. Let us not, for goodness sake, get hung up about syntax; syntax is frosting on the top. What's important is that the programmer editing the code should edit something which is clearly understandable to him or her.

This has, of course, a corollary. In InterLISP, one didn't edit files 'out of core' with a text editor. One edited the source code of functions as S-expressions, in core, with a structure editor. The canonical form of the function was therefore the S-expression structure, and not the printed representation of it. If a piece of code - a piece of executable binary, or rather, of executable DKDC - can be reflected back to users with a variety of different syntactic frostings, none of these can be canonical. The canonical form of the code, which must be stored in version control systems or their equivalent, is the DKDC itself; and to that extent we do care and do need to know, at least to the extent that we need to know that the surface frosting can again be applied systematically to the recovered content of the archive.

If God does not write LISP

I started my professional life writing LISP on Xerox 1108s and, later, 1186s - Dandelions and Daybreaks, if you prefer names to numbers. When I wanted to multiply two numbers, I multiplied two numbers. I didn't make sure that the result wouldn't overflow some arbitrary store size first. When a function I wrote broke, I edited its structure in place on the stack, and continued the computation. I didn't abort the computation, find a source file (source file? How crude and primitive), load it into a text editor, edit the text, save it, check for syntax errors, compile it, load the new binary, and restart the computation. That was more than twenty years ago. It is truly remarkable how software development environments have failed to advance - have actually gone backwards - in that time.

LISP's problem is that it dared to try to behave as though it were a post-scarcity language too soon. The big LISP machines - not just the Xerox machines, the LMI, Symbolics, Ti Explorer machines - were vastly too expensive. My Daybreak had 8Mb of core and 80Mb of disk when PCs usually didn't even have the full 640Kb. They were out-competed by UNIX boxes from Sun and Apollo, which delivered less good software development environments but at a much lower cost. They paid the price for coming too early: they died. And programmers have been paying the price for their failure ever since.

But you only have to look at a fern moss, a frond of bracken, an elm sapling, the water curling over the lip of a waterfall, to know that if God does not write LISP He writes some language so similar to LISP as to make no difference. DNA encodes recursive functions; turbulent fluids move in patterns formed by recursion, whorls within whorls within whorls.

The internal structure, then, of the post scarcity language is rather lisp-like. Don't get hung up on that! Remember that syntax isn't language, that the syntax you see need not be the syntax I see. What I mean by saying the language is lisp-like is that its fundamental operation is recursion, that things can easily be arranged into arbitrary structures, that new types of structure can be created on the fly, that new code (code is just data, after all) can be created and executed on the fly, that there is no primacy of the structures and the code created by the programmer over the structures and code created by the running system; that new code can be loaded and linked seamlessly into a running system at any time. That instead of little discrete programs doing little discrete specialised things in separate data spaces each with its own special internal format and internal structures, the whole data space of all the data available to the machine (including, of course, all the code owned by the machine) exists in a single, complex, messy, powerful pool. That a process doesn't have to make a special arrangement, use a special protocol, to talk to another process or to exchange data with it.

In that pool, the internal storage representation of data objects is DKDC. We neither have nor need to have access to it. It may well change over time without application layer programs even being aware or needing to be aware of the change, certainly without them being recompiled.

The things we can store in the dataspace include:


  • integers of any size
  • reals to any appropriate degree of precision
  • rationals, complex numbers, and other things we might want to compute with
  • dates, times, and other such useful things
  • things people say of any extent, from names to novels
  • lists of any extent, branching or not, circular or not
  • slots: associations of names with some setter and, perhaps, getter knowledge which determines what values can be stored under that name
  • namespaces: collections, extensible or not, of slots
  • regularities: collections of namespaces each of which shares identical names
  • homogeneities: collections of namespaces each of which shares identical slots
  • functions: all executable things are 'functions' in a lispy sense. They are applied to arguments and return values. They may or may not have internal expectations as to the value type of those arguments.
  • processes: I don't yet have a good feeling for what a post-scarcity process looks like, at top level. It may simply be a thread executing a function; I don't know. I don't know whether there needs to be one specially privileged executive process.


Things which we no longer store - which we no longer store because they no longer have any utility - include


  • shorts, longs, doubles, etc.: specific internal representation types. You saw that coming.
  • tables, and with them, relational databases and relational database management systems: no longer needed because the pool is itself persistent (although achieving the efficiency of data access that mature RDBMS give us may be a challenge).
  • files: You didn't see that coming?


Files are the most stupid, arbitrary way to store data. Again, with a persistent data pool, they cease to have any purpose. Post scarcity, there are no files and there is no filesystem. There's no distinction between in core and out of core. Or rather, if there are files and a filesystem, if there is a distinction between in core and out of core, that distinction falls under the doctrine of DKDC: we don't know about it, and we don't care about it. When something in the pool wants to use or refer to another something, then that other something is available in the pool. Whether it was there all along, or whether it was suddenly brought in from somewhere outside by the runtime system, we neither know nor care. If things in the pool which haven't been looked at for a long time are sent to sulk elsewhere by the runtime system that is equally uninteresting. Things which are not referenced at all, of course, may be quietly dropped by the runtime system in the course of normal garbage collection.

One of the things we've overloaded onto the filesystem is security. In core, in modern systems, each process guards its own pool of store jealously, allowing other processes to share data with it only through special channels and protocols, even if the two processes are run by the same user identity with the same privilege. That's ridiculous. Out of core, data is stored in files often with inscrutable internal format, each with its own permissions and access control list.

It doesn't need to be that way. Each primitive data item in core - each integer, each list node, each slot, each namespace - can have its own access control mechanism. Processes, as such, will never 'own' data items, and will certainly never 'own' chunks of store - at the application layer, even the concept of a chunk of store will be invisible. A process can share a data item it has just created simply by setting an appropriate access policy on it, and programmers will be encouraged normally to be as liberal in this sharing as security allows. So the slot Salary of the namespace Simon might be visible only to the user Simon and the role Payroll, but that wouldn't stop anyone else looking at the slot Phone number of the same namespace.
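As a sketch of how that might look (hypothetical names throughout; this illustrates the idea, not any real system), each slot can carry its own set of principals permitted to read it:

```java
import java.util.Set;

public class SlotDemo {
    // Hypothetical: a slot is a named value carrying its own access policy.
    // An empty reader set here means 'visible to everyone'.
    record Slot(String name, Object value, Set<String> readers) {
        boolean readableBy(String principal) {
            return readers.isEmpty() || readers.contains(principal);
        }
    }

    public static void main(String[] args) {
        Slot salary = new Slot("Salary", 40000, Set.of("Simon", "Payroll"));
        Slot phone  = new Slot("Phone number", "01234 567890", Set.of());

        System.out.println(salary.readableBy("Simon"));   // true
        System.out.println(salary.readableBy("Anyone"));  // false
        System.out.println(phone.readableBy("Anyone"));   // true
    }
}
```

The point of the sketch is only that the policy lives on the data item itself, not on a process, a file, or a chunk of store.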

Welcome, then, to post scarcity computing. It may not look much like what you're used to, but if it doesn't it's because you've grown up with scarcity, and ever since we left scarcity behind you've been living with software designed by people who grew up with scarcity, who still hoard when there's no need, who don't understand how to use wealth. It's a richer world, a world without arbitrary restrictions. If it looks a lot like Alan Kay (and friends)'s Croquet, that's because Alan Kay has been going down the right path for a long time.

Saturday, 18 December 2004

A Journey to the turning of the year


Have you ever considered how nice it must be to live in Iceland? I mean, apart from the spectacular scenery and the friendly people. Just think, if I lived in Iceland I could have lounged in bed this morning. I could have slept in till the back of eleven, got up, had a leisurely breakfast, cycled round the block, and come home for a well earned bath in free geothermal hot water with the satisfaction of something significant achieved.


Unfortunately I don't.

I mean, the idea of cycling from sunrise until sunset is the sort of thing which sounds like a cool idea in the balmy days of September. It was a cool idea. Indeed, in parts, it was a shockingly cold idea, but I get ahead of myself. Back in September I had the idea of cycling from sunrise to sunset, and if you're going to cycle from sunrise to sunset the sensible time to do it is on the shortest day of the year. OK, so today wasn't quite the shortest day of the year, but let's not sweat the small stuff.

It wasn't my intention to do this on my own. Indeed, having announced it to my club back in September, I sent an email to the club's mailing list last week:

I'm looking for some very, very stupid people.


I'm looking for some very stupid people because, primarily, I'm even more stupid myself: I'm planning to go for a bike ride on Saturday. From the moment the sun comes up, to the moment the sun  sets. That's 8.43 am to 3.41 pm. It is going to be cold. It is going to be tough. It is going to be a long day. If you're really, really stupid, please come with me...


Surprisingly, I had a volunteer. Unsurprisingly, it wasn't for the whole distance. So when I arrived at the appointed meeting place in Castle Douglas at half past eight this morning I wasn't hugely surprised to find nobody there. I waited around for ten minutes in the cold and the rain, and then, knowing no-one else was coming, set off.

It was fairly light and growing lighter fast, which was just as well because within half a mile my headlight fell off and smashed (it was a cheap old one, so no huge loss - I hadn't taken my lumicycles on the grounds of weight). Within two miles the rain had cleared, and I was cycling along at a nice easy pace, crossing the Dee for the first time at Glenlochar. By Laurieston I was warmed up enough to stop and swap my big padded winter gloves for track mitts. Down the shores of Woodhall it was a really beautiful morning, and just past Mossdale there was the most superb complete rainbow, spanning the landscape from horizon to horizon. Of course, a rainbow meant another shower, but once again it was light, thin, not very wetting, and soon past. And at fourteen miles out along the shore of Loch Ken I met Chris coming the other way to meet me.

This was a slightly mixed blessing. He was extremely good company, and we enjoyed pleasant conversation, but he was also significantly quicker than me up hill - and, indeed, the steeper the climb the greater the difference. I plead in mitigation that he had sensible hill-climbing gears on his bike, and I, errm, didn't.  But despite the fact that the route took us from below the 50 metre contour to above the 250 metre, the climb is on the whole gradual - with a few short, sharp shocks. At New Galloway we went straight on out by the kirk for the first of those short, sharp shocks, and thence up the west side of the river to the Earlstoun Loch dam for the second. The high hills were white with snow - the whole ridge of the Rhinns of Kells looked properly arctic, and the Cairnsmore of Carsphairn was a great white spike pointed at the sky. And thus to the long, slow, gruelling climb up to Carsphairn. But we reached Carsphairn much earlier than I had expected, and went straight on through, heading for what had been my personal goal - the Green Well of Scotland, allegedly the last place in Britain where pagan religion was openly practised, as late as the eighteenth century. We got there, and were going well, and were still ahead of schedule, so we headed on up towards the watershed. By now there was a little bit of snow down to the roadside - not a lot, but enough to make it bitterly cold. And by about 11:30 we got to the point where we were clicking up onto our big rings as the climb levelled out.

At the Ayrshire border we turned and blasted back down towards Carsphairn. Climbing, we'd had a north wind against us which hadn't felt strong enough to be much nuisance, but now with both wind and gradient helping we made exceedingly good speed, and were down into Carsphairn again about twelve. Carsphairn is not, let's face it, the world's most bustling metropolis, but it does boast a bar with a large sign inscribed 'meals served all day'. The sign lies. Fortunately - and remarkably for a place so small  - Carsphairn also boasts a tiny shop, which sold us rolls and polystyrene beakers of instant soup. We drank these sitting on a bench at the roadside; but we didn't sit for long, because if cycling in these conditions was cold, just sitting was colder.

Heading south we took the Moniaive road down the East bank of the river. The weather was getting decidedly colder, and Chris stopped to put his warmer gloves on. This struck me as a good idea, and I put mine on, too. Shortly we came to the junction where the Dalry road splits off, and Chris had planned to go home down this. I had sort of planned to cross the watershed down to Lochinvar and thus down the Urr, but neither of us were particularly keen to be cycling alone on those lonely upland roads, so I turned right with Chris.

And within a couple of miles we got a sharp lesson on why it's not clever to cycle them alone. The High Bridge of Ken is a narrow stone bridge, about three metres wide between its high stone parapets, and about fifty metres long. It sits at the bottom of a steep-sided east-west glen, with a sharp turn onto it and a sharp turn off it. Steep-sided glen, high parapets, very cold day: you're ahead of me, aren't you? At the same time, faced with a nice swoopy descent onto the bridge and a nice tight turn off it, what would you have done?

It was just as I cranked the bike over into the turn off the bridge at about twenty five miles an hour that both tyres let go, suddenly, together, and I had that awful moment of knowing.

Oh, shit, this is going to hurt - a lot.

Curiously, it helped that I was cranking into the bend. The back wheel tried to overtake the front, spinning the bike around to about 45 degrees to its direction of travel, and long after I thought I was at the point of no return got enough grip to bring me back towards upright. I steered into the skid and got the bike under control again, but for the next several miles I felt decidedly shaky and took it a lot slower. Which was a shame because we were dropping down through a series of deliciously swoopy back roads towards Earlstoun.

On one of these - which would have been a stiffish climb the other way - Chris stopped to show me a little roadside memorial, nicely kept with flowers:

'In memory of Johnny Stirling, who died here while cycling in Bonny Galloway'


Looking at the hill, one could see how one might; but looking out over the glen with the lochs in the bottom and the high snow covered hills on the far side, it felt as though it would not be a bad way - or a bad place - to go.

And thus down to Earlstoun, and into St John's Town of Dalry, and to Chris's house, where I stopped for coffee. I left at three o'clock, and considered my onward route. It's 14 miles down the A713 into Castle Douglas, and I had been averaging 12 miles per hour. I was due to finish at 3:41. Chris advised me against riding down the A713 on the basis that it's busy; but busy is relative and busy by Galloway standards is not busy as understood elsewhere, and by Galloway standards the A713 down Loch Ken is relatively flat. Also, I had come up the west side of Loch Ken, so going back down the west side didn't feel particularly interesting. So I started off down the A713 thinking I might cut across the watershed into the Urr valley later. However, when I reached the junction at Balmaclellan, it said Corsock 9 miles, and I knew those were nine pretty hilly miles. I didn't feel like it. I cycled on down Loch Ken, past the sailing centre, past the viaduct, through Parton, down through Crossmichael.

By now I was into the home stretch, with only a few miles to go. But I was also feeling it. There were a couple of little detours I could make to add a few miles to the route and get me closer to the magic 3:41, but I didn't take them partly because my legs didn't want to and partly because, as my speed was dropping off, it was beginning to look as if I wouldn't need to. At some point - way later than I should have - I realised I was just running out of blood sugar to burn, and stopped to switch on my lights and get a cereal bar out of my bag.

There's a state you get into (or at least, I get into) where you are just cycling, not doing anything else. I remember watching the trip click up to 63.59 miles, expecting it to change to 64.00 and being completely bewildered when it instead went to 63.60 and then to 63.61; I was so chilled and tired I was confusing miles with minutes. But miles and minutes both rolled on and very soon I was passing under the bypass, onto urban streets, track-standing in the congested traffic of King Street as motorists jostled for parking spaces, getting off the bike stumbling tired and practically staggering into the bike shop, to be greeted with hot sweet tea and a compulsory mince pie and scone. Which were most welcome.

OK, so I finished all of eight minutes early. So sue me. Total distance, just over 65 miles by my computer, or, in morale-boosting metric speak, 104 Km. Total time actually cycling, about six hours. And, despite my whingeing, I enjoyed it, and I'm glad I did it.

And thus back home to a bath with water heated with very expensive oil. I suppose I'd better get on the phone to the Icelandic consulate and talk to them about emigration...

Thursday, 16 December 2004

Let's hear it for the Mullwarchar!


Radio 4's 'Today' programme has been asking for nominations for a 'listeners' peer', and I've been listening with half an ear to the suggestions. And what I've been hearing is more of the same old same old; the soi-disant great and good, and, more particularly, the metropolitan great and good. So I thought I'd make a nomination completely outside the London box.


The Mullwarchar, admittedly, doesn't say a lot. The Mullwarchar is notoriously neither clubbable nor friendly; not a particularly sociable being. But the Mullwarchar has made a great contribution to our public life, taking a leading role in the campaign against the dumping of nuclear materials and a number of other environmental campaigns. The Mullwarchar has also made a significant contribution to leisure activities and to appreciation of wilderness, and thus to the spiritual life of the nation.

But the most important reason for nominating the Mullwarchar is this: this mountain will not come to Mahomet. The House of Lords is composed entirely of urban people, of people not merely prepared but happy to spend their working lives in the most crowded, the most polluted, the most unpleasant place on the island of Britain. Such people are by definition abnormal and unrepresentative.

It would do our parliamentarians good once a year to go to the mountain: to lift up their collective eyes to the hills, to be in a place where man and all his works are utterly insignificant. To get some sense of scale.

And perhaps, on their way into the wilderness and on their way out again, they would have the opportunity to pass through places where the people of Britain - the people they make the laws for - actually live.

So let's hear it for the Mullwarchar: certainly the most noble, unquestionably the most ancient, without doubt the most puissant lord ever nominated to the House. And very probably the wisest.


Monday, 13 December 2004

Spectacle and courage


In trying to write a concise review of the extended edition of Peter Jackson's adaptation of The Return of the King, one is faced with three different topics each worthy of consideration. The first is this cut of The Return of the King as a movie; the second is the package with its appendices; the third is the total achievement of the whole project, which this set completes. It's going to be very hard to do justice to all three in just a thousand words.


The Movie



So firstly: The Return of the King, or more precisely this cut, as a movie. Consistently Peter Jackson's extended cuts have been, in my opinion, better movies as movies than the 'theatrical' cuts. There's a lot of new material here - not just extending scenes, but many scenes which were left out of the theatrical cut altogether, which add to characterisation, pacing and story telling.

So: the movie. It does not, of course, religiously follow Tolkien's text - nor could it. On the whole, however, it is reasonably true to the overall themes of Tolkien's text. The story-telling here is fine, and is worked on with great care. The acting, too, is fine. Among so many very fine performances, in this movie I particularly admired Billy Boyd's Pippin, Miranda Otto's Eowyn, Bernard Hill's Theoden. This is, however, very much an ensemble production. The general level of acting is high. People put their all into making this.

And not just into the acting. The costumes are spectacularly gorgeous, the sets spectacular and very largely believable, the scenery very much in keeping. In particular the presentation of the city of Minas Tirith is a tour de force, achieved by actually building quite a substantial part of the city at full scale.

But not all of what you see is real. What is particularly impressive in the CGI in this film (and there's a great deal of it) is the extent to which one simply does not notice it. Gollum, for example, is just there. The fell beasts which the Nazgul ride, and the 'great beasts' which draw Grond, are similarly so seamlessly in the piece that it is hard to believe they weren't there on the set when the camera rolled. With a critical eye you can see the CGI work in the great horse charge, and when the Rohirrim fight the Haradrim on their mumakil - but it isn't sufficiently obvious to be distracting. Indeed the one location in this film which seemed to me 'obviously' CGI - the Hall of Denethor, which seemed to me to have that hyper-reality that comes of ray-tracing - turned out to be a real (but beautifully constructed) set.

Finally, the score and sound design are again excellent.

In summary, this is a beautiful looking movie, telling a classic story and telling it well.

The package



Then the package. The Extended Edition pack comes with two disks of 'appendices', just as the extended editions of The Fellowship of the Ring and The Two Towers did; and they follow very much in the format already established in the earlier appendices, a series of documentary pieces about the background to the story and the making of the film. They don't strike me with the force that the earlier appendices did, but that is not, I think, because these are less good, simply because the format has been established and has lost its freshness. The fact remains that this is not space-filler material; for me, the 'appendices' disks of the Lord of the Rings extended editions set the standards by which all other DVD extra content is judged.

And in this case, you don't just get four disks, you get five. The fifth is about turning the film score into a symphony. Frankly, for me, that was less value for money; it didn't really work either as documentary (too much of it was simply the music) or as music (too often interrupted with commentary). But seeing as it's a thrown-in extra, I wasn't disappointed.

The achievement



So, finally, the whole achievement. The scale and ambition of this project are staggering. Tolkien justifiably thought the Lord of the Rings unfilmable; Jackson has filmed the unfilmable and done it well. I don't quite think it's a masterpiece, but it is a very fine work of craftsmanship, with a coherent vision which produces a believable world.

Why not a masterpiece? Well, some aspects of the plot were clumsily handled. Jackson never really knew what to do with the character of Arwen, for example; and a number of the plot decisions in The Two Towers particularly just don't seem to make any sense (why drop the Grey Company and then import a whole bunch of Lothlorien elves? Why?). Part of this, of course, is a consequence of the need to cut the story into three chunks in order to be manageably marketable. I suspect that one of these days someone - perhaps even Jackson - will reshape this material into a single twelve-hour-or-more movie which will correct some of the plot difficulties. But even so it will be flawed, because the plot really wallows around the problem of Arwen.

Finally, there are too many ham bits of movie cliche. I'd be the first to admit that Tolkien himself is rather given to having things that had lasted millennia destroyed as the fellowship passes through. You can forgive Jackson the collapse of the bridge of Khazad Dum, with Gandalf literally doing a cliff-hanger off the end. It's in the book. But to then repeat the same hammy cliche with Frodo dangling over the abyss in Sammath Naur is unforgivable. And why - why? - does the floor of the causeway in Sammath Naur collapse just behind the running feet of our heroes? Because that's the way it's been done in every hammy adventure film you've ever seen, and Jackson is too much in love with the B movie genre to rise above it.

And yet... what one remembers above all is spectacle and courage. The halls of Khazad Dum; the Argonath; Boromir's last fight on the slopes of Amon Hen; Edoras with its Golden Hall; the thunderous might of the Uruk Hai before Helm's Deep; the charge of the Mumakil; Eowyn standing alone against the Witch King of Angmar. What one remembers, despite the minor flaws, is a great piece of story-telling, telling a great story about friendship and courage.

Tuesday, 23 November 2004

A lightweight 100% Java RDBMS


Introduction



IBM have a 100% pure Java relational database management system which  has been called at various stages in its history SQL/J, Cloudscape and Derby. IBM are now eagerly pushing the system to open source developers under the 'Cloudscape' label. I downloaded it to evaluate for use with PRES and other Jacquard applications.

License



I'm used to using (and creating) things which are open source. IBM claims Cloudscape is now 'open source', but if so it's some bizarre new definition of open source which is opaque to me. If you download Cloudscape from IBM you in fact have to click through (and it comes with) a software license file which looks as intimidating and onerous as any conventional software license. In fact what is going on here is that IBM have given a snapshot of the Cloudscape codebase to the Apache foundation, from which you may download it here. The Apache license is much more straightforward and less onerous than the IBM one. The version of Cloudscape you can get from IBM appears to be based on the Apache version, but if you download from Apache you don't get the nice installer. To avoid confusion, I shall refer to the RDBMS throughout this review as 'Cloudscape'. I did not, this morning, find any significant difference in use between the IBM ('Cloudscape') and the Apache ('Derby') versions of the system.

First impressions



You can download Cloudscape from IBM in three different packages: a Linux installer which is huge and includes IBM's Java 1.4.2 for Linux; a Windows version which is similarly huge and includes IBM's Java 1.4.2 for Windows; and a 100% pure Java installer, which is sensibly small (9Mb) and sensibly assumes you wouldn't be interested if you didn't already have a JVM. This was the version I tried.

The pure Java installer (InstallShield) worked very nicely on Linux, offering sensible defaults. By contrast to so many open source projects, it looked very polished. Similarly, the PDF documentation looked very polished, very IBM. However - and this is a common gripe - the page numbering in the PDF was off, because the front matter of the paper document uses a different numbering schema to the body and this different schema is not reflected in the PDF. So, for example, page 132 in the PDF maps onto page 120 of the document, which makes consulting the index or table of contents pretty frustrating. Hey, IBM, this is a small point but very easy to get right. Also of course you can't search the PDFs. What on earth is the point of distributing documentation in a digital format if it can't be searched? And a final gripe on documentation: the documentation index page has a link to online documentation, which I followed in the hope it would lead to searchable documentation. Unfortunately that was '404 not found'. And that, IBM, is simply incompetent.

Fortunately the documentation is available online at Apache: http://incubator.apache.org/derby/manuals/.

Following the instructions in the documentation, I then tried to start the Cloudscape executive, a program called 'ij'. The startup scripts had been automatically created and set up for me with the paths I had chosen for the installation.

But they didn't work.

Well, OK, that needs a bit of amplification. AIX, IBM's own UNIX, uses as its default shell the Korn shell, ksh. Debian Linux, which I use, uses as its default shell the Bourne Again shell, bash. Generally the syntax used by the two shells is so similar that that isn't a problem, but when I tried to invoke the ij script I got a class not found exception:

-[simon]-> /opt/ibm/Cloudscape_10.0/frameworks/NetworkServer/bin/ij.ksh
java.lang.ClassNotFoundException: com.ibm.db2.jcc.DB2Driver

Bizarrely, when I manually executed each of the commands in the scripts in turn, the ij executive started without problem. Clearly there is something in the scripts that bash does not like, but I haven't yet investigated what.

Features



Because of problems with the documentation discussed above, I can't be very definite about missing features; the features I sought may be present but I simply failed to find them in the documentation.

Users, groups and roles



Cloudscape clearly has the concept of a 'user', since it's possible to request the value of the current user; however you don't seem to be able to grant privileges to users, nor to revoke them:

ij> create user simon with password 'xyzzy';
ERROR 42X01: Syntax error: Encountered "user" at line 1, column 8.
ij> grant select on foo to app;
ERROR 42X01: Syntax error: Encountered "grant" at line 1, column 1.

You can pass in a username token and a password in the database URL. User validation is not performed by Cloudscape, but Cloudscape can be configured to co-operate with external validators. In practice, all using a different username appears to do is to select a different default schema.
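As a sketch of what this looks like from JDBC (the database name 'testdb' is invented; the credentials follow the examples above), the username and password are simply appended to the connection URL as attributes:

```java
// Sketch: building an embedded Derby/Cloudscape JDBC URL carrying a
// username and password. The database name and credentials here are
// hypothetical examples, not part of any real installation.
public class ConnectExample {

    // Derby connection attributes are appended after semicolons;
    // 'create=true' creates the database if it does not yet exist.
    static String buildUrl(String db, String user, String password) {
        return "jdbc:derby:" + db + ";create=true"
             + ";user=" + user + ";password=" + password;
    }

    public static void main(String[] args) {
        String url = buildUrl("testdb", "simon", "xyzzy");
        System.out.println(url);
        // With derby.jar on the classpath, the database would be opened with:
        //   java.sql.DriverManager.getConnection(url);
        // As noted above, the only visible effect of the username appears
        // to be the choice of default schema.
    }
}
```
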

The system appears to have no concept of a group or role.

Views



Cloudscape has views but not, it appears, ORDER BY within view definitions:

ij> create view froboz as select ban from foo;
0 rows inserted/updated/deleted
ij> select * from froboz;
BAN
------------
froboz

ij> drop view froboz;
0 rows inserted/updated/deleted
ij> create view froboz as select ban from foo order by ban;
ERROR 42X01: Syntax error: Encountered "order" at line 1, column 43.
ij> select ban from foo order by ban;
BAN
------------
froboz

1 row selected

Constraints and Integrity



Cloudscape appears to have a remarkably full constraint syntax. I haven't verified that the constraints actually work. Provided they do, we can work with these data constraints:

ij> alter table word
        add constraint word_head foreign key (head)
        references word
        on delete set null;
0 rows inserted/updated/deleted

Datatypes



There appears to be no BOOLEAN data type or equivalent, but we can work round this using CHAR(1) and the values 't' and 'f'; there is no MEMO or TEXT datatype, but there is a CLOB. There is a full set of DATE, TIME and TIMESTAMP datatypes; date format is 'yyyy-mm-dd'.
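A minimal sketch of how an application might paper over the missing BOOLEAN type (the class and method names here are mine, not part of Cloudscape or Jacquard):

```java
// Sketch of the CHAR(1) workaround for the missing BOOLEAN type:
// map Java booleans to 't'/'f' on the way into the database, and
// back again on the way out.
public class BooleanColumn {

    // encode a Java boolean as the single character stored in CHAR(1)
    static String encode(boolean b) {
        return b ? "t" : "f";
    }

    // decode the stored CHAR(1) value back to a Java boolean
    static boolean decode(String s) {
        return "t".equals(s);
    }

    public static void main(String[] args) {
        System.out.println(encode(true));   // t
        System.out.println(decode("f"));    // false
    }
}
```
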

Conclusions



Cloudscape's big weakness from my point of view was security. There appears to be no way of setting different access permissions for different users. This means that all security must be in the application layer. Generally Jacquard applications are not built that way; instead, they're built on a database-layer security model. Of course, security isn't always critical, and for many users of a PRES system, for example, HTTP authentication of the admin directory would be sufficient.
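To illustrate, here is the sort of check an application-layer scheme ends up having to make for itself before issuing SQL; the class, users and table names are purely hypothetical:

```java
import java.util.Map;
import java.util.Set;

// Since Cloudscape cannot grant per-user privileges itself, the
// application must hold its own user -> permitted-tables mapping and
// consult it before running any query. This is only a sketch of the
// idea, not any real Jacquard API.
public class AppLayerSecurity {
    private final Map<String, Set<String>> grants;

    AppLayerSecurity(Map<String, Set<String>> grants) {
        this.grants = grants;
    }

    // true if this user has been granted read access to this table
    boolean mayRead(String user, String table) {
        Set<String> tables = grants.get(user);
        return tables != null && tables.contains(table);
    }

    public static void main(String[] args) {
        AppLayerSecurity sec = new AppLayerSecurity(
            Map.of("simon", Set.of("word", "foo")));
        System.out.println(sec.mayRead("simon", "foo"));   // true
        System.out.println(sec.mayRead("guest", "foo"));   // false
    }
}
```
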

On the positive side, the system is very easy to install, reasonably easy to set up, and consumes relatively little in the way of machine resources.

The IBM version (Cloudscape) offered no benefit over the Apache version (Derby). Although Cloudscape comes with a slick and polished installer, what it installed did not actually work out-of-the-box; the documentation was in an inconvenient format which was hard to work with and the license terms were onerous. By contrast the Apache version (Derby) was a smaller download, in practice just as easy to set up and get running, and the Apache documentation, although apparently based on the same source, was constructed in HTML and much easier to use.

There appeared to be little functional difference between the two versions.

Saturday, 20 November 2004

Using, not losing, your head




Cycle helmets are a good thing, aren't they? It's obvious. They protect your head. They must be a good thing: it's common sense. Why then is the cycling community, in the face of proposed mandatory helmet legislation, fighting internecine helmet wars?

Don't panic



Before going into the details of this argument, let's start by putting this into perspective. Cycling is actually a very safe activity. Nothing, of course, is absolutely safe. Last year, in Britain, 114 cyclists were killed. Of those, 95 (83%) died as a result of collisions with motor vehicles. But that's out of millions of cyclists, covering billions of miles. In fact, according to the Office for National Statistics, there is on average one fatal accident for every twenty-one and a half million miles cycled. Twenty-one and a half million. If you were to cycle ten miles every single day, it would be nearly six thousand years before you had a fatal accident.

At the same time as those 114 cyclists died, over three thousand people died from accidents and mishaps in their own homes. Do you think your home is a dangerous place to be?


Of course, in the modern world, there are dangers other than accidents. We live highly stressed lives in which opportunities for exercise get fewer and fewer, and opportunities to eat and drink become more and more available. We get fat. We get unfit. And our health suffers in consequence, with the incidence of illnesses such as obesity, heart disease, osteoporosis and diabetes increasing rapidly. Cycling is a good general exercise both for the cardiovascular system and for the limbs. Unlike walking, jogging or running, the movement is smooth and so does not cause impact damage to the ankles, knees and hips. Yes, there is a finite risk of accident when cycling but it is nevertheless undoubted that if you cycle regularly not only are you likely to live longer but you're more likely to enjoy a fit, active and healthy old age.

Got that? Good. Now let's talk about helmets.

Use no hooks: or, A box for a computer



In the more tragic and more bloody wars of the Democratic Republic of the Congo, many warriors wear or carry lucky charms which they believe will protect them against bullets. We sophisticated westerners read stories of this and we think 'how quaint, and sad, and ignorant, are these uneducated child soldiers going into battle, believing superstitiously in the protection of lucky charms'. And then we cycle off into the traffic, wearing our cycle helmets.

This note was written as a web page. If you're reading it on a web page, you're reading it on a computer. I'd like you to stop for a moment and think about that computer. When it arrived from its maker - possibly when you bought it - it was packed in a strong cardboard box. Inside the strong cardboard box was almost certainly some polystyrene foam packaging material. Probably at least 40mm of it, surrounding and protecting your computer from the inevitable bumps it would incur in transit - bumps like being dropped from someone's hands onto the warehouse floor, or thumped up against another, similarly packaged computer.

By and large, for these sorts of bumps, the packaging works, and your computer probably arrived home safe and sound.

Now think about your bicycle helmet. Like the packaging your computer came in, it is worn to protect a very valuable object - your brain. Like the packaging your computer came in, it is made of polystyrene foam - and typically it's a good bit less than 40mm thick.

Putting the boot in



I would like you to stop again, and think about the box your computer came in. I'd like you, as a thought experiment, to imagine taking your computer, putting it back in its original box, and placing the box in the middle of the street. Now I want you to imagine getting into a car and driving into the box at just thirty miles an hour. You've imagined that? Good. Now do you think you would be able to use the computer afterwards?

Polystyrene foam is just polystyrene foam. Polystyrene foam is a light, weak, compressible solid which rapidly becomes brittle with age and is easily damaged by solvents. It doesn't become magically stronger just because it's formed into a cycle helmet. The same foam that didn't protect the computer in the thought experiment is equally not going to protect your head in similar circumstances.

Ticking the box



Nor do the manufacturers, nor the standards writers, believe it should. The European test for cycle helmets involves dropping the helmet, containing a dummy head weighing not more than 6 kg, onto a flat surface from a height of 1.5 metres. I don't know about you, but I'm 1.88 metres tall and I weigh 82 kg. If I just fall over from standing upright, I already exceed the impact which cycle helmets sold in Europe are tested to protect against - and exceed it by a very substantial margin. And that's before I've even got on my bicycle and started moving.

In practice, cycle helmets are expected to be helpful in accidents up to about 15mph (24 km/h). You might (common sense) expect a 30mph impact to be only twice as bad as a 15mph impact, and you might think that something which offered reasonable protection at 15mph would offer some degree of protection at 30mph. Unfortunately, it doesn't work like that. Firstly, the energy of the impact scales with the square of the speed, so your 30mph impact is four times, not twice, as severe as your 15mph one. But secondly, and even more scarily, it is widely accepted that the probability of injury scales with the fourth power of the speed. So your 30mph impact is sixteen times as likely to cause injury as your 15mph impact.
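The arithmetic is easy to check (a throwaway sketch; the square and fourth-power scalings are the ones quoted above):

```java
// Check of the scaling arithmetic: impact energy scales with the square
// of speed, and (on the widely cited rule of thumb) injury probability
// with the fourth power, so doubling speed from 15mph to 30mph gives
// 4x the energy and 16x the injury probability.
public class ImpactScaling {

    // kinetic energy is proportional to v^2
    static double energyRatio(double v1, double v2) {
        return Math.pow(v2 / v1, 2);
    }

    // injury probability is (roughly) proportional to v^4
    static double injuryRatio(double v1, double v2) {
        return Math.pow(v2 / v1, 4);
    }

    public static void main(String[] args) {
        System.out.println(energyRatio(15, 30));  // 4.0
        System.out.println(injuryRatio(15, 30));  // 16.0
    }
}
```
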

And that's before you consider what happens to polystyrene foam when its design load is exceeded. It snaps. It suffers 'brittle failure'. You can do this experiment quite easily with the foam packing your computer came in. Take a piece of the foam about as long as your helmet, and about as thick as your helmet. Try to crush it between your finger and thumb. It's surprisingly strong, isn't it? You can squeeze it very hard and it doesn't deform a lot. Polystyrene foam is quite strong in compression, that's why it is used. Now take your piece of foam and snap it between your two hands. That's amazingly easy, isn't it? It takes far less force than crushing it does... which means it has absorbed far less energy. When a helmet breaks, it offers no further protection. The more an impact exceeds the helmet's design parameters, the more likely it is to break, and the less likely it is to offer any protection.

You saw the whole of the moon



But let's step back a bit. Let's suppose, for the moment, a helmet provides 100% protection for the part of the body it covers. Because, let's face it, the part of the body a cycle helmet protects is the scalp. What happens to the rest of the body in a 30mph, or in a 60mph accident? Is it really going to be much comfort to your grieving relatives to learn that your hair-do survived OK? Do you believe that because your scalp is protected, your neck and your chest will be protected, too? Or, if not, that your magically preserved brain can be magically plugged into a new heart and lungs? Of course you don't. And of course you know that an impact which has enough force to do severe damage to your skull is likely to do severe damage to other vital systems too. In thinking about protection it is no use protecting one part. It's not enough to see the crescent: you have to look at the whole of the moon.

He's dead, Jim



But it's worse than that. Not only do helmets not provide adequate protection in road speed accidents: they may actually make things worse. In fact they must do so, because in whole populations, as helmet wearing rises, so does the rate of cyclist deaths. Yes, you read that right: the more cyclists wear helmets, the more get killed. I don't know why. No-one knows why. Two main mechanisms have been suggested: 'risk compensation', the willingness of people to do more risky things when they believe themselves protected, and rotational injury.

The fact that people do do riskier things when they think they're protected is to some extent obvious. Indeed, Bell cycle helmets have been sold with the slogan 'Courage for the Head'. Could cyclists really be using up all of the safety benefit that helmets provide by taking more risks? It's possible. Could drivers, thinking helmeted cyclists are protected, take more risks around them? That's possible too.

But the more worrying possibility is this: wearing a helmet makes your head bigger. It increases the diameter by about 50%, which means it increases the area by about 125%. Now, is it easier to hit a target if it's more than twice as big? You bet it is. Your head being effectively bigger means that it's more likely to get hit; but again it may be still worse than this. Because we have evolved over millions of years of falling to tuck our heads in. We have reflexes which know - without our thinking about it - just how far we need to tuck our heads in to avoid an impact. By making the head bigger we may possibly defeat that instinctive protection mechanism. And it gets worse: larger diameter means more leverage, more angular acceleration. It has been suggested - and so far this is no more than a suggestion - that helmet wearing may increase rotational injuries to the brain. Rotational acceleration tears brain tissue and causes much more severe brain damage than linear accelerations of the same magnitude.

So not only do helmets make you (very slightly) more likely to get injured; they may also - but this is not proven - significantly increase your risk of the most frightening sort of injury, brain damage.

So: it's junk, then?



Does all this mean you shouldn't wear a helmet? Not in my opinion, no. I have a helmet, a MET Parachute, and I do wear it. When I think it will do some good.

Accidents of the sort cycle helmets won't help with - high speed impacts with something solid - are, fortunately, incredibly rare. When people fall off bicycles, they mostly do so at low speed and very often on tricky, off-road tracks. On tricky, off-road tracks you're very rarely travelling at very high speed and what you hit usually isn't moving at all. Indeed, you mostly fall off in the trickiest sections (or at least, I do) and that's when you're going slowest. Of course, such a fall is unlikely to kill you, but it can leave you with nasty bruising, grazes or even concussion. And a cycle helmet will protect you from bruising and grazing on the part of the body it covers, and may help a bit with concussion, too. So I wear my helmet when I'm doing tricky off-road stuff, particularly if I haven't ridden the particular route before. I should say here that although I've fallen off mountain bikes by now literally thousands of times, I've never hit my head at all - it's almost always my hips and elbows that get it. So even on a mountain bike a helmet isn't essential, and I often don't wear one.

And I don't wear one on the road. Ever. There really isn't any point. I haven't fallen off a bike on the road since I was sixteen, and that's thirty-three years ago. I'm an experienced road rider, and I ride with good awareness of traffic; I know how to protect myself from many of the ways motorists can kill you. Of course I can't protect myself against a motorist who is driving too fast and genuinely doesn't see me, but in that case I do not believe a helmet offers any useful protection. Indeed, on the basis of the available statistics and the simple physics I've described above, I know it cannot.

So cycle helmets are not junk. They are genuinely useful under some circumstances. But pretending they can save your life in traffic accidents is at best mistaken and at worst dishonest. To be fair, helmet makers do not pretend this; but there are still ignorant or misguided people who do - indeed, the opinion that it isn't safe to cycle on the road without one is very common. This common misapprehension is what leads to occasional campaigns for the wearing of cycle helmets to be made compulsory, by law. It's to counter this misapprehension that I've written this article.

Thursday, 18 November 2004

Lies, damned lies, and cycle helmets


I've just been moved to write to the British Medical Association, a thing which doesn't often happen. The BMA had a critical role to play in the recent campaign to make cycle helmets compulsory in the United Kingdom; they have long had a well thought out policy on cycle helmets - on the whole favouring them, but aware of the ambiguous nature of the evidence in favour of them and siding against compulsion. Their position helped persuade MPs not to vote for compulsion. It seems the pro-compulsionists have seen the BMA as a key target to convert, and recent press releases have announced a policy change, apparently by fiat at the top. The papers the BMA have published in support of their new policies are masterpieces of dishonesty and sloppy thinking. So here is my first, brief, critique, as expressed in an email to parliamentaryunit@bma.org.uk, the address they cite for comments.



My attention has been drawn to your web pages published at
<URL:http://www.bma.org.uk/ap.nsf/Content/Cyclhelmet> and <URL:http://www.bma.org.uk/ap.nsf/Content/Cyclehealth>.

In the first you quote: "Each year over 50 people aged 15 years and under are killed by cycling  accidents, with 70-80 per cent of these resulting from traumatic brain  injury."

As I'm sure you are well aware, the figure recorded for the UK for 2002 as a whole is nineteen deaths, of which only ten involved head injury[1], so the figure you quote is a gross exaggeration. Indeed in no year in the past decade have 50 children died in the UK in cycling accidents, so you cannot even pretend that this figure is historically correct.

In the second you start with the statement: "Action should be taken to both reduce the high rate of fatal and  serious accidents suffered by cyclists..."

In fact, there is no 'high rate of fatal and serious accidents suffered by cyclists'. The fatal accident rate for cyclists is only about two thirds of that for pedestrians (29.5 per billion kilometres as opposed to 44.8 per billion kilometres), and less than a third of that for motorcyclists, who do have to wear helmets[2]. Cycling is not only safer than walking, it is getting safer faster, with a steady and healthy downward trend in casualties[3].

Finally, of 114 cyclists of all ages killed in 2003, 61 were involved in collisions with cars, while 25 were involved in collisions with heavy goods vehicles; in total 95 deaths resulted from collisions with motor vehicles.[4] No-one pretends that a cycle helmet would make any useful difference in accidents of this kind.

In summary, these two documents taken together represent irresponsible scaremongering, composed of phoney data completely at variance with the facts. Scaremongering has the inevitable effect of reducing cycling, and reducing cycling has been shown to increase the risk per cyclist. So not only are these papers dishonest in their content, they are also misguided and counterproductive in their intent. By reducing the number of people cycling the BMA will not only increase the number of people dying through illnesses related to obesity and lack of exercise, it will also increase the risk of injury and death to people who do cycle.

I am horrified that the BMA should express views on a public policy matter on the basis of such shoddy and dishonest research and without, I understand, bothering to consult its members.

Yours sincerely

Simon Brooke

Wednesday, 3 November 2004

This United Satrapy


Sometimes some things make one more angry than it is easy to express. This morning I am faced with one of these.


The issue



First, a bit of background. There is an organisation called 'indymedia'; it is a journalists' collective, which reports stories not generally covered by the mainstream press, specifically including reporting on the demonstrations at G8 summits and such things. On October 7th this year, officers of the United States of America's Federal Bureau of Investigation, acting on behalf of the Italian Government, entered RackSpace's supposedly secure colocation facility in London and removed two servers belonging to indymedia.

What?

Yes, just as I say. The servers have been returned, but that is rather beside the point; and in any case, who is to say what was copied off them (or loaded onto them) in the mean time?

An exchange of notes



So on the 14th October I wrote the following email to my MP:


On Thursday of last week, two computers belonging to an organisation called 'Indymedia' were removed from the premises of a London ISP, Rackspace, apparently by the United States Federal Bureau of Investigation, allegedly following a request by the Swiss government. Further detail of this action may be found here:
<URL:http://news.bbc.co.uk/1/hi/technology/3732718.stm>

I should be grateful if you could ask the Home Secretary:

  1. On what legal theory was it proper for the agents of one foreign power, whether or not acting at the behest of another foreign power, to seize property within the United Kingdom?

  2. What UK court, or other UK legal authority, authorised this seizure?

  3. If it is the case that the seizure was made under the 'Mutual Legal Assistance Treaty', what terrorist information was supposed to have been held on these computers?

  4. What evidence of such supposed terrorist information was supplied to the UK authorities in order to justify this seizure?

  5. What action is he taking to prevent such seizures of property by agents of foreign powers in future?

This action cuts to the very heart of civil society in Britain: to the right of free speech, of citizens to publish news and opinion. Without this, democratic governance is impossible. For foreign powers to thus interfere in the democratic process in the United Kingdom is utterly intolerable, and wholly undermines the theory of a sovereign UK government.



My MP duly forwarded this to the Home Office and this morning I received via him a response from Caroline Flint MP, Parliamentary Under Secretary of State at the Home Office, doubtless dictated with a tongue still brown from licking American arses. I shall quote it in full:


    Thank you for your letter dated 18 October 2004  addressed to the Home Secretary, stating concerns expressed by one of your constituents regarding Indymedia. I have been asked to reply as the Minister responsible for international crime.

    Unfortunately, I am not in a position to comment on this particular matter, but I can provide general information. It is standard Home Office policy neither to confirm nor deny the existence or receipt of a mutual legal assistance request. I can also make the following observation to clarify the non-case specific issues raised.

    Mutual legal assistance treaties are not just restricted to cases of international terrorism, kidnapping and money laundering. They can cover all types of crime or be crime specific. For example many states have treaties that relate solely to the issue of combating drug trafficking. Others, have all crime treaties, which provides a basis for mutual legal assistance generally. The treaty between the UK and the US is an all crimes treaty.

    I hope you find this useful

    Yours, Caroline


Why does this matter?



Cui bono?



I'd like you to just pause a minute, hold onto your anger, and consider the things the Minister did feel able to write. She wrote "The treaty between the UK and the US is an all crimes treaty". Well, it may be. Blair's poodles may feel that it is fine for US government agents to walk jackbooted into any home in the United Kingdom in order to seize such property as they see fit. But - allegedly - the FBI were not acting on behalf of the US government.

Initial reports say that the FBI was acting on behalf of the Swiss government; later reports said, on behalf of an Italian court in Bologna. It scarcely matters. The point is that the Americans were not acting on their own behalf, so a treaty between the US and the UK should be moot.

If the request came from a fellow member of the EC, why did the Metropolitan Police not make the raid? If it was not legal for the Metropolitan Police, how could it be legal for a foreign power? And if it was legal for a foreign power, how come it was the FBI and not the Polizia?

The suspicion in my mind is that there is no treaty in place which allows the police forces of fellow EU states to force their way into premises in the UK in order to seize property. It would be intolerable if there were. And, indeed, can you imagine the headlines in the Daily Mail if it were even suggested?

What recourse?



As you'll know, I host on my personal website mirrors of censored documents which I consider important or valuable. I am my own ISP, and the server which hosts those documents is behind me as I write this, in my home. The documents I serve are censored in various jurisdictions around the world but inevitably the majority of them are censored in the United States. Suppose, at 4am one dark morning, I get a knock on the door and find myself faced with half a dozen burly Americans claiming to be from the FBI, what am I expected to do? What recourse have I if they choose to seize my property? Who do I call to resist the invasion of my home by foreign forces? To whom do I complain?

Civil and uncivil society



Britain is, at least in theory, a democracy. Citizens (yes, my passport explicitly states I'm a 'British Citizen', not a 'British Subject') in theory freely discuss matters of politics and freely elect representatives to our national parliaments. Indymedia and organisations like it are a vital part of that process; they provide a means for unpopular opinions to be expressed, for events the mainstream media chooses to ignore to be reported. They give a voice to sections of our body politic which otherwise might not have one.

We don't know, of course, why Indymedia's servers were seized. Caroline Flint won't even confirm (or deny) that they were seized. We can't see the order which authorised their seizure, because it's secret.

But allegedly Indymedia's offence was that it published a photograph of an Italian policeman taking photographs of protesters at a G8 summit.

So this is a very clear story about press freedom and press harassment; about an attempt by a foreign power to suppress free speech within the United Kingdom. We cannot conduct a civil society if we cannot freely communicate.

The myth of sovereignty



Part of the popular myth of Britain is that Britain is a sovereign nation. We cannot, we are repeatedly told, surrender that sovereignty to Brussels. Well, no, we can't; not now. We don't have it to surrender. What possible use are the civil protections of Scottish (or English) law if an American agent acting on behalf of an Italian court, without any due process in any United Kingdom court, without any warrant issued by any United Kingdom authority, can simply walk into my home and seize my property? What possible protection can a United Kingdom government offer its people if a Minister of the Crown is unable even to 'confirm or deny' that this has happened?

The truth is that Blair's Britain is not a sovereign nation. Not when the US President can order a movement of the Black Watch - a regiment of the British army - in order to help with his election campaign. Not when FBI agents can kick down any door in Britain without authorisation from the British courts and without a murmur - without a whimper - of protest from the UK 'government'. The truth is that Blair's Britain is no more than a satrapy of the American Imperium. Not so much a poodle as a cur to be kicked when it won't behave. A cur to be kicked when it won't grovel.

Creative Commons Licence
The fool on the hill by Simon Brooke is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License