Friday, January 25, 2008

Registered Turing machine

The Turing machine is a wonderfully simple model of computation. But practically it is very awkward to program. Its biggest problem is the lack of random access and registers for temporary storage of results.

A register machine with RAM, that is, today's microprocessors, are convenient to program, but they have arbitrarily chosen size limits: the width of registers and the amount of RAM.

One would like to scale it up to the truly infinite (because of the infinite tape) model of computation that a Turing machine has. Here are some ideas.

Replace registers with stacks. Each stack element is a bit and the entire stack itself represents a value, possibly an address. Addresses may be arbitrarily large, because of the infinite memory available. Stacks have an advantage of tapes because its clear when you've reached the end of a stack, signifying, say the least-significat bit of the value on the stack. In order to manipulate the top and bottom bits easily (addition goes from LSB to MSB, while division goes the other way), it could be a double-ended stack with traditional stack operations available at both ends.

This is kind of a register machine where the fundamental data type is an infinite-width bit vector.

Random access into an infinite amount of memory seems a little bit absurd. It seems more realistic to have access time to be logarithmic with the length of the address. We may have a "jump left" (or right) instruction, telling the machine to move the tape head a distance to the left equal to the value stored in a particular stack register. Perhaps the logarithmic access time is already encoded in the fact that constructing the address itself already took logarithmic time.

Most modern languages have stack frames, and encoding those are awkward on a single tape. In order to hold more than one value on a stack, there are additional characters for the alphabet of the stack: "(", ")", along with "0" and "1". Perhaps the alphabet of the tape should also include them as well, so as to simplify the problem of encoding arbitrarily large values. It is up to the program to maintain consistency of what the characters mean, presumably trying to keep balanced parentheses.

We might have a few tapes (three or four) to allow algorithms like mergesort to work efficiently.

This ought to be enough rope for to develop compilers for high-level languages into this model of computation.

No comments :