[Haskell-cafe] Imagining a G-machine
Albert Y. C. Lai
trebla at vex.net
Wed May 16 18:25:48 EDT 2007
A native G-machine --- physical, or chemical, or biological, but not a
repressed simulation over the imperative cpu-memory architecture --- is
the dream of every lazy-functional programmer of great devotion. If only
it became the dominant computing architecture! People would say, Haskell
is high-level assembly because it is the closest to the bare metal (or
bare protein) and still supports all sorts of abstractions. People would
ask, why is my C program five times slower than Haskell, and people
would answer, that's because C is compiled to a stack of seven monad
transformers (one of which is ContT to support longjmp), and because you
should use more linked lists and fewer arrays. We will truly rule the world when C programmers have to consult us for performance tips.
A necessary condition for dominance is existence. Here is my crazy
imagination of a native G-machine. It is worded as biotech but may as
well be molecular computing or nanotech.
The heap is a soup of nutrients. A node is a molecular structure made
from the nutrients. A thunk is a lot of nodes linked up. Nodes and
thunks are suspended in the soup. Allocation means constructing nodes
and links from nutrients, and deallocation means unlinking nodes and
eventually dismantling them back into nutrients. As a side effect,
memory upgrades are cheap and convenient: just go buy a can of Campbell soup (my favourite is the alphabet soup) and pour it into your heap. There are 24-hour convenience stores at every street corner --- pervasive
computing has never been so pervasive before.
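To pin down what the soup would actually hold, here is a minimal sketch in present-day Haskell of the conventional G-machine node shapes the nutrients would be assembled into (application, supercombinator, number, indirection). The Addr links and the IntMap heap are, of course, only stand-ins for the real molecular structure.

import qualified Data.IntMap as IM

type Addr = Int

data Node
  = NAp Addr Addr          -- application node: links to function and argument
  | NSupercomb String Int  -- a named supercombinator and its arity
  | NNum Int               -- a number, already fully evaluated
  | NInd Addr              -- indirection left behind after an update
  deriving Show

data Heap = Heap { nextAddr :: Addr, cells :: IM.IntMap Node }

emptyHeap :: Heap
emptyHeap = Heap 0 IM.empty

-- Allocation: build a node out of nutrients and hand back its address.
alloc :: Node -> Heap -> (Addr, Heap)
alloc n (Heap a m) = (a, Heap (a + 1) (IM.insert a n m))

-- Deallocation: dismantle a node back into nutrients.
free :: Addr -> Heap -> Heap
free a (Heap next m) = Heap next (IM.delete a m)

-- Follow a link to the node it points at.
deref :: Heap -> Addr -> Node
deref (Heap _ m) a = m IM.! a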
A thunk is a cluster of nodes linked together and suspended in the soup. A theorem in
graph theory asserts that all finite graphs can be embedded without
crossing edges in 3D space, and we take advantage of this to space out
the nodes in a complicated thunk. Still, finding a good layout is not easy, being NP-hard and all. There is a relaxation heuristic for this: it simply requires
nodes to be a bit repulsive to each other and links to be elastic, and
they will sort things out themselves. (When they don't, a bit of shaking
up helps tremendously.)
Evaluation is performed by a small organism crawling over a thunk and
performing its thunk-rewriting duty as per the G-machine. It applies
enzymes to construct and deconstruct nodes and links from nutrients. It
performs primitive arithmetic on its own. It maintains its own stack,
inside or outside its body (to be investigated). It is possible to
unleash several such evaluators for parallel computing. It is possible
to enable reproduction to boost parallelism.
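For concreteness, here is a pared-down sketch of the crawling part of that duty: unwinding a thunk's spine onto the evaluator's private stack, exactly as a textbook G-machine does. The node type is a stripped-down variant of the one above, and the IntMap is again only a stand-in for links between real nodes.

import qualified Data.IntMap as IM

type Addr  = Int
data Node  = NAp Addr Addr | NNum Int | NInd Addr
type Heap  = IM.IntMap Node
type Stack = [Addr]              -- the evaluator's private stack

-- One crawling step: descend into an application, hop over an
-- indirection, or stop when the node underneath is already a number.
step :: Heap -> Stack -> Maybe Stack
step heap (a : as) = case heap IM.! a of
  NAp f _ -> Just (f : a : as)   -- crawl down the function link
  NInd a' -> Just (a' : as)      -- follow the indirection
  NNum _  -> Nothing             -- weak head normal form reached
step _ [] = Nothing

-- Keep crawling until there is nothing left to unwind.
unwind :: Heap -> Addr -> Stack
unwind heap root = go [root]
  where go s = maybe s go (step heap s)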
To add garbage collection, roots send out a periodic (or sustained)
signal to all connected nodes. Nodes receiving the signal do not
self-destruct. Nodes not receiving the signal invoke their built-in
self-destruct mechanism to dissolve themselves back into nutrients.
There may be better schemes.
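The scheme is plain reachability in disguise: the signal is just graph reachability from the roots, and everything the signal never reaches dissolves. A minimal sketch, with the same stand-in representation as above:

import qualified Data.IntMap as IM
import qualified Data.IntSet as IS

type Addr = Int
data Node = NAp Addr Addr | NNum Int | NInd Addr
type Heap = IM.IntMap Node

-- The links leaving a node, i.e. the directions the signal propagates.
links :: Node -> [Addr]
links (NAp f x) = [f, x]
links (NInd a)  = [a]
links (NNum _)  = []

-- Propagate the signal from the roots to every connected node.
signalled :: Heap -> [Addr] -> IS.IntSet
signalled heap = go IS.empty
  where
    go seen [] = seen
    go seen (a : as)
      | a `IS.member` seen = go seen as
      | otherwise          = go (IS.insert a seen) (links (heap IM.! a) ++ as)

-- Nodes that never heard the signal self-destruct back into nutrients.
collect :: [Addr] -> Heap -> Heap
collect roots heap = IM.filterWithKey (\a _ -> a `IS.member` live) heap
  where live = signalled heap roots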
How to interact with the outside world is an open question, but Andrew Gordon's thesis contains translations from monadic I/O to other I/O schemes, e.g. continuations and streams. Perhaps one of them can be adapted.
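For illustration, here is a hedged sketch of the stream (dialogue) style, one of the schemes Gordon's thesis relates to monadic I/O: the program is a lazy function from the operating system's responses to its requests. The Request and Response constructors below are a made-up subset for the example, not the historical definitions.

data Request  = PutStr String | GetLine
data Response = Ok | Str String

type Dialogue = [Response] -> [Request]

-- Echo one line back, stream style.  GetLine is emitted before any
-- response is inspected, so the operating system can answer it; only
-- then does the program scrutinise the response stream.
echo :: Dialogue
echo responses =
  GetLine :
  case responses of
    Str line : _ -> [PutStr ("you said: " ++ line)]
    _            -> []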
Debugging can be done by making the evaluators send radio signals describing the operations they perform; a second computer can then log these for later review. You can also use radio signals to instruct the evaluators to perform unusual operations on demand.
The existence of a native G-machine is a necessary but not sufficient
condition for world dominance. We still need to make it fast, compact,
and cheap. There may also be better but totally different ways to
realize a native G-machine.