how and where to {-# specialize #-} ?

Alastair Reid alastair@reid-consulting-uk.ltd.uk
Wed, 25 Jun 2003 10:18:17 +0100


> | It is sad that the usage of libraries containing polymorphic code
> | [...]
> | seems to imply runtime overheads, by preventing specialisation.

> I agree that it is sad.  The only way around it is to ship libraries
> with *all* their source code (perhaps hidden in the interface file).
> That could be done, but it'd be Real Work
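
For readers who have not met the pragma in the subject line, here is a
minimal sketch of what is being discussed.  The function sumSquares is a
made-up stand-in for polymorphic library code (it does not come from any
real library).  The pragma asks the compiler to generate an Int-only copy
alongside the overloaded one, and the library author has to guess which
types are worth listing.

```haskell
-- Hypothetical library function, invented for illustration.
-- Compiled on its own, the overloaded version receives a Num dictionary
-- at run time and calls (*) and (+) through it.
sumSquares :: Num a => [a] -> a
sumSquares xs = sum [x * x | x <- xs]

-- Ask the compiler to also emit a copy specialised to Int.  Callers at
-- Int can use that copy; callers at other types (Double, say) still go
-- through the dictionary-passing version.
{-# SPECIALIZE sumSquares :: [Int] -> Int #-}
```

Loaded into ghci, sumSquares [1, 2, 3 :: Int] evaluates to 14 with or
without the pragma; the pragma only affects how the call is compiled, not
its result.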

One way of tackling it would be to change ghc so that compilation doesn't 
produce .o files containing machine code but, rather, files containing a 
concise encoding of STG code.  One possible concise encoding would be as 
bytecodes.  (With just a little care over their design, you can easily 
decompile bytecodes to recover the original code, so we're not losing any 
information.)
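
To illustrate the "no information lost" claim, here is a toy sketch.  It is
not STG and not any real ghc bytecode format: Expr, Instr, compile and
decompile are all names invented for this example.  The point is only that
a postfix (stack) encoding of an expression tree can be decoded back to
exactly the tree it came from.

```haskell
-- A toy expression language standing in for STG code (an assumption made
-- for illustration; real STG is much richer).
data Expr
  = Lit Int
  | Var String
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Eq, Show)

-- A toy stack bytecode, also invented for this sketch.
data Instr
  = Push Int
  | Load String
  | IAdd
  | IMul
  deriving (Eq, Show)

-- Compile to postfix form: operands first, then the operator.
compile :: Expr -> [Instr]
compile (Lit n)   = [Push n]
compile (Var v)   = [Load v]
compile (Add a b) = compile a ++ compile b ++ [IAdd]
compile (Mul a b) = compile a ++ compile b ++ [IMul]

-- Decompile by "running" the bytecode on a stack of expressions instead
-- of a stack of values.  Returns Nothing for malformed bytecode.
decompile :: [Instr] -> Maybe Expr
decompile = go []
  where
    go [e] []                    = Just e
    go _   []                    = Nothing
    go st  (Push n : is)         = go (Lit n : st) is
    go st  (Load v : is)         = go (Var v : st) is
    go (b : a : st) (IAdd : is)  = go (Add a b : st) is
    go (b : a : st) (IMul : is)  = go (Mul a b : st) is
    go _   _                     = Nothing
```

For example, decompile (compile (Add (Lit 1) (Mul (Var "x") (Lit 2))))
gives back Just (Add (Lit 1) (Mul (Var "x") (Lit 2))).  A real encoding
would also have to preserve binders, types and so on, which is where the
"little care over their design" comes in.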

Obviously, ghci could interpret this code directly but it would take little 
more work to have ghci compile the bytecode to machine code.  There's been 
lots of research into how to do this effectively by both the Forth and the 
Java communities.  This case ought to be easier because STG code has already 
had a very powerful optimizer applied to it, so all that's left is doing a 
good job with register allocation and instruction scheduling for the pipeline.

That same compilation engine could also be used in linking: read a bunch of 
bytecodes, compile to machine code, write out a .o file to be linked against 
the runtime and any C libraries.  The 'unoptimized, portable' variant of this 
could either generate C code for compilation by gcc or, easier still, leave 
them as bytecodes and include a bytecode interpreter in the system.
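
A rough sketch of that arrangement, again with a toy instruction set that
is invented for the example (Instr, interpret and emitC are not real ghc
names): the same bytecode feeds either an interpreter, which is the
always-available fallback, or a back end that prints C for gcc to compile.

```haskell
-- Toy arithmetic-only bytecode (an assumption for illustration).
data Instr
  = Push Int
  | IAdd
  | IMul
  deriving (Eq, Show)

-- Fallback back end: interpret the bytecode directly on a value stack.
-- Returns Nothing for malformed bytecode.
interpret :: [Instr] -> Maybe Int
interpret = go []
  where
    go [n] []                    = Just n
    go _   []                    = Nothing
    go st  (Push n : is)         = go (n : st) is
    go (b : a : st) (IAdd : is)  = go (a + b : st) is
    go (b : a : st) (IMul : is)  = go (a * b : st) is
    go _   _                     = Nothing

-- Portable back end: emit a C expression as text, to be handed to gcc.
-- The traversal is the same; only the contents of the stack differ.
emitC :: [Instr] -> Maybe String
emitC = go []
  where
    go [s] []                    = Just s
    go _   []                    = Nothing
    go st  (Push n : is)         = go (show n : st) is
    go (b : a : st) (IAdd : is)  = go (bin a "+" b : st) is
    go (b : a : st) (IMul : is)  = go (bin a "*" b : st) is
    go _   _                     = Nothing
    bin a op b = "(" ++ a ++ " " ++ op ++ " " ++ b ++ ")"
```

With prog = [Push 2, Push 3, IAdd, Push 4, IMul], interpret prog is Just 20
and emitC prog is Just "((2 + 3) * 4)".  A native-code back end would be a
third consumer of the same input.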

Benefits:

I think the result would be more portable (because there is always the 
interpreter to fall back on).  Instead of the ghc folks shipping a bunch of 
.hc files for porting purposes, they would ship a load of bytecode files.
[Of course, it won't eliminate all porting issues: Windows and Solaris differ 
in more ways than just their processor...]

This technology would support things like Template Haskell and persistence 
research nicely.

An optimizer can easily get hold of any source code it needs by dipping into 
the compiled files when it wants and decompiling the bytecodes.

Of course, none of this negates Simon PJ's comment that it would be Real Work.

--
Alastair Reid