> Thorkil Naur and others have suggested writing the whole   
> thing as small assembler operations and piece them together in  
> Haskell; I have been looking into that as well but it seems to entail  
> inlining every Integer function--imagine the code bloat.
Just a quick correction: I have suggested writing the raw operations in
(hopefully) portable C. And although I would consider eliminating some of the
levels of calls (from the compiled Haskell code, via Num.lhs and the hand-coded
PrimOps.cmm, to the specific C function implementing the desired operation), I
agree that inlining the entire function implementing the operation would be

