[Haskell-cafe] Glome.hs-0.3 and other things

Andrew Coppin andrewcoppin at btinternet.com
Mon Apr 21 16:15:12 EDT 2008


David Roundy wrote:
> On Mon, Apr 21, 2008 at 11:54 AM, Andrew Coppin
> <andrewcoppin at btinternet.com> wrote:
>   
>>  I suppose the idea is that Haskell is supposed to help you work at a higher
>> level of abstraction, so you can concentrate on building better *algorithms*
>> which require less work in the first place. Surely using an algorithm in a
>> suitably superior complexity class could more than make up for the
>> performance loss from lack of cache coherence. But people who work in C and
>> C++ know how to build efficient algorithms too, so... hmm, we seem to be in
>> trouble here.
>>     
>
> I don't think things are so glum.  There are certainly things we can
> do to control where data is located (e.g. using unboxed arrays and
> keeping track of strictness, and strict data types combined with
> -funboxstrictidontrecalltheexactpragma).  In truth, C++ programs also
> tend to have severe issues with memory use.  Garbage collection is
> horrible on the cache, but it's also done relatively infrequently (at
> least, full GCs are).
>
> Writing seriously fast Haskell code is indeed challenging, but it's a
> fun challenge.  And writing bug-free code is also a fun challenge, and
> one at which we have a huge advantage over most other languages.  And
> in Haskell we are less often forced to choose between a horribly ugly
> implementation and a horribly inefficient implementation.
>   

Well now, if you're interested in bug-free code, Haskell looks to me 
like pretty much the supreme ultimate language. Can you spell "safe"? I 
think we've got it pretty much nailed on that score. About the most 
dangerous thing a typical Haskell program can do is "head []" or looping 
forever.
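To make the "head []" point concrete, here is a minimal sketch of the failure and a total alternative. (`safeHead` is my own illustrative name; the standard library already provides the equivalent `Data.Maybe.listToMaybe`.)

```haskell
import Data.Maybe (fromMaybe)

-- head [] throws a runtime exception; a total variant returns
-- Nothing instead of crashing, pushing the empty case into the types.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

main :: IO ()
main = do
  print (safeHead ([] :: [Int]))      -- Nothing, where head [] would throw
  print (fromMaybe 0 (safeHead [7]))  -- 7
```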

I do worry about performance, however. [I tend to write compute-bound 
programs.] Sure, you can use arrays, but for whatever reason the API 
isn't nearly as nice as lists. (E.g., there is no zip function for 
arrays.) ByteString shows that arrays can be made fast - but ByteString 
is only for bytes.
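For instance, with lists you just write `zip xs ys`, but with Data.Array you end up converting through lists yourself. A sketch of the kind of helper you have to hand-roll (`zipArrays` is my own name, not part of the library, and it assumes both arrays share the same bounds):

```haskell
import Data.Array

-- Hypothetical helper: zip two arrays elementwise by round-tripping
-- through lists, since Data.Array offers no zip of its own.
zipArrays :: Ix i => Array i a -> Array i b -> Array i (a, b)
zipArrays xs ys = listArray (bounds xs) (zip (elems xs) (elems ys))

main :: IO ()
main = do
  let a = listArray (0, 2) [1, 2, 3] :: Array Int Int
      b = listArray (0, 2) "abc"    :: Array Int Char
  print (elems (zipArrays a b))  -- [(1,'a'),(2,'b'),(3,'c')]
```

The round trip through `elems`/`listArray` also illustrates the performance complaint: the "array" operation is really a list operation in disguise.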

It strikes me that an idiomatic Haskell program - which does things like 
"length . filter p" and such - is going to utterly confound cache 
prefetch logic with endless layers of indirection. The basic execution 
model of Haskell is graph reduction. As I understand it [it's been a 
while since I read about this], GHC uses the spineless tagless 
G-machine. This fundamentally uses pointers to functions for every 
aspect of its implementation - graph reduction, deadlock detection, GC, 
concurrency, etc. Every step of traversing a structure on the heap 
potentially involves forcing a thunk, so at every step you're 
dereferencing function pointers. It must utterly confound the 
instruction and data caches - 0% coherence! Using [unboxed] arrays makes 
the situation a little more bearable for the data cache, but the poor 
instruction cache still has to handle control flow that jumps somewhere 
every few dozen instructions...
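To make the contrast concrete, here is a small sketch of the two styles mentioned above: the idiomatic lazy pipeline, which walks a linked structure of cons cells and thunks, versus a strict fold over an unboxed array, where the data at least sits contiguously. (The function names are mine; note that `elems` still materialises a list, so this only partially avoids the indirection.)

```haskell
import Data.List (foldl')
import Data.Array.Unboxed

-- Idiomatic pipeline: each element lives behind a cons cell,
-- and evaluation chases pointers through the heap.
countEvensList :: [Int] -> Int
countEvensList = length . filter even

-- Unboxed array with a strict accumulator: the Ints are stored
-- flat, and foldl' keeps the running count evaluated.
countEvensArr :: UArray Int Int -> Int
countEvensArr a = foldl' step 0 (elems a)
  where step n x = if even x then n + 1 else n

main :: IO ()
main = do
  let xs = [1 .. 10] :: [Int]
      a  = listArray (0, 9) xs :: UArray Int Int
  print (countEvensList xs)  -- 5
  print (countEvensArr a)    -- 5
```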
