Replacement for GMP: Update
Peter Tanski
p.tanski at gmail.com
Fri Aug 11 18:56:57 EDT 2006
John,
> After all on the average call where an object of that
> size is free already it is a single array lookup, we have:
>
> (a) fetch pointer (one read)
> (b) fetch next (one read)
> (c) store next as current (one write)
This is true for memory access; it is not true for memory allocation. I do not know how malloc allocates memory on Windows, but on typical POSIX systems the kernel uses linked lists and a good deal of other bookkeeping to reduce fragmentation, such as KMEMSTAT. Malloc may also block, which is something you have more control over in your own garbage-collected heap. A really good explanation of the problem of rapidly allocating and deallocating temporary blocks of memory under 35 KB is here: http://ridiculousfish.com/blog/archives/2006/05/16/36/ .
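For what it is worth, the single-free-list case you describe really is just those three memory operations. A minimal sketch of that pop in Haskell, using Foreign.Ptr and an IORef standing in for the allocator's head pointer (the names and structure here are my own illustration, not code from GHC, GMP or any real allocator):

    import Foreign.Ptr      (Ptr, castPtr, nullPtr)
    import Foreign.Storable (peek, poke)
    import Data.IORef       (IORef, readIORef, writeIORef)

    -- A free list of equal-sized blocks: the first word of every free
    -- block holds a pointer to the next free block.
    type FreeList = IORef (Ptr ())

    -- Pop a block: (a) fetch the head pointer, (b) fetch its next field,
    -- (c) store that as the new head -- exactly the three steps above,
    -- provided a block of the right size is already free.
    popBlock :: FreeList -> IO (Maybe (Ptr ()))
    popBlock freeList = do
      block <- readIORef freeList                       -- (a) fetch pointer
      if block == nullPtr
        then return Nothing                             -- empty: fall back to the slow path
        else do
          next <- peek (castPtr block :: Ptr (Ptr ()))  -- (b) fetch next
          writeIORef freeList next                      -- (c) store next as current
          return (Just block)

    -- Push a freed block back onto the list.
    pushBlock :: FreeList -> Ptr () -> IO ()
    pushBlock freeList block = do
      old <- readIORef freeList
      poke (castPtr block :: Ptr (Ptr ())) old
      writeIORef freeList block

The point of contrast is that a general-purpose malloc only reaches something this cheap after size-class lookup, locking and other bookkeeping, which is the kind of overhead the ridiculousfish article discusses for small, short-lived blocks.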
In any case, Simon Marlow previously mentioned that alloc (from GHC's heap) is faster than malloc. He is almost certainly correct, although I hope the difference will not be that great and that the only thing I have to worry about is ForeignPtr. We shall see whether malloc'd memory makes a difference in the benchmarks.
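Concretely, the two allocation paths I will be comparing look roughly like this (a sketch only: the Word8 payload and the function names are placeholders of mine, not part of any actual replacement library):

    import Foreign.ForeignPtr    (ForeignPtr, mallocForeignPtrBytes, newForeignPtr)
    import Foreign.Marshal.Alloc (mallocBytes, finalizerFree)
    import Data.Word             (Word8)

    -- n bytes of limb storage allocated on the GHC heap (pinned) and
    -- managed entirely by the garbage collector -- the fast "alloc" path.
    allocOnGhcHeap :: Int -> IO (ForeignPtr Word8)
    allocOnGhcHeap n = mallocForeignPtrBytes n

    -- The same n bytes from C malloc, with free() attached as a
    -- finalizer so the garbage collector still reclaims it eventually.
    allocWithMalloc :: Int -> IO (ForeignPtr Word8)
    allocWithMalloc n = do
      p <- mallocBytes n
      newForeignPtr finalizerFree p

If the second path (an FFI call out to malloc plus the finalizer bookkeeping) turns out to be significantly slower for small, short-lived bignums, that will settle the question.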
> A purely functional system -- one which does NOT convert
> self tail calls into jumps and reuse storage -- can
> perhaps be faster, since each thread can have its own
> local arena to allocate from (without need of any write
> barrier) .. however it isn't clear to me what the trade
> off is between cache misses and parallelism.
That is interesting, but I do not understand whether your mention of self tail calls turned into jumps was meant at a low or a high level. From the context it seems as if you are talking about a high-level implementation: each function running in a separate thread. GHC's RTS does use many separate threads (the RTS is threaded by default in the latest version, 6.6).
As for turning self tail calls into jumps at the low level, GHC does do this through C-- (GHC's implementation of C-- is called Cmm). I believe that is both faster and more memory-efficient than a high-level threaded system. Philosophically speaking, even if Simon Peyton-Jones developed Cmm to solve the problem of efficient functional computation, Cmm has turned research on Haskell from research on a computer language into research on a system of computation. (Maybe that is what he meant when, some time ago, he wrote to John Meacham and said that they (the GHC researchers) considered compiling via C a dead end.) A language can be anything: all it requires is a means of being translated into machine code. Pseudo-intelligent and advanced compiler systems such as gcc, JHC for Haskell, or OCaml for the Caml dialect of ML may translate programming languages into machine code, but the underlying computation remains largely sequential. The curious thing about GHC Haskell is that, through the prism of Cmm, which enforces such things as immutable variables and recursion right at the machine level, Haskell is less a language translated to sequential machine code and more a description of a computational model. If you still think I am wrong about this, consider the possibility that Haskell with Cmm is a modern research project driven by the same idea that motivated Lisp: a different model of computation.
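To make the tail-call point concrete, here is the sort of self-tail-recursive function GHC compiles, via Cmm, into a jump back to the top of the function rather than a genuine call, so it runs in constant stack space (the function itself is only my illustration, not anything from GHC's sources):

    {-# LANGUAGE BangPatterns #-}

    -- Sum the integers from 1 to n with an accumulator.  The recursive
    -- call to go is in tail position, so the generated code branches
    -- back to the head of the loop instead of making a nested call.
    sumTo :: Int -> Int
    sumTo n = go 0 1
      where
        go !acc !i
          | i > n     = acc
          | otherwise = go (acc + i) (i + 1)

    main :: IO ()
    main = print (sumTo 1000000)

Compiling with ghc -O -ddump-cmm lets you see this directly: the recursive call shows up in the Cmm as a jump rather than as a call that would grow the stack.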
Best regards,
Peter