[Haskell-cafe] function result caching
robdockins at fastmail.fm
Sat Oct 14 14:26:51 EDT 2006
On Saturday 14 October 2006 13:13, Ketil Malde wrote:
> Robert Dockins <robdockins at fastmail.fm> writes:
> >>> slowFunctionCacheList= [slowFunction (i) | i <-[0..5000000]]
> >>> and use "slowFunctionCacheList !! i" instead of "slowFunction (i)"
> >> Not much different in principle, but better in practice - you could
> >> use an array rather than a list. O(1) lookups should make things (a
> >> lot) faster.
> > Well, this is true only if the domain of the function is small and
> > fairly dense.
> I don't think so.
> > With 5000000 elements, you're looking at allocating about 20Mb of
> > memory
> On the other hand, the lists allocates the 20Mb of pointers, *and*
> another 20Mb of cons cells for the lists.
True, but only if you access deeply into the tail of the list. If you access
only the first several hundred elements, say, then you only allocate the
space needed for those.
Of course, if you only want to access a small portion at the beginning, then
why create such a big list in the first place? Moral: lists will lose this
contest in almost all cases.
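For concreteness, the list-based cache under discussion looks like the sketch
below (`slowFunction` here is a hypothetical stand-in for the original
poster's function). Note that (!!) must walk the spine, so looking up index i
costs O(i) even after the element is evaluated:

```haskell
-- Hypothetical stand-in for the expensive function under discussion.
slowFunction :: Int -> Integer
slowFunction i = sum [1 .. fromIntegral i]

-- Lazy list cache: the spine and the element thunks are built only as far
-- as you index, but every lookup re-walks the cons cells from the front.
slowFunctionCacheList :: [Integer]
slowFunctionCacheList = [slowFunction i | i <- [0 .. 5000000]]

-- O(i) lookup: (!!) traverses i cons cells on every call.
cachedSlow :: Int -> Integer
cachedSlow i = slowFunctionCacheList !! i
```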
> > to hold pointers to closures, and then allocating and filling out 5000000
> > closures, all before you get down to doing any real work!
> If I interpret you correctly, you want to make the array's contents
> strict? Not a good idea when the domain is sparse, but on the other
> hand it would let you unbox the contents, which means you'd only need
> to store the actual values. For boolean values, I think GHC
> stores one bit per value, i.e. less than a MB for this range.
No, I didn't suggest that the elements be strict. That would involve
precomputing the entire table. You _could_ do that if you anticipate a LOT
of accesses, sufficient to outweigh the initial cost. But that seems
unlikely for a sparse domain, as you mentioned.
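The non-strict array variant might look like this (again with a hypothetical
`slowFunction`; `cacheUpTo` is a name I'm introducing here). The point is that
the array's structure is allocated eagerly while its elements stay as thunks:

```haskell
import Data.Array

-- Hypothetical stand-in for the expensive function under discussion.
slowFunction :: Int -> Integer
slowFunction i = sum [1 .. fromIntegral i]

-- Non-strict array cache: forcing the array allocates one pointer per index,
-- each pointing at an unevaluated thunk, but each element is computed only
-- on first access.
cacheUpTo :: Int -> Array Int Integer
cacheUpTo n = listArray (0, n) [slowFunction i | i <- [0 .. n]]

cache :: Array Int Integer
cache = cacheUpTo 5000000

cachedSlow :: Int -> Integer
cachedSlow i = cache ! i   -- O(1) indexing; forces the element's thunk once
```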
However, even non-strict arrays are created all at once (i.e., they are strict
in their _structure_), and the closures have to be allocated as the array is
being created. Creating a closure isn't terribly expensive, but creating
5000000 closures might take a while and cost a lot of memory if the closure
has a large number of free variables (which depends on the function
definition and the exact details of the lambda lifter). Also, large arrays
tend to play havoc with GHC's garbage collector; it has to scan all elements
on every major GC, IIRC. That alone may offset any advantages won.
In the end, the only way to be sure which method is best is to test it against
your usage profile. My guess is that the array method will have enough
overhead that it will lose against a tree. However, I may be wrong,
especially if the program will have a very long runtime and if a "warm-up"
period is acceptable.
> > Little-endian patricia trees [...]
> Yes, sure, if you can't afford a 20Mb index. On the other hand,
> changing the function to use an array is a very small modification,
> and probably more than good enough in many cases.
I completely agree; it is good for many cases, and can be a very useful
technique. I just don't think it will be good for _this_ case (a large, sparse
domain where f(n) doesn't depend on all f(m) for m < n). The probability that
the array loses is positively correlated with the size of the domain. Again,
the only way to really know is to implement and benchmark. Thankfully,
caching techniques are completely local and can be changed easily.
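For the record, one way to realize the tree idea is the familiar lazy
memo-tree sketched below (the names `memoize`, `nats`, and `Tree` are mine,
not from the thread). Lookup is O(log n), and because the tree is built
lazily, memory is allocated only along the paths to indices actually queried,
which is exactly what you want for a large, sparse domain:

```haskell
-- An infinite binary tree holding one value per natural number.
data Tree a = Tree a (Tree a) (Tree a)

instance Functor Tree where
  fmap f (Tree v l r) = Tree (f v) (fmap f l) (fmap f r)

-- O(log n) lookup: descend left or right according to the bits of n.
index :: Tree a -> Int -> a
index (Tree v _ _) 0 = v
index (Tree _ l r) n = case (n - 1) `divMod` 2 of
  (q, 0) -> index l q
  (q, _) -> index r q

-- The tree containing every natural number at its own index.
nats :: Tree Int
nats = go 0 1
  where
    go n s = Tree n (go l s') (go r s')
      where l = n + s; r = l + s; s' = 2 * s

-- Memoize a function on the naturals: each f i is computed at most once,
-- and only the touched spine of the tree is ever allocated.
memoize :: (Int -> a) -> (Int -> a)
memoize f = index (fmap f nats)
```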
Talk softly and drive a Sherman tank.
Laugh hard, it's a long way to the bank.