[Haskell-cafe] weak pointers and memoization (was Re: memoization)

Rodney Price rodprice at raytheon.com
Thu Sep 17 19:39:01 EDT 2009


In my case, the results of each computation are used to generate a node
in a graph structure (dag).  The key, oddly, is a hash of a two-tuple
that gets stored in the data structure after the computation of the
node finishes.  If I don't memoize the function to build a node, the
cost of generating the tree is exponential; if I do, it's somewhere
between linear and quadratic.

Another process prunes parts of this graph structure as time goes on.
The entire data structure is intended to be persistent, lasting for
days at a time in a server-like application.  If the parts pruned
aren't garbage collected, the space leak will eventually be
catastrophic.  Either the memo table or the graph structure itself will
outgrow available memory.

-Rod


On Thu, 17 Sep 2009 13:32:13 -0400
Job Vranish <jvranish at gmail.com> wrote:

> What are you trying to use this for? It seems to me that for memo
> tables you almost never have references to they keys outside the
> lookup table since the keys are usually computed right at the last
> minute, and then discarded (otherwise it might be easier to just
> cache stuff outside the function).
> 
> For example with a naive fibs, the values you are passing in are
> computed, and probably don't exist before you do the recursive call,
> and then are discarded shortly afterward.
> 
> It seems like putting a cap on the cache size, and then just
> overwriting old entries would be better.
> Am I missing something?
> 
> - Job
> 
> 
> 
> On Wed, Sep 16, 2009 at 4:48 PM, Rodney Price <rodprice at raytheon.com>
> wrote:
> 
> > How does garbage collection work in an example like the one below?
> > You memoize a function with some sort of lookup table, which stores
> > function arguments as keys and function results as values.  As long
> > as the function remains in scope, the keys in the lookup table
> > remain in memory, which means that the keys themselves always
> > remain reachable and they cannot be garbage collected.  Right?
> >
> > So what do you do in the case where you know that, after some
> > period of time, some entries in the lookup table will never be
> > accessed?  That is, there are no references to the keys for some
> > entries remaining, except for the references in the lookup table
> > itself.  You'd like to allow the memory occupied by the keys to be
> > garbage collected.  Otherwise, if the function stays around for a
> > long time, the size of the lookup table always grows.  How do you
> > avoid the space leak?
> >
> > I notice that there is a function in Data.IORef,
> >
> > mkWeakIORef :: IORef a -> IO () -> IO (Weak (IORef a))
> >
> > which looks promising.  In the code below, however, there's only one
> > IORef, so either the entire table gets garbage collected or none of
> > it does.
> >
> > I've been reading the paper "Stretching the storage manager: weak
> > pointers and stable names in Haskell," which seems to answer my
> > question.  When I attempt to run the memoization code in the paper
> > on the simple fib example, I find that -- apparently due to lazy
> > evaluation -- no new entries are entered into the lookup table, and
> > therefore no lookups are ever successful!
> >
> > So apparently there is some interaction between lazy evaluation and
> > garbage collection that I don't understand.  My head hurts.  Is it
> > necessary to make the table lookup operation strict?  Or is it
> > something entirely different that I am missing?
> >
> > -Rod
> >
> >
> > On Thu, 10 Sep 2009 18:33:47 -0700
> > Ryan Ingram <ryani.spam at gmail.com> wrote:
> >
> > >
> > > memoIO :: Ord a => (a -> b) -> IO (a -> IO b)
> > > memoIO f = do
> > >    cache <- newIORef M.empty
> > >    return $ \x -> do
> > >        m <- readIORef cache
> > >        case M.lookup x m of
> > >            Just y -> return y
> > >            Nothing -> do let res = f x
> > >                          writeIORef cache $ M.insert x res m
> > >                          return res
> > >
> > > memo :: Ord a => (a -> b) -> (a -> b)
> > > memo f = unsafePerformIO $ do
> > >     fmemo <- memoIO f
> > >     return (unsafePerformIO . fmemo)
> > >
> > > I don't think there is any valid transformation that breaks this,
> > > since the compiler can't lift anything through unsafePerformIO.
> > > Am I mistaken?
> > >
> > >   -- ryan
> >
> > _______________________________________________
> > Haskell-Cafe mailing list
> > Haskell-Cafe at haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >



More information about the Haskell-Cafe mailing list