[Haskell-cafe] Explicit garbage collection

Edward Kmett ekmett at gmail.com
Thu Jan 7 16:53:09 EST 2010


That is kind of what I thought you were doing.

Retooling to see if we can get any better performance out of collecting, it
seems that System.Mem.PerformGC does a foreign call out to performMajorGC to
ask for a global collection. But I would hazard that you might be able to
get most of the benefit by just asking for the nursery to be cleaned up.

To that end, perhaps it would be worthwhile to consider switching to
something like:

foreign import ccall {-safe-} "performGC" performMinorGC :: IO ()

-- System.Mem imports performMajorGC as 'performGC' but the minor collection
appears to be called performGC, hence the rename.

This should let you call for a minor collection, which should be
considerably less time consuming. Especially as if you call it frequently
you'll be dealing with a mostly cleared nursery anyways.

-Edward Kmett

On Thu, Jan 7, 2010 at 4:39 PM, Miguel Mitrofanov <miguelimo38 at yandex.ru>wrote:

> I liked it too. Seems like I have to show some real code, and my apologies
> for a long e-mail.
>
> Well, that's what I'm actually trying to do and what I've managed to
> achieve so far.
>
> > module Ask where
> > import Control.Monad
> > import Data.IORef
> > import Data.Maybe
> > import System.IO.Unsafe -- Yes, I would need that
> > import System.Mem
> > import System.Mem.Weak
>
> Suppose we're writing a RESTful server, which keeps all state on the
> client. We want it to send questions to the client and get back the answers.
> We can also send some state data to the client and we are sure that the
> client would return them back unchanged. So, that's the type of our server:
>
> > type Server medium = Maybe (medium, String) -> IO (Maybe (medium,
> String))
>
> The client starts interaction, sending "Nothing"; when server finally
> returns "Nothing", the session stops. I would emulate the client in GHCi
> with the following function:
>
> > client :: Show medium => Bool -> Server medium -> IO ()
> > client debug server = client' Nothing where
> >     client' query =
> >         do response <- server query
> >            case response of
> >              Nothing -> return ()
> >              Just (medium, question) ->
> >                  do when debug $ putStr "Debug: " >> print medium
> >                     putStrLn question
> >                     putStr "--> "
> >                     answer <- getLine
> >                     client' $ Just (medium, answer)
>
> Nothing very interesting. The only thing to note is that the boolean
> argument allows us to see the state sent by server - for debugging purposes.
>
> But I want something more high-level. The goal is to solve the Graham's
> "arc challenge". So the high-level interface is implemented by this type:
>
> > data Ask a = Finish a | Continue String (String -> Ask a)
>
> A possible application would be:
>
> > test :: Ask ()
> > test =
> >     do s1 <- ask "Enter the first number"
> >        let n1 = read s1
> >        s2 <- ask "Enter the second number"
> >        let n2 = read s2
> >        ask $ show $ n1 + n2
> >        return ()
>
> Well, to do that, I need a Monad instance for Ask:
>
> > instance Monad Ask where
> >     return = Finish
> >     Finish x >>= h = h x
> >     Continue question k >>= h = Continue question $ \answer -> k answer
> >>= h
>
> and an "ask" function:
>
> > ask :: String -> Ask String
> > ask question = Continue question $ \answer -> return answer
>
> Now, the problem is with the route from high level to the low level. We can
> do it relatively simply:
>
> > simpleServer :: Ask () -> Server [String]
> > simpleServer anything = return . simpleServer' anything . maybe []
> (\(medium, answer) -> medium ++ [answer]) where
> >     simpleServer' (Finish ()) _ = Nothing
> >     simpleServer' (Continue question _) [] = Just ([], question)
> >     simpleServer' (Continue _ k) (s : ss) =
> >         do (medium, question) <- simpleServer' (k s) ss
> >            return (s : medium, question)
>
> And indeed, our test example works:
>
> *Ask> client False $ simpleServer test
> Enter the first number
> --> 3
> Enter the second number
> --> 4
> 7
> -->
>
> But, as you've probably guessed already, this "simpleServer" sends way too
> much information over network:
>
> > nonsense :: Ask ()
> > nonsense =
> >     do a <- ask "?"
> >        b <- ask a
> >        c <- ask b
> >        d <- ask c
> >        e <- ask b -- Note this "b" instead of "d", it's deliberate.
> >        return ()
>
> *Ask> client True $ simpleServer nonsense
> Debug: []
> ?
> --> a
> Debug: ["a"]
> a
> --> b
> Debug: ["a","b"]
> b
> --> c
> Debug: ["a","b","c"]
> c
> --> d
> Debug: ["a","b","c","d"]
> b
> --> e
>
> All user answers from the very beginning get transmitted.
>
> I want to transmit only those answers that are actually used. I'm gonna
> write another server, a (relatively) smart one.
>
> A simple boxing type:
>
> > data Box a = Box {unBox :: a}
>
> Now, the server. I use "Nothing" for those user answers that are no longer
> needed.
>
> > smartServer :: Ask () -> Server [Maybe String]
> > smartServer anything = smartServerIO anything . maybe [] (\(medium,
> answer) -> medium ++ [Just answer]) where
> >     smartServerIO anything query =
>
> Sometimes a value is not used anymore just because it WAS used already. So
> I'll use a mutable list to keep track of those values that are actually
> dereferenced:
>
> >         do usedList <- newIORef []
>
> Then, I'll make weak pointers for all user answers:
>
> >            let boxedQuery = map Box query
> >            weakPtrs <- mapM (\box -> mkWeak box () Nothing) boxedQuery
>
> and actually run my "Ask" value.
>
> >            case findWeakPtrs 0 anything boxedQuery usedList of
> >              Finish () -> return Nothing
> >              Continue question rest ->
>
> Now, the interesting part. I'd invoke a garbage collector to get rid of
> those boxes that aren't necessary anymore.
>
> >                  do performGC
>
> Now, weak pointers tell me what boxes are not dereferenced but can be
> dereferenced in the future, and my mutable list has the numbers of boxes
> that were dereferenced.
>
> >                     danglingNumbers <- mapM deRefWeak weakPtrs
> >                     usedNumbers <- readIORef usedList
>
> I also need to keep this "future" alive, so that usable boxes don't get
> garbage collected.
>
> >                     case rest undefined of _ -> return ()
>
> And now all I need to do is to filter out those boxes that are not in any
> of the two lists:
>
> >                     return $ Just (zipWith3 (isUsed usedNumbers) [0..]
> danglingNumbers query, question)
>
> Helper function has an interesting detail:
>
> >     findWeakPtrs _ (Finish ()) _ _ = Finish ()
> >     findWeakPtrs _ response [] _ = response
>
> I use unsafePerformIO to track those boxes that get actually dereferenced:
>
> >     findWeakPtrs n (Continue _ k) (s : ss) usedList =
> >         findWeakPtrs (n+1) (k (unsafePerformIO $ modifyIORef usedList (n
> :) >> return (fromJust $ unBox s))) ss usedList
>
> A filtering function is not very interesting:
>
> >     isUsed usedNumbers index Nothing _ | not (index `elem` usedNumbers) =
> Nothing
> >     isUsed _ _ _ answer = answer
>
> And it works:
>
> *Ask> client True $ smartServer nonsense
> Debug: []
> ?
> --> a
> Debug: [Just "a"]
> a
> --> b
> Debug: [Nothing,Just "b"]
> b
> --> c
> Debug: [Nothing,Just "b",Just "c"]
> c
> --> d
> Debug: [Nothing,Just "b",Nothing,Nothing]
> b
> --> e
>
> So, the problem is, I don't think calling performGC every time is a good
> idea. I'd like to tell explicitly what values are to be garbage collected.
> It doesn't seem like it's possible with current version of GHC, though.
>
>
> On 7 Jan 2010, at 23:07, Edward Kmett wrote:
>
>  Thats a shame, I was rather fond of this monad. To get that last 'True',
>> you'll really have to rely on garbage collection to approximate reachability
>> for you, leaving the weak reference solution the best you can hope for.
>>
>> -Edward Kmett
>>
>> On Thu, Jan 7, 2010 at 12:39 PM, Miguel Mitrofanov <miguelimo38 at yandex.ru>
>> wrote:
>> Damn. Seems like I really need (True, False, True) as a result of "test".
>>
>>
>> On 7 Jan 2010, at 08:52, Miguel Mitrofanov wrote:
>>
>> Seems very nice. Thanks.
>>
>> On 7 Jan 2010, at 08:01, Edward Kmett wrote:
>>
>> Here is a slightly nicer version using the Codensity monad of STM.
>>
>> Thanks go to Andrea Vezzosi for figuring out an annoying hanging bug I was
>> having.
>>
>> -Edward Kmett
>>
>> {-# LANGUAGE Rank2Types, GeneralizedNewtypeDeriving, DeriveFunctor #-}
>> module STMOracle
>>  ( Oracle, Ref
>>  , newRef, readRef, writeRef, modifyRef, needRef
>>  ) where
>>
>> import Control.Applicative
>> import Control.Monad
>> import Control.Concurrent.STM
>>
>> instance Applicative STM where
>>  pure = return
>>  (<*>) = ap
>>
>> newtype Ref s a = Ref (TVar (Maybe a))
>> newtype Oracle s a = Oracle { unOracle :: forall r. (a -> STM r) -> STM r
>> } deriving (Functor)
>>
>> instance Monad (Oracle s) where
>>  return x = Oracle (\k -> k x)
>>  Oracle m >>= f = Oracle (\k -> m (\a -> unOracle (f a) k))
>>
>> mkOracle m = Oracle (m >>=)
>>
>> runOracle :: (forall s. Oracle s a) -> IO a
>> runOracle t = atomically (unOracle t return)
>>
>> newRef :: a -> Oracle s (Ref s a)
>> newRef a = mkOracle $ Ref <$> newTVar (Just a)
>>
>> readRef :: Ref s a -> Oracle s a
>> readRef (Ref r) = mkOracle $ do
>>  m <- readTVar r
>>  maybe retry return m
>>
>> writeRef :: a -> Ref s a -> Oracle s a
>> writeRef a (Ref r) = mkOracle $ do
>>  writeTVar r (Just a)
>>  return a
>>
>> modifyRef :: (a -> a) -> Ref s a -> Oracle s a
>> modifyRef f r = do
>>  a <- readRef r
>>  writeRef (f a) r
>>
>> needRef :: Ref s a -> Oracle s Bool
>> needRef (Ref slot) = Oracle $ \k ->
>>           (writeTVar slot Nothing >> k False)
>>  `orElse` k True
>>
>> -- test case:
>>                                                                   refMaybe b
>> dflt ref = if b then readRef ref else return dflt
>> refIgnore ref = return "blablabla"
>> refFst ref = fst `fmap` readRef ref
>> test = do
>>   a <- newRef "x"
>>   b <- newRef 1
>>   c <- newRef ('z', Just 0)
>>   -- no performLocalGC required
>>   x <- needRef a
>>   y <- needRef b
>>   z <- needRef c
>>   u <- refMaybe y "t" a -- note that it wouldn't actually read "a",
>>                         -- but it won't be known until runtime.
>>   w <- refIgnore b
>>   v <- refFst c
>>   return (x, y, z)
>>
>>
>>
>> On Wed, Jan 6, 2010 at 10:28 PM, Edward Kmett <ekmett at gmail.com> wrote:
>> I don't believe you can get quite the semantics you want. However, you can
>> get reasonably close, by building a manual store and backtracking.
>>
>> {-# LANGUAGE Rank2Types #-}
>> -- lets define an Oracle that tracks whether or not you might need the
>> reference, by backtracking.
>> module Oracle
>>  ( Oracle, Ref
>>  , newRef, readRef, writeRef, modifyRef, needRef
>>  ) where
>>
>> import Control.Applicative
>> import Control.Arrow (first)
>> import Control.Monad
>> import Data.IntMap (IntMap)
>> import qualified Data.IntMap as M
>> import Unsafe.Coerce (unsafeCoerce)
>> import GHC.Prim (Any)
>>
>> -- we need to track our own worlds, otherwise we'd have to build over ST,
>> change optimistically, and track how to backtrack the state of the Store.
>> Much uglier.
>> -- values are stored as 'Any's for safety, see GHC.Prim for a discussion
>> on the hazards of risking the storage of function types using unsafeCoerce
>> as anything else.
>> data World s = World { store :: !(IntMap Any), hwm :: !Int }
>>
>> -- references into our store
>> newtype Ref s a = Ref Int deriving (Eq)
>>
>> -- our monad that can 'see the future' ~ StateT (World s) []newtype Oracle
>> s a = Oracle { unOracle :: World s -> [(a, World s)] }
>>
>> -- we rely on the fact that the list is always non-empty for any oracle
>> you can run. we are only allowed to backtrack if we thought we wouldn't need
>> the reference, and wound up needing it, so head will always succeed.
>> runOracle :: (forall s. Oracle s a) -> a
>> runOracle f = fst $ head $ unOracle f $ World M.empty 1
>>
>>
>> instance Monad (Oracle s) where
>>  return a = Oracle $ \w -> [(a,w)]
>>  Oracle m >>= k = Oracle $ \s -> do
>>      (a,s') <- m s
>>      unOracle (k a) s'
>>
>> -- note: you cannot safely define fail here without risking a crash in
>> runOracle
>> -- Similarly, we're not a MonadPlus instance because we always want to
>> succeed eventually.
>>
>> instance Functor (Oracle s) where
>>  fmap f (Oracle g) = Oracle $ \w -> first f <$> g w
>>
>> instance Applicative (Oracle s) where
>>  pure = return
>>  (<*>) = ap
>>
>> -- new ref allocates a fresh slot and inserts the value into the store.
>> the type level brand 's' keeps us safe, and we don't export the Ref
>> constructor.
>> newRef :: a -> Oracle s (Ref s a)
>> newRef a = Oracle $ \(World w t) ->
>>  [(Ref t, World (M.insert t (unsafeCoerce a) w) (t + 1))]
>>
>> -- readRef is the only thing that ever backtracks, if we try to read a
>> reference we claimed we wouldn't need, then we backtrack to when we decided
>> we didn't need the reference, and continue with its value.
>> readRef :: Ref s a -> Oracle s a
>> readRef (Ref slot) = Oracle $ \world ->
>>  maybe [] (\a -> [(unsafeCoerce a, world)]) $ M.lookup slot (store world)
>>
>> -- note, writeRef dfoesn't 'need' the ref's current value, so needRef will
>> report False if you writeRef before you read it after this.
>> writeRef :: a -> Ref s a -> Oracle s a
>> writeRef a (Ref slot) = Oracle $ \world ->
>>      [(a, world { store = M.insert slot (unsafeCoerce a) $ store world })]
>>
>> {-
>> -- alternate writeRef where writing 'needs' the ref.
>> writeRef :: a -> Ref s a -> Oracle s a
>> writeRef a (Ref slot) = Oracle $ \World store v -> do
>>  (Just _, store') <- return $ updateLookupWithKey replace slot store
>>  [(a, World store' v)]
>>  where
>>  replace _ _ = Just (unsafeCoerce a)
>> -}
>>
>> -- modifying a reference of course needs its current value.
>> modifyRef :: (a -> a) -> Ref s a -> Oracle s a
>> modifyRef f r = do
>>  a <- readRef r
>>  writeRef (f a) r
>>
>> -- needRef tries to continue executing the world without the element in
>> the store in question. if that fails, then we'll backtrack to here, and try
>> again with the original world, and report that the element was in fact
>> needed.
>> needRef :: Ref s a -> Oracle s Bool
>> needRef (Ref slot) = Oracle $ \world ->
>>  [ (False, world { store = M.delete slot $ store world })
>>  , (True, world)
>>  ]
>>
>> -- test case:
>> refMaybe b dflt ref = if b then readRef ref else return dflt
>> refIgnore ref = return "blablabla"
>> refFst ref = fst <$> readRef ref
>> test = do
>>   a <- newRef "x"
>>   b <- newRef 1
>>   c <- newRef ('z', Just 0)
>>   -- no performLocalGC required
>>   x <- needRef a
>>   y <- needRef b
>>   z <- needRef c
>>   u <- refMaybe y "t" a -- note that it wouldn't actually read "a",
>>                         -- but it won't be known until runtime.
>>   w <- refIgnore b
>>   v <- refFst c
>>   return (x, y, z)
>>
>> -- This will disagree with your desired answer, returning:
>>
>> *Oracle> runOracle test
>> Loading package syb ... linking ... done.
>> Loading package array-0.2.0.0 ... linking ... done.
>> Loading package containers-0.2.0.1 ... linking ... done.
>> (False,False,True)
>>
>> rather than (True, False, True), because the oracle is able to see into
>> the future (via backtracking) to see that refMaybe doesn't use the reference
>> after all.
>>
>> This probably won't suit your needs, but it was a fun little exercise.
>>
>> -Edward Kmett
>>
>> On Wed, Jan 6, 2010 at 4:05 PM, Miguel Mitrofanov <miguelimo38 at yandex.ru>
>> wrote:
>>
>> On 6 Jan 2010, at 23:21, Edward Kmett wrote:
>>
>> You probably just want to hold onto weak references for your
>> 'isStillNeeded' checks.
>>
>> That's what I do now. But I want to minimize the network traffic, so I
>> want referenced values to be garbage collected as soon as possible - and I
>> couldn't find anything except System.Mem.performIO to do the job - which is
>> a bit too global for me.
>>
>> Otherwise the isStillNeeded check itself will keep you from garbage
>> collecting!
>>
>> Not necessary. What I'm imagining is that there is essentially only one
>> way to access the value stored in the reference - with readRef. So, if there
>> isn't any chance that readRef would be called, the value can be garbage
>> collected; "isStillNeeded" function only needs the reference, not the value.
>>
>> Well, yeah, that's kinda like weak references.
>>
>>
>> http://cvs.haskell.org/Hugs/pages/libraries/base/System-Mem-Weak.html
>>
>> -Edward Kmett
>>
>> On Wed, Jan 6, 2010 at 9:39 AM, Miguel Mitrofanov <miguelimo38 at yandex.ru>
>> wrote:
>> I'll take a look at them.
>>
>> I want something like this:
>>
>> refMaybe b dflt ref = if b then readRef ref else return dflt
>> refIgnore ref = return "blablabla"
>> refFst ref =
>> do
>>  (v, w) <- readRef ref
>>  return v
>> test =
>> do
>>  a <- newRef "x"
>>  b <- newRef 1
>>  c <- newRef ('z', Just 0)
>>  performLocalGC -- if necessary
>>  x <- isStillNeeded a
>>  y <- isStillNeeded b
>>  z <- isStillNeeded c
>>  u <- refMaybe y "t" a -- note that it wouldn't actually read "a",
>>                        -- but it won't be known until runtime.
>>  w <- refIgnore b
>>  v <- refFst c
>>  return (x, y, z)
>>
>> so that "run test" returns (True, False, True).
>>
>>
>> Dan Doel wrote:
>> On Wednesday 06 January 2010 8:52:10 am Miguel Mitrofanov wrote:
>> Is there any kind of "ST" monad that allows to know if some STRef is no
>> longer needed?
>>
>> The problem is, I want to send some data to an external storage over a
>> network and get it back later, but I don't want to send unnecessary data.
>>
>> I've managed to do something like that with weak pointers,
>> System.Mem.performGC and unsafePerformIO, but it seems to me that invoking
>> GC every time is an overkill.
>>
>> Oh, and I'm ready to trade the purity of runST for that, if necessary.
>>
>> You may be able to use something like Oleg's Lightweight Monadic Regions
>> to get this effect. I suppose it depends somewhat on what qualifies a
>> reference as "no longer needed".
>>
>> http://www.cs.rutgers.edu/~ccshan/capability/region-io.pdf<http://www.cs.rutgers.edu/%7Eccshan/capability/region-io.pdf>
>>
>> I'm not aware of anything out-of-the-box that does what you want, though.
>>
>> -- Dan
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>>
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>>
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20100107/2d353ec2/attachment.html


More information about the Haskell-Cafe mailing list