[Haskell-cafe] Explicit garbage collection
Miguel Mitrofanov
miguelimo38 at yandex.ru
Thu Jan 7 17:03:38 EST 2010
Hmm, interesting. Seems like I have to take a closer look at GHC's
garbage collection. Thanks!
On 8 Jan 2010, at 00:53, Edward Kmett wrote:
> That is kind of what I thought you were doing.
>
> Retooling to see if we can get any better performance out of
> collecting, it seems that System.Mem.PerformGC does a foreign call
> out to performMajorGC to ask for a global collection. But I would
> hazard that you might be able to get most of the benefit by just
> asking for the nursery to be cleaned up.
>
> To that end, perhaps it would be worthwhile to consider switching to
> something like:
>
> foreign import ccall {-safe-} "performGC" performMinorGC :: IO ()
>
> -- System.Mem imports performMajorGC as 'performGC' but the minor
> collection appears to be called performGC, hence the rename.
>
> This should let you call for a minor collection, which should be
> considerably less time consuming. Especially as if you call it
> frequently you'll be dealing with a mostly cleared nursery anyways.
>
> -Edward Kmett
>
> On Thu, Jan 7, 2010 at 4:39 PM, Miguel Mitrofanov <miguelimo38 at yandex.ru
> > wrote:
> I liked it too. Seems like I have to show some real code, and my
> apologies for a long e-mail.
>
> Well, that's what I'm actually trying to do and what I've managed to
> achieve so far.
>
> > module Ask where
> > import Control.Monad
> > import Data.IORef
> > import Data.Maybe
> > import System.IO.Unsafe -- Yes, I would need that
> > import System.Mem
> > import System.Mem.Weak
>
> Suppose we're writing a RESTful server, which keeps all state on the
> client. We want it to send questions to the client and get back the
> answers. We can also send some state data to the client and we are
> sure that the client would return them back unchanged. So, that's
> the type of our server:
>
> > type Server medium = Maybe (medium, String) -> IO (Maybe (medium,
> String))
>
> The client starts interaction, sending "Nothing"; when server
> finally returns "Nothing", the session stops. I would emulate the
> client in GHCi with the following function:
>
> > client :: Show medium => Bool -> Server medium -> IO ()
> > client debug server = client' Nothing where
> > client' query =
> > do response <- server query
> > case response of
> > Nothing -> return ()
> > Just (medium, question) ->
> > do when debug $ putStr "Debug: " >> print medium
> > putStrLn question
> > putStr "--> "
> > answer <- getLine
> > client' $ Just (medium, answer)
>
> Nothing very interesting. The only thing to note is that the boolean
> argument allows us to see the state sent by server - for debugging
> purposes.
>
> But I want something more high-level. The goal is to solve the
> Graham's "arc challenge". So the high-level interface is implemented
> by this type:
>
> > data Ask a = Finish a | Continue String (String -> Ask a)
>
> A possible application would be:
>
> > test :: Ask ()
> > test =
> > do s1 <- ask "Enter the first number"
> > let n1 = read s1
> > s2 <- ask "Enter the second number"
> > let n2 = read s2
> > ask $ show $ n1 + n2
> > return ()
>
> Well, to do that, I need a Monad instance for Ask:
>
> > instance Monad Ask where
> > return = Finish
> > Finish x >>= h = h x
> > Continue question k >>= h = Continue question $ \answer -> k
> answer >>= h
>
> and an "ask" function:
>
> > ask :: String -> Ask String
> > ask question = Continue question $ \answer -> return answer
>
> Now, the problem is with the route from high level to the low level.
> We can do it relatively simply:
>
> > simpleServer :: Ask () -> Server [String]
> > simpleServer anything = return . simpleServer' anything . maybe []
> (\(medium, answer) -> medium ++ [answer]) where
> > simpleServer' (Finish ()) _ = Nothing
> > simpleServer' (Continue question _) [] = Just ([], question)
> > simpleServer' (Continue _ k) (s : ss) =
> > do (medium, question) <- simpleServer' (k s) ss
> > return (s : medium, question)
>
> And indeed, our test example works:
>
> *Ask> client False $ simpleServer test
> Enter the first number
> --> 3
> Enter the second number
> --> 4
> 7
> -->
>
> But, as you've probably guessed already, this "simpleServer" sends
> way too much information over network:
>
> > nonsense :: Ask ()
> > nonsense =
> > do a <- ask "?"
> > b <- ask a
> > c <- ask b
> > d <- ask c
> > e <- ask b -- Note this "b" instead of "d", it's deliberate.
> > return ()
>
> *Ask> client True $ simpleServer nonsense
> Debug: []
> ?
> --> a
> Debug: ["a"]
> a
> --> b
> Debug: ["a","b"]
> b
> --> c
> Debug: ["a","b","c"]
> c
> --> d
> Debug: ["a","b","c","d"]
> b
> --> e
>
> All user answers from the very beginning get transmitted.
>
> I want to transmit only those answers that are actually used. I'm
> gonna write another server, a (relatively) smart one.
>
> A simple boxing type:
>
> > data Box a = Box {unBox :: a}
>
> Now, the server. I use "Nothing" for those user answers that are no
> longer needed.
>
> > smartServer :: Ask () -> Server [Maybe String]
> > smartServer anything = smartServerIO anything . maybe [] (\
> (medium, answer) -> medium ++ [Just answer]) where
> > smartServerIO anything query =
>
> Sometimes a value is not used anymore just because it WAS used
> already. So I'll use a mutable list to keep track of those values
> that are actually dereferenced:
>
> > do usedList <- newIORef []
>
> Then, I'll make weak pointers for all user answers:
>
> > let boxedQuery = map Box query
> > weakPtrs <- mapM (\box -> mkWeak box () Nothing)
> boxedQuery
>
> and actually run my "Ask" value.
>
> > case findWeakPtrs 0 anything boxedQuery usedList of
> > Finish () -> return Nothing
> > Continue question rest ->
>
> Now, the interesting part. I'd invoke a garbage collector to get rid
> of those boxes that aren't necessary anymore.
>
> > do performGC
>
> Now, weak pointers tell me what boxes are not dereferenced but can
> be dereferenced in the future, and my mutable list has the numbers
> of boxes that were dereferenced.
>
> > danglingNumbers <- mapM deRefWeak weakPtrs
> > usedNumbers <- readIORef usedList
>
> I also need to keep this "future" alive, so that usable boxes don't
> get garbage collected.
>
> > case rest undefined of _ -> return ()
>
> And now all I need to do is to filter out those boxes that are not
> in any of the two lists:
>
> > return $ Just (zipWith3 (isUsed usedNumbers)
> [0..] danglingNumbers query, question)
>
> Helper function has an interesting detail:
>
> > findWeakPtrs _ (Finish ()) _ _ = Finish ()
> > findWeakPtrs _ response [] _ = response
>
> I use unsafePerformIO to track those boxes that get actually
> dereferenced:
>
> > findWeakPtrs n (Continue _ k) (s : ss) usedList =
> > findWeakPtrs (n+1) (k (unsafePerformIO $ modifyIORef
> usedList (n :) >> return (fromJust $ unBox s))) ss usedList
>
> A filtering function is not very interesting:
>
> > isUsed usedNumbers index Nothing _ | not (index `elem`
> usedNumbers) = Nothing
> > isUsed _ _ _ answer = answer
>
> And it works:
>
> *Ask> client True $ smartServer nonsense
> Debug: []
> ?
> --> a
> Debug: [Just "a"]
> a
> --> b
> Debug: [Nothing,Just "b"]
> b
> --> c
> Debug: [Nothing,Just "b",Just "c"]
> c
> --> d
> Debug: [Nothing,Just "b",Nothing,Nothing]
> b
> --> e
>
> So, the problem is, I don't think calling performGC every time is a
> good idea. I'd like to tell explicitly what values are to be garbage
> collected. It doesn't seem like it's possible with current version
> of GHC, though.
>
>
> On 7 Jan 2010, at 23:07, Edward Kmett wrote:
>
> Thats a shame, I was rather fond of this monad. To get that last
> 'True', you'll really have to rely on garbage collection to
> approximate reachability for you, leaving the weak reference
> solution the best you can hope for.
>
> -Edward Kmett
>
> On Thu, Jan 7, 2010 at 12:39 PM, Miguel Mitrofanov <miguelimo38 at yandex.ru
> > wrote:
> Damn. Seems like I really need (True, False, True) as a result of
> "test".
>
>
> On 7 Jan 2010, at 08:52, Miguel Mitrofanov wrote:
>
> Seems very nice. Thanks.
>
> On 7 Jan 2010, at 08:01, Edward Kmett wrote:
>
> Here is a slightly nicer version using the Codensity monad of STM.
>
> Thanks go to Andrea Vezzosi for figuring out an annoying hanging bug
> I was having.
>
> -Edward Kmett
>
> {-# LANGUAGE Rank2Types, GeneralizedNewtypeDeriving, DeriveFunctor #-}
> module STMOracle
> ( Oracle, Ref
> , newRef, readRef, writeRef, modifyRef, needRef
> ) where
>
> import Control.Applicative
> import Control.Monad
> import Control.Concurrent.STM
>
> instance Applicative STM where
> pure = return
> (<*>) = ap
>
> newtype Ref s a = Ref (TVar (Maybe a))
> newtype Oracle s a = Oracle { unOracle :: forall r. (a -> STM r) ->
> STM r } deriving (Functor)
>
> instance Monad (Oracle s) where
> return x = Oracle (\k -> k x)
> Oracle m >>= f = Oracle (\k -> m (\a -> unOracle (f a) k))
>
> mkOracle m = Oracle (m >>=)
>
> runOracle :: (forall s. Oracle s a) -> IO a
> runOracle t = atomically (unOracle t return)
>
> newRef :: a -> Oracle s (Ref s a)
> newRef a = mkOracle $ Ref <$> newTVar (Just a)
>
> readRef :: Ref s a -> Oracle s a
> readRef (Ref r) = mkOracle $ do
> m <- readTVar r
> maybe retry return m
>
> writeRef :: a -> Ref s a -> Oracle s a
> writeRef a (Ref r) = mkOracle $ do
> writeTVar r (Just a)
> return a
>
> modifyRef :: (a -> a) -> Ref s a -> Oracle s a
> modifyRef f r = do
> a <- readRef r
> writeRef (f a) r
>
> needRef :: Ref s a -> Oracle s Bool
> needRef (Ref slot) = Oracle $ \k ->
> (writeTVar slot Nothing >> k False)
> `orElse` k True
>
> -- test
> case
> : refMaybe
> b dflt ref = if b then readRef ref else return dflt
> refIgnore ref = return "blablabla"
> refFst ref = fst `fmap` readRef ref
> test = do
> a <- newRef "x"
> b <- newRef 1
> c <- newRef ('z', Just 0)
> -- no performLocalGC required
> x <- needRef a
> y <- needRef b
> z <- needRef c
> u <- refMaybe y "t" a -- note that it wouldn't actually read "a",
> -- but it won't be known until runtime.
> w <- refIgnore b
> v <- refFst c
> return (x, y, z)
>
>
>
> On Wed, Jan 6, 2010 at 10:28 PM, Edward Kmett <ekmett at gmail.com>
> wrote:
> I don't believe you can get quite the semantics you want. However,
> you can get reasonably close, by building a manual store and
> backtracking.
>
> {-# LANGUAGE Rank2Types #-}
> -- lets define an Oracle that tracks whether or not you might need
> the reference, by backtracking.
> module Oracle
> ( Oracle, Ref
> , newRef, readRef, writeRef, modifyRef, needRef
> ) where
>
> import Control.Applicative
> import Control.Arrow (first)
> import Control.Monad
> import Data.IntMap (IntMap)
> import qualified Data.IntMap as M
> import Unsafe.Coerce (unsafeCoerce)
> import GHC.Prim (Any)
>
> -- we need to track our own worlds, otherwise we'd have to build
> over ST, change optimistically, and track how to backtrack the state
> of the Store. Much uglier.
> -- values are stored as 'Any's for safety, see GHC.Prim for a
> discussion on the hazards of risking the storage of function types
> using unsafeCoerce as anything else.
> data World s = World { store :: !(IntMap Any), hwm :: !Int }
>
> -- references into our store
> newtype Ref s a = Ref Int deriving (Eq)
>
> -- our monad that can 'see the future' ~ StateT (World s) []newtype
> Oracle s a = Oracle { unOracle :: World s -> [(a, World s)] }
>
> -- we rely on the fact that the list is always non-empty for any
> oracle you can run. we are only allowed to backtrack if we thought
> we wouldn't need the reference, and wound up needing it, so head
> will always succeed.
> runOracle :: (forall s. Oracle s a) -> a
> runOracle f = fst $ head $ unOracle f $ World M.empty 1
>
>
> instance Monad (Oracle s) where
> return a = Oracle $ \w -> [(a,w)]
> Oracle m >>= k = Oracle $ \s -> do
> (a,s') <- m s
> unOracle (k a) s'
>
> -- note: you cannot safely define fail here without risking a crash
> in runOracle
> -- Similarly, we're not a MonadPlus instance because we always want
> to succeed eventually.
>
> instance Functor (Oracle s) where
> fmap f (Oracle g) = Oracle $ \w -> first f <$> g w
>
> instance Applicative (Oracle s) where
> pure = return
> (<*>) = ap
>
> -- new ref allocates a fresh slot and inserts the value into the
> store. the type level brand 's' keeps us safe, and we don't export
> the Ref constructor.
> newRef :: a -> Oracle s (Ref s a)
> newRef a = Oracle $ \(World w t) ->
> [(Ref t, World (M.insert t (unsafeCoerce a) w) (t + 1))]
>
> -- readRef is the only thing that ever backtracks, if we try to read
> a reference we claimed we wouldn't need, then we backtrack to when
> we decided we didn't need the reference, and continue with its value.
> readRef :: Ref s a -> Oracle s a
> readRef (Ref slot) = Oracle $ \world ->
> maybe [] (\a -> [(unsafeCoerce a, world)]) $ M.lookup slot (store
> world)
>
> -- note, writeRef dfoesn't 'need' the ref's current value, so
> needRef will report False if you writeRef before you read it after
> this.
> writeRef :: a -> Ref s a -> Oracle s a
> writeRef a (Ref slot) = Oracle $ \world ->
> [(a, world { store = M.insert slot (unsafeCoerce a) $ store
> world })]
>
> {-
> -- alternate writeRef where writing 'needs' the ref.
> writeRef :: a -> Ref s a -> Oracle s a
> writeRef a (Ref slot) = Oracle $ \World store v -> do
> (Just _, store') <- return $ updateLookupWithKey replace slot store
> [(a, World store' v)]
> where
> replace _ _ = Just (unsafeCoerce a)
> -}
>
> -- modifying a reference of course needs its current value.
> modifyRef :: (a -> a) -> Ref s a -> Oracle s a
> modifyRef f r = do
> a <- readRef r
> writeRef (f a) r
>
> -- needRef tries to continue executing the world without the element
> in the store in question. if that fails, then we'll backtrack to
> here, and try again with the original world, and report that the
> element was in fact needed.
> needRef :: Ref s a -> Oracle s Bool
> needRef (Ref slot) = Oracle $ \world ->
> [ (False, world { store = M.delete slot $ store world })
> , (True, world)
> ]
>
> -- test case:
> refMaybe b dflt ref = if b then readRef ref else return dflt
> refIgnore ref = return "blablabla"
> refFst ref = fst <$> readRef ref
> test = do
> a <- newRef "x"
> b <- newRef 1
> c <- newRef ('z', Just 0)
> -- no performLocalGC required
> x <- needRef a
> y <- needRef b
> z <- needRef c
> u <- refMaybe y "t" a -- note that it wouldn't actually read "a",
> -- but it won't be known until runtime.
> w <- refIgnore b
> v <- refFst c
> return (x, y, z)
>
> -- This will disagree with your desired answer, returning:
>
> *Oracle> runOracle test
> Loading package syb ... linking ... done.
> Loading package array-0.2.0.0 ... linking ... done.
> Loading package containers-0.2.0.1 ... linking ... done.
> (False,False,True)
>
> rather than (True, False, True), because the oracle is able to see
> into the future (via backtracking) to see that refMaybe doesn't use
> the reference after all.
>
> This probably won't suit your needs, but it was a fun little exercise.
>
> -Edward Kmett
>
> On Wed, Jan 6, 2010 at 4:05 PM, Miguel Mitrofanov <miguelimo38 at yandex.ru
> > wrote:
>
> On 6 Jan 2010, at 23:21, Edward Kmett wrote:
>
> You probably just want to hold onto weak references for your
> 'isStillNeeded' checks.
>
> That's what I do now. But I want to minimize the network traffic, so
> I want referenced values to be garbage collected as soon as possible
> - and I couldn't find anything except System.Mem.performIO to do the
> job - which is a bit too global for me.
>
> Otherwise the isStillNeeded check itself will keep you from garbage
> collecting!
>
> Not necessary. What I'm imagining is that there is essentially only
> one way to access the value stored in the reference - with readRef.
> So, if there isn't any chance that readRef would be called, the
> value can be garbage collected; "isStillNeeded" function only needs
> the reference, not the value.
>
> Well, yeah, that's kinda like weak references.
>
>
> http://cvs.haskell.org/Hugs/pages/libraries/base/System-Mem-Weak.html
>
> -Edward Kmett
>
> On Wed, Jan 6, 2010 at 9:39 AM, Miguel Mitrofanov <miguelimo38 at yandex.ru
> > wrote:
> I'll take a look at them.
>
> I want something like this:
>
> refMaybe b dflt ref = if b then readRef ref else return dflt
> refIgnore ref = return "blablabla"
> refFst ref =
> do
> (v, w) <- readRef ref
> return v
> test =
> do
> a <- newRef "x"
> b <- newRef 1
> c <- newRef ('z', Just 0)
> performLocalGC -- if necessary
> x <- isStillNeeded a
> y <- isStillNeeded b
> z <- isStillNeeded c
> u <- refMaybe y "t" a -- note that it wouldn't actually read "a",
> -- but it won't be known until runtime.
> w <- refIgnore b
> v <- refFst c
> return (x, y, z)
>
> so that "run test" returns (True, False, True).
>
>
> Dan Doel wrote:
> On Wednesday 06 January 2010 8:52:10 am Miguel Mitrofanov wrote:
> Is there any kind of "ST" monad that allows to know if some STRef is
> no
> longer needed?
>
> The problem is, I want to send some data to an external storage over a
> network and get it back later, but I don't want to send unnecessary
> data.
>
> I've managed to do something like that with weak pointers,
> System.Mem.performGC and unsafePerformIO, but it seems to me that
> invoking
> GC every time is an overkill.
>
> Oh, and I'm ready to trade the purity of runST for that, if necessary.
>
> You may be able to use something like Oleg's Lightweight Monadic
> Regions to get this effect. I suppose it depends somewhat on what
> qualifies a reference as "no longer needed".
>
> http://www.cs.rutgers.edu/~ccshan/capability/region-io.pdf
>
> I'm not aware of anything out-of-the-box that does what you want,
> though.
>
> -- Dan
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
More information about the Haskell-Cafe
mailing list