Data.Map.unzip?

Joachim Breitner mail at joachim-breitner.de
Sat Dec 6 12:08:29 UTC 2014


Hi,


On Saturday, 06.12.2014, at 12:08 +0100, Henning Thielemann wrote:
> Consider the following example
> 
>    let (bigs,smalls) = unzip mix
>    in  do f bigs
>           g smalls
> 
> 'bigs' contains a great amount of data, and thus you prefer that it can be 
> garbage collected as 'f' consumes it. If 'unzip' is actually
> 
>   (fmap fst mix, fmap snd mix)
> 
> then 'mix' (and thus all big data) will be kept in memory until 'g' starts 
> processing.
> 

At first glance, I agree. But I wouldn’t trust my first glance until I
had investigated the issue. It also depends on the particular
implementation of unzip.

You assume that after `f bigs` is done, nothing references mix any more.
But `smalls` is going to be a thunk referencing mix, so mix will only be
GC’ed _while g is forcing smalls_, not earlier.
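
To make the retention argument concrete, here is a minimal sketch of the
definition under discussion. The names `unzipNaive` and `example` are
hypothetical and not part of Data.Map; the map is assumed to be the lazy
`Data.Map`.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- The definition quoted above, under a hypothetical name.
-- Both components are separate closures over the same `mix`.
unzipNaive :: Map k (a, b) -> (Map k a, Map k b)
unzipNaive mix = (fmap fst mix, fmap snd mix)

-- Hypothetical consumer mirroring the quoted example. While `f bigs`
-- runs, `smalls` is typically still an unevaluated thunk that holds
-- on to `mix`, so `mix` cannot be collected before `g` forces `smalls`.
example :: (Map k a -> IO ()) -> (Map k b -> IO ()) -> Map k (a, b) -> IO ()
example f g mix =
  let (bigs, smalls) = unzipNaive mix
  in  do f bigs
         g smalls
```

The sketch only shows who references what; how long `mix` really stays
alive depends on the optimizer and is best checked with a heap profile.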

And if `g` is just a lookup in a tree, and `smalls` is used later as
well, then the parts of the tree not traversed will still reference
`mix`.

It might be possible to implement unzip in a way so that smalls gets
fully created while bigs is being forced. But that will require careful
code and analysis to verify that it works. And even then a partial
evaluation of `bigs` might break this expectation.
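
One possible shape for such an implementation, as a sketch only: traverse
`mix` once via its ascending list and let the lazy list `unzip` build both
halves together. `unzipViaList` is a hypothetical name, and nothing here
is verified to actually release `mix` earlier; that would need exactly the
careful analysis mentioned above.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- Hypothetical single-traversal variant. The pattern (k, (a, b)) forces
-- each pair, so the small half refers to `b` directly and not to the
-- pair that also holds the big value. This makes it stricter in the
-- pairs than the `fmap fst` / `fmap snd` version.
unzipViaList :: Map k (a, b) -> (Map k a, Map k b)
unzipViaList mix =
  let (ls, rs) = unzip [ ((k, a), (k, b)) | (k, (a, b)) <- Map.toAscList mix ]
  in  (Map.fromDistinctAscList ls, Map.fromDistinctAscList rs)
```

`Map.fromDistinctAscList` is safe here because `Map.toAscList` yields keys
in strictly ascending order. The extra strictness is a real trade-off: a
map containing an undefined pair is fine for the `fmap` version as long as
that value is never inspected, but not for this one.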

OTOH, the `(fmap fst mix, fmap snd mix)` version might be more efficient
if only one component will actually be used and/or if `mix` is going to
stay live anyway. Who knows.

Greetings,
Joachim

-- 
Joachim “nomeata” Breitner
  mail at joachim-breitner.de • http://www.joachim-breitner.de/
  Jabber: nomeata at joachim-breitner.de  • GPG-Key: 0xF0FBF51F
  Debian Developer: nomeata at debian.org
