[Haskell-cafe] is 256M RAM insufficient for a 20 millionelement Int/Int map?

Don Stewart dons at galois.com
Sun Oct 19 19:05:27 EDT 2008


claus.reinke:
> >>> I have a standard Data.Map.Map as the base structure for one of my
> >>> macid data tables (jobs), but I noticed something
> >>> that is probably causing problems for me.
> >>> Even a simple 20 million record with int/int key values causes an out
> >>> of memory error for me in ghci,
> >>Int keys, Int values eh?
> >>Does using IntMap help?
> >Interesting. Map Int Int, IntMap Int and [(Int, Int)] use pretty much
> >the same amount of memory, assuming that no keys or values are shared:
> 
> I haven't followed the thread, but one thing that keeps tripping me
> up wrt memory use is that Maps aren't strict in their values by default.
> 
> Worse, if that is not what you want for your application, working
> around it can be rather awkward (Data.Map at least provides
> insertWith', but there's no unionWith', and Data.IntMap doesn't
> even have insertWith'). 
> 
> Ideally, I'd just like to indicate the strictness in the types (hint to ghc 
> hackers: 'Data.Map.Map Int !Int' and '[!a]' would really be useful!-), 
> but as that isn't supported, separate operations (with seq inserted 
> manually at suitable places) seem necessary (hint to container library 
> maintainers: please provide full set of strict operation variants!-).
> 
> If this is (or contributes to) the problem, it should show up in heap
> profiles as an upward ramp and could be fixed by defining and using
> locally augmented versions of the Map operations. 

I'd like them strict and specialised,

So that:

    data IntMap a = Nil
                  | Tip {-# UNPACK #-} !Key a
                  | Bin {-# UNPACK #-} !Prefix {-# UNPACK #-} !Mask !(IntMap a) !(IntMap a) 

applied as so,

    type T = IntMap {-# UNPACK #-} !Int

would be equivalent to

    data IntMapT = Nil
                 | Tip {-# UNPACK #-} !Key {-# UNPACK #-} !Int 
                 | Bin {-# UNPACK #-} !Prefix {-# UNPACK #-} !Mask !(IntMap a) !(IntMap a) 

where we've avoided an indirection in the Tip nodes. Less space, faster access.

In general, being able to specialise polymorphic structures so they look like unpacked
monomorphic ones would be awesome.

    (!Int, !Bool)     ->   (,) {-# UNPACK #-}!Int {-# UNPACK #-}!Bool

-- Don


More information about the Haskell-Cafe mailing list