Containers and strictness

Felipe Lessa felipe.lessa at gmail.com
Fri Jun 25 17:48:59 EDT 2010


On Thu, Jun 24, 2010 at 09:28:15AM -0400, Edward Kmett wrote:
> On Thu, Jun 24, 2010 at 8:27 AM, Johan Tibell <johan.tibell at gmail.com> wrote:
> > The space overhead per key/value pair is 6 words (48 bytes on a 64-bit
> > architecture) when using lazy values but only 4 words (32 bytes) per
> > key/value pair when using strict (unpacked) values, a 50% difference. This
> > really starts to matter with big enough data sets (as seen in the recent
> > Twitter analysis thread). When working with Big Data it's often desirable to
> > fit as much data in RAM as possible as the result of many algorithms (think
> > machine learning or search ranking) differs with the amount of data you can
> > hold in memory.
> >
> > Something to consider.
> >
>
> I definitely agree that unboxing can help a great deal with performance and
> space utilization.
>
> However, as containers does not currently require any exotic extensions, I
> think that perhaps a type-family-based generic map would belong in another
> 'unboxed-containers' or 'adaptive-containers' package (both of which
> currently exist on hackage), as it dramatically extends the language
> extension footprint of containers, taking it from something that easily runs
> across a wide array of Haskell implementations to something very
> ghc-specific.

If Map was strict in keys and values, we could have

  data Unstrict a = U a

to unstrictify the values.  I don't know if this is a good solution,
though :).
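
A rough sketch of the idea, not a proposal for the actual API.  It
assumes a map that forces each value to weak head normal form on
insertion; Data.Map.Strict from a newer containers stands in for that
here.  The names unU and demo are made up for illustration.

```haskell
import qualified Data.Map.Strict as M

-- Lazy box.  Forcing a value of this type to WHNF only evaluates the
-- constructor U, so the payload stays an unevaluated thunk.
-- This must be `data`: with `newtype`, forcing U x would force x itself.
data Unstrict a = U a

-- Unwrap the box; the payload is evaluated only when the result is demanded.
unU :: Unstrict a -> a
unU (U x) = x

-- A value-strict map holding boxed values.  The `undefined` payload is
-- never demanded, so building and querying the map does not diverge.
demo :: (Int, Int)
demo = (M.size m, unU (m M.! 2))
  where
    m = M.fromList [(1 :: Int, U (undefined :: Int)), (2, U 42)]
```

In GHCi, demo evaluates to (2,42).  The cost is that each boxed value
pays for an extra constructor, so it only makes sense for the values
you really want to keep lazy.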

Cheers,

--
Felipe.

