[Haskell-cafe] Re: Grouping - Map / Reduce
huschi at gmx.org
Wed Mar 25 04:58:14 EDT 2009
the map can't be consumed while it is constructed. At any point during
its construction you don't know for any key in the map if it will appear
in the not cosumed rest of the list again. This means you can't process
any entry of the map because it may change later. The only point when
nothing will change anymore is when the map is completely constructed.
Günther Schmidt schrieb:
> Hi Ketil,
> Ketil Malde schrieb:
>> "Gü?nther Schmidt" <gue.schmidt at web.de> writes:
>>> let say I got an unordered lazy list of key/value pairs like
>>> [('a', 99), ('x', 42), ('a', 33) ... ]
>>> and I need to sum up all the values with the same keys.
>>> So far I wrote a naive implementation, using Data.Map, foldl and
>> Data.Map.fromListWith (+)
>>> The building of this map is of course a bottleneck, the successive
>>> processing needs to wait until the entire list is eventually consumed
>>> the Map is built and flattened again.
>> Sure this is not an artifact of the laziness of foldl?
> well I can't really see how the map could be consumed *while* it's still
> being built, I just don't see it. (I'm using foldl' and insertWith', sry
> for not saying so initially).
>>> Is there another way of doing this, something more "streaming
>>> architecture" like?
>> I don't see how you can do this much better - for a small, fixed set
>> of keys, you could use an (STU) array for the sums, but it depends if
>> the added complexity is worth it. You're already doing a single pass
>> over the data.
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
More information about the Haskell-Cafe