wren ng thornton wren at freegeek.org
Sat Jul 31 16:29:22 EDT 2010

Claude Heiland-Allen wrote:
> On 31/07/10 12:13, wren ng thornton wrote:
>> Stephen Tetley wrote:
>>> wren ng thornton wrote:
>>>> Ben wrote:
>>>>
>>>>> unzipMap :: M.Map a (b, c) -> (M.Map a b, M.Map a c)
>>>>> unzipMap m = (M.map fst m, M.map snd m)
>>>>
>>>> I don't think you can give a more efficient implementation using the
>>>> public
>>>> interface of Data.Map. You need to have a sort of mapping function that
>>>> allows you to thread them together, either via continuations or via a
>>>> primitive:
>>>
>>> Unless I'm missing something. This one has one traversal...
>>>
>>> unzipMap :: Ord a => M.Map a (b, c) -> (M.Map a b, M.Map a c)
>>> unzipMap = M.foldrWithKey fn (M.empty,M.empty)
>>> where
>>> fn k a (m1,m2) = (M.insert k (fst a) m1, M.insert k (snd a) m2)
>>
>> Well, that's one traversal of the original map, but you have to traverse
>> the new maps repeatedly with all those M.insert calls. And since
>> Data.Map is a balanced tree, that could lead to a whole lot of work
>> rebalancing things.
>>
>> However, because we are not altering the set of keys, we are guaranteed
>> that the structure of both new maps will be identical to the structure
>> of the old map. Therefore, with the right primitives, we can keep one
>> finger in each of the three maps and traverse them all in parallel
>> without re-traversing any part of the spine. (The Either and Or variants
>> will have some retraversal as the smart constructors prune out the spine
>> leading to deleted keys. But this is, arguably, necessary.)
>
> Why not something like this (with the correctness proof as an exercise):
>
> \begin{code}
>
> import Data.Map (Map)
> import qualified Data.Map as M
>
> unzipMap :: Map a (b, c) -> (Map a b, Map a c)
> unzipMap m =
>   let (ab, ac) = unzip . map fiddle . M.toAscList \$ m
>   in  (M.fromDistinctAscList ab, M.fromDistinctAscList ac)
>   where
>     fiddle :: (x, (y, z)) -> ((x, y), (x, z))
>     fiddle (x, (y, z)) = ((x, y), (x, z))
>
> \end{code}

That O(n)+O(n) is much better than the O(n)*2*O(log n)
foldrWithKey/insert version. But it's still about the same as the
original 2*O(n) map fst/map snd version. With the primitive I mentioned
we could reduce the constant factor by about half.

--
Live well,
~wren