[Haskell-cafe] Re: ANN: bytestring-trie 0.1.1 (bugfix)

ChrisK haskell at list.mightyreason.com
Sun Jan 4 13:19:18 EST 2009


Question and suggestion:

looking at
http://hackage.haskell.org/packages/archive/bytestring-trie/0.1.1/doc/html/src/Data-Trie.html#Trie

I am questioning your choice of foldr in fromList:

> -- | Convert association list into a trie. On key conflict, values
> -- earlier in the list shadow later ones.
> fromList :: [(KeyString,a)] -> Trie a
> fromList = foldr (uncurry insert) empty

> -- | /O(1)/, The empty trie.
> {-# INLINE empty #-}
> empty :: Trie a
> empty = Empty

> -- | Insert a new key. If the key is already present, overrides the
> -- old value
> {-# INLINE insert #-}
> insert    :: KeyString -> a -> Trie a -> Trie a
> insert     = alterBy (\_ x _ -> Just x)

> -- | Generic function to alter a trie by one element with a function
> -- to resolve conflicts (or non-conflicts).
> alterBy :: (KeyString -> a -> Maybe a -> Maybe a)
>          -> KeyString -> a -> Trie a -> Trie a
> alterBy f_ q_ x_
>     | S.null q_ = mergeBy (\x y -> f_ q_ x (Just y)) (singleton q_ x_) 
>     | otherwise = go q_
>     where

> -- | /O(1)/, Is the trie empty?
> {-# INLINE null #-}
> null :: Trie a -> Bool
> null Empty = True
> null _     = False

So it looks like the reduction is fromList - uncurry insert - alterBy - null.
Let me use insert in place of uncurry insert:

fromList ( (a,1) : ( (b,2) : ( (c,3) : [] ) ) )
(a,1) `insert` ( (b,2) `insert` ( (c,3) `insert` Empty ) ) )

So fromList forces the whole call chain above to be traversed until it hits the 
Empty.  For a large input list this will force the whole list to be allocated 
before proceeding AND the call chain might overflow the allowed stack size in 
ghc.  For a large trie (which is a likely use case) this is a poor situation.

If you use foldl' then the input list is only forced one element at a time.  A 
small change to the lambda that insert passes to adjustBy will retain the same 
semantics of earlier key wins (which are an especially good idea in the foldl' 
case).

Cheers,
   Chris



More information about the Haskell-Cafe mailing list