[Haskell-cafe] [Fwd: Re: ANN: bytestring-trie 0.1.1 (bugfix)]

Sun Jan 4 22:14:21 EST 2009

-------- Original Message --------
Message-ID: <496176CF.3090708 at freegeek.org>
Date: Sun, 04 Jan 2009 21:56:15 -0500
From: wren ng thornton <wren at freegeek.org>
User-Agent: Thunderbird 2.0.0.16 (Macintosh/20080707)
MIME-Version: 1.0
To: ChrisK <haskell at list.mightyreason.com>
Subject: Re: ANN: bytestring-trie 0.1.1 (bugfix)
References: <494C8B52.2040209 at freegeek.org> 
<4960303E.8010102 at freegeek.org> <4960FDA6.3020708 at list.mightyreason.com>
In-Reply-To: <4960FDA6.3020708 at list.mightyreason.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

ChrisK wrote:
> Question and suggestion:
> 
> looking at
> http://hackage.haskell.org/packages/archive/bytestring-trie/0.1.1/doc/html/src/Data-Trie.html#Trie 
> 
> I am questioning your choice of foldr in fromList:
>
> So fromList forces the whole call chain above to be traversed until it 
> hits the Empty.  For a large input list this will force the whole list 
> to be allocated before proceeding AND the call chain might overflow the 
> allowed stack size in ghc.  For a large trie (which is a likely use 
> case) this is a poor situation.
> 
> If you use foldl' then the input list is only forced one element at a 
> time.  A small change to the lambda that insert passes to adjustBy will 
> retain the same semantics of earlier key wins (which are an especially 
> good idea in the foldl' case).

The reason for foldr is to play nice with build/foldr list fusion.
Though you're right that foldl' is often nicer when fusion isn't used.
As noted (in the code) both toList and fromList are just quick mockups
for now. Have you noticed issues with stack overflows or bad
performance? My quick tests don't see any problems:

$> cat FromListTest.hs

----------------------------------------------------------------
module Main (main) where

import qualified Data.Trie as T
import Data.Trie.Convenience (insertIfAbsent)
import Data.List             (foldl')
import qualified Data.ByteString as S
import Data.ByteString.Internal (c2w)

import Microbench
import Control.Exception     (evaluate)

getList, getList'  :: T.KeyString -> Int -> [(T.KeyString, Int)]
getList  xs n = map (\k -> (k,0)) . S.inits . S.take n $ xs
getList' xs n = map (\k -> (k,0)) . S.tails . S.take n $ xs

main :: IO ()
main  = do
     -- 100000 is large enough to trigger Microbench's stop condition,
     -- and small enough to not lock up the system in trying to create it.
     xs <- evaluate $ S.replicate 100000 (c2w 'a')

     microbench "fromListR obverse" (T.null . fromListR . getList xs)
     microbench "fromListL obverse" (T.null . fromListL . getList xs)

     microbench "fromListR reverse" (T.null . fromListR . getList' xs)
     microbench "fromListL reverse" (T.null . fromListL . getList' xs)
----------------------------------------------------------------

$> ghc --make -O2 FromListTest.hs
[1 of 5] Compiling Data.Trie.ByteStringInternal (
Data/Trie/ByteStringInternal.hs, Data/Trie/ByteStringInternal.o )
[2 of 5] Compiling Data.Trie.BitTwiddle ( Data/Trie/BitTwiddle.hs,
Data/Trie/BitTwiddle.o )
[3 of 5] Compiling Data.Trie        ( Data/Trie.hs, Data/Trie.o )
[4 of 5] Compiling Data.Trie.Convenience ( Data/Trie/Convenience.hs,
Data/Trie/Convenience.o )
[5 of 5] Compiling Main             ( FromListTest.hs, FromListTest.o )
Linking FromListTest ...

$> ./FromListTest
* fromListR obverse: ...............
   75.664ns per iteration / 13216.30 per second.
* fromListL obverse: .............
   283.380ns per iteration / 3528.84 per second.
* fromListR reverse: .............
   630.597ns per iteration / 1585.80 per second.
* fromListL reverse: ...............
   74.894ns per iteration / 13352.28 per second.

Don't mind the 4x speed difference, it's an artifact of the order of the
keys being inserted (the cost of splitting an arc vs appending to it).

*Main> let run f l n = T.null . f . l (S.replicate n (c2w 'a')) $ n
*Main>
*Main> run fromListR getList  10     (0.00  secs, 0        bytes)
*Main> run fromListR getList  100    (0.00  secs, 0        bytes)
*Main> run fromListR getList  1000   (0.01  secs, 599116   bytes)
*Main> run fromListR getList  10000  (0.47  secs, 5198576  bytes)
*Main> run fromListR getList  100000 (45.36 secs, 51067440 bytes)
*Main>
*Main> run fromListL getList  10     (0.00  secs, 0          bytes)
*Main> run fromListL getList  100    (0.00  secs, 1063168    bytes)
*Main> run fromListL getList  1000   (0.06  secs, 74587280   bytes)
*Main> run fromListL getList  10000  (10.62 secs, 7438453480 bytes)
*Main> run fromListL getList  100000 Interrupted.
*Main>
*Main> run fromListR getList' 10     (0.00  secs, 0          bytes)
*Main> run fromListR getList' 100    (0.00  secs, 1063196    bytes)
*Main> run fromListR getList' 1000   (0.09  secs, 74587208   bytes)
*Main> run fromListR getList' 10000  (18.19 secs, 7438854556 bytes)
*Main> run fromListR getList' 100000 Interrupted.
*Main>
*Main> run fromListL getList' 10     (0.00  secs, 0        bytes)
*Main> run fromListL getList' 100    (0.00  secs, 0        bytes)
*Main> run fromListL getList' 1000   (0.01  secs, 533072   bytes)
*Main> run fromListL getList' 10000  (0.46  secs, 3791240  bytes)
*Main> run fromListL getList' 100000 (44.71 secs, 38515056 bytes)

Both options seem about the same to me. The foldl' version is somewhat
better in the worst case key ordering (though it changes which ordering
that is). I could offer both as alternatives, but I think it'd be better
to work on a version that is less succeptable to variation for the order
of keys.

-- 
Live well,
~wren

-- 
Live well,
~wren