[Haskell-cafe] OCaml list sees abysmal Language Shootout results
Greg Buchholz
haskell at sleepingsquirrel.org
Wed Sep 29 18:58:24 EDT 2004
Malcolm Wallace wrote:
> Graham Klyne <GK at ninebynine.org> writes:
> > I can see that this requires the original file to be kept for 3-time
> > scanning, so enough memory for the entire file will be required. Is that
> > *the* problem to which you allude? I can't see any other problem
> > here.
>
> Yes, it is the main problem.
<snip>
> In any case, for the shootout, this is patently a different algorithm
> to the one every other solution uses. All the other languages do a
> simple one-pass traversal of the file, in constant memory space. Why
> artificially handicap Haskell by insisting it do the job naively?
Just for the heck of it, I'd thought I'd try to write a naive 1-pass
version of the program. It turned out to be 4.5 times slower than the
original...
-- compiled with: ghc -O2 -ddump-simpl -fvia-c -o wc_fold wc_fold.hs
import IO
main = do file <- getContents
putStrLn (show (foldl wc (0,0,0) file))
wc :: (Int,Int,Int) -> Char -> (Int, Int, Int)
wc (l,w,c) '\n' = (l+1,w ,c+1)
wc (l,w,c) ' ' = (l ,w+1,c+1)
wc (l,w,c) x = (l ,w ,c+1)
The algorithm isn't correct (it counts spaces instead of words), but
anyone have advice for improving its performance? Is the problem
caused by the laziness of the Int's inside the tuple? Here is the
report from ghc with the '-ddump-simpl' option.
Main.wc :: (GHC.Base.Int, GHC.Base.Int, GHC.Base.Int)
-> GHC.Base.Char -> (GHC.Base.Int, GHC.Base.Int,
GHC.Base.Int)
[GlobalId]
Arity 2 __P Main.$wwc NoCafRefs Str: DmdType U(LLL)U(L)m
Main.wc = __inline_me (\ w :: (GHC.Base.Int,
GHC.Base.Int,
GHC.Base.Int)
w1 :: GHC.Base.Char ->
case w of w2 { (ww, ww1, ww2) ->
case w1 of w3 { GHC.Base.C# ww3 ->
case Main.$wwc ww ww1 ww2 ww3 of ww4 { (# ww5,
ww6, ww7 #) ->
(ww5, ww6, ww7)
}
}
})
Rec {
Main.$wlgo :: GHC.Base.Int
-> GHC.Base.Int
-> GHC.Base.Int
-> [GHC.Base.Char]
-> (# GHC.Base.Int, GHC.Base.Int, GHC.Base.Int #)
And here is the memory allocation report generated from passing
'+RTS -M900m -k250m -sstderr' to the executable...
875,150,472 bytes allocated in the heap
149,008,464 bytes copied during GC
250,845,060 bytes maximum residency (2 sample(s))
478 collections in generation 0 ( 10.44s)
2 collections in generation 1 ( 0.00s)
813 Mb total memory in use
INIT time 0.00s ( 0.00s elapsed)
MUT time 0.93s ( 1.03s elapsed)
GC time 10.44s ( 11.38s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 11.37s ( 12.41s elapsed)
%GC time 91.8% (91.7% elapsed)
Alloc rate 941,022,012 bytes per MUT second
Productivity 8.2% of total user, 7.5% of total elapsed
Finally, here's the profiling report (-pT)...
Wed Sep 29 15:21 2004 Time and Allocation Profiling Report (Final)
wc_fold +RTS -M800m -k250m -pT -RTS
total time = 1.22 secs (61 ticks @ 20 ms)
total alloc = 173,502,056 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
wc Main 60.7 64.8
main Main 39.3 35.2
individual inherited
COST CE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 1 0 0.0 0.0 100.0 100.0
main Main 146 1 39.3 35.2 100.0 100.0
wc Main 147 3048000 60.7 64.8 60.7 64.8
CAF GHC.Handle 103 3 0.0 0.0 0.0 0.0
CAF System.Posix.Internals 81 4 0.0 0.0 0.0 0.0
Any hints appreciated,
Greg Buchholz
More information about the Haskell-Cafe
mailing list