[Haskell-cafe] Re: Haskell version of Norvig's Python Spelling Corrector
Albert Y. C. Lai
trebla at vex.net
Sun Apr 22 01:35:50 EDT 2007
I tried using WordSet = [String] (plus the corresponding changes in the
code) and got a great speedup, actually way more than 3x. There was also a
memory growth phenomenon when using Set String, and replacing it with
[String] stops that too: the program now runs in constant space
(constant = 20M). Part of the speedup can possibly be attributed to GHC's
excellent rewrite rules for lists; however, I cannot explain the memory
growth when using Set.
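A minimal sketch of the kind of change described above, not the attached code itself. The names `knownS` and `known` and the filtering step are assumptions chosen for illustration; only `WordSet`, `WordFreq`, `Set String` and `[String]` come from the message. The list version keeps duplicates and preserves order, which the Set version does not.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

type WordFreq = Map.Map String Int

-- Set-based representation (the one being replaced).
type WordSetS = Set.Set String

-- Hypothetical helper: keep only candidates that occur in the frequency map.
knownS :: WordFreq -> WordSetS -> WordSetS
knownS freq = Set.filter (`Map.member` freq)

-- List-based representation: same filter, but duplicates are not removed
-- and the candidates stay in the order they were generated.
type WordSet = [String]

known :: WordFreq -> WordSet -> WordSet
known freq = filter (`Map.member` freq)
```

For example, `known freq ["teh", "the", "cat", "the"]` keeps both occurrences of `"the"` when `freq` contains `"the"` and `"cat"`, whereas `knownS` collapses them.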
Regarding the local WordFreq map under "train", I am shocked that ghc -O
is smart enough to notice it and perform proper sharing, so that only one
copy is ever created. Nonetheless, I still decided to factor "train" into
two functions: one builds the WordFreq and the other queries it, which
eases blame analysis when necessary.
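A sketch of what such a split could look like, under stated assumptions: the name `frequency`, the tokenizer definition, and the default count of 0 for unseen words are illustrative choices, not taken from the attachment. The point is only that the map is built once by one function and then passed explicitly to the function that queries it, so sharing no longer depends on the optimizer.

```haskell
import Data.Char (isAlpha, toLower)
import Data.List (foldl')
import qualified Data.Map.Strict as Map

type WordFreq = Map.Map String Int

-- Assumed tokenizer: lowercase the text and split on non-letters.
tokens :: String -> [String]
tokens = words . map (\c -> if isAlpha c then toLower c else ' ')

-- Build the frequency table once from a list of words.
train :: [String] -> WordFreq
train = foldl' (\m w -> Map.insertWith (+) w 1 m) Map.empty

-- Query the table; unseen words get count 0 (an assumption of this sketch).
frequency :: WordFreq -> String -> Int
frequency freq w = Map.findWithDefault 0 w freq
```

A caller would bind `let freq = train (tokens corpus)` once and then use `frequency freq` wherever counts are needed.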
On the interact line, I use "tokens" to break up the input: it is already
written (for the trainer), so I may as well reuse it.
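A minimal sketch of reusing the tokenizer on the interact line. The definition of `tokens` here and the helper name `respondWith` are assumptions for illustration; the corrector is passed in as a parameter because its real definition lives in the attachment.

```haskell
import Data.Char (isAlpha, toLower)

-- Assumed tokenizer, shared with the trainer: lowercase, split on non-letters.
tokens :: String -> [String]
tokens = words . map (\c -> if isAlpha c then toLower c else ' ')

-- Hypothetical helper: tokenize the whole input, apply a corrector to each
-- word, and emit one result per line.
respondWith :: (String -> String) -> String -> String
respondWith correct = unlines . map correct . tokens
```

With a real corrector in scope, the program's entry point could then be written as `main = interact (respondWith correct)`.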
When reading holmes.txt, be aware that it is in UTF-8, while GHC still
assumes ISO-8859-1. This will affect results.
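For readers on GHC releases later than this message: as far as I know, `System.IO` has provided `hSetEncoding` and `utf8` since GHC 6.12, so the decoding can be made explicit. The helper name `readUtf8File` is an assumption of this sketch and is not part of the attachment.

```haskell
import System.IO

-- Hypothetical helper: read a file, decoding its bytes as UTF-8 regardless
-- of the locale default. The contents are read lazily, so the handle stays
-- open until the string has been fully consumed.
readUtf8File :: FilePath -> IO String
readUtf8File path = do
  h <- openFile path ReadMode
  hSetEncoding h utf8
  hGetContents h
```

The trainer could then call `readUtf8File "holmes.txt"` in place of `readFile "holmes.txt"`.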
I have not checked the correctness of edits1.
I am monochrom.
My modification is attached.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: spell2.hs
Type: text/x-haskell
Size: 1988 bytes
Desc: not available
Url : http://www.haskell.org/pipermail/haskell-cafe/attachments/20070422/02d4f780/spell2-0001.bin