[Haskell-cafe] Re: Write Haskell as fast as C.
Andrew Coppin
andrewcoppin at btinternet.com
Sun May 18 13:04:12 EDT 2008
Andrew Coppin wrote:
> Ketil Malde wrote:
>> Ketil Malde <ketil at malde.org> writes:
>>
>>
>>>> data EvidenceCode = IAC | IUG | IFR | NAC | NR | ... deriving Show
>>>>
>>
>>
>>> Could it be that this derived read instance is somehow very
>>> inefficient?
>>>
>>
>> To answer my own question: this is exactly it, ghc derives less than
>> optimal code in this case. Rather than reiterate the case here, I did
>> a quick writeup at http://blog.malde.org/, and would be quite happy
>> about any feedback or suggestions.
>>
>
> I think you'll find the code that GHC derives for a Read instance
> handles extra whitespace and all kinds of other whatnot that you don't
> actually need in this specific case. I suspect this is what's up - but
> I don't have any hard proof for that. ;-)
I wrote three programs:
One does
data Tag =
Orange |
Lemon |
Lime |
Apple |
Banana |
Pear |
Peach
deriving Read
The other two use
get_tag :: String -> Tag
get_tag "Orange" = Orange
get_tag "Lemon" = Lemon
get_tag "Lime" = Lime
get_tag "Apple" = Apple
get_tag "Banana" = Banana
get_tag "Pear" = Pear
get_tag "Peach" = Peach
get_tag _ = error "not a tag"
and
get_tag :: String -> Tag
get_tag xs = case xs of
[] -> bad
(x:xs1) -> case x of
'A' -> case xs1 of
"pple" -> Apple
_ -> bad
'B' -> case xs1 of
"anana" -> Banana
_ -> bad
'L' -> case xs1 of
"emon" -> Lemon
"ime" -> Lime
_ -> bad
'O' -> case xs1 of
"range" -> Orange
_ -> bad
'P' -> case xs1 of
('e':'r':xs2) -> case xs2 of
"r" -> Pear
"ch" -> Peach
_ -> bad
_ -> bad
_ -> bad
bad = error "not a tag"
I wrote a driver program that reads a file of 1,000,000 tag values.
Using the first version (GHC-derived Read instance) it took about 32
seconds to process. Using the second version (just a bunch of strings to
match, no cleaverness at all) took approximately 1 second. The 3rd
version was so fast I didn't have time to see the window open before it
closed again.
Note that all of this was using plain ordinary Strings, not ByteString
or anything fancy like that.
Note also that the actual documentation for the Prelude even says that
Read is very slow. [Although it says it's slow for reading large
structures, not large numbers of trivial structures.]
None of this is really all that surprising; in the general case, a Read
instance might have to skip over spaces or parse deeply nested brackets
or any number of other things. If you know you don't need to handle all
those cases, write your own parser. It shouldn't be hard to come up with
something faster. [Altough obviously it's kinda tedious.]
More information about the Haskell-Cafe
mailing list