Announcing regex-tre-0.66 and benchmarks
Chris Kuklewicz
haskell at list.mightyreason.com
Thu Aug 10 06:41:29 EDT 2006
Donald Bruce Stewart wrote:
> simonmarhaskell:
>> Chris Kuklewicz wrote:
>>
>>> Your question has prompted me to go back into my PosixRE wrapping code
>>> and compare it to the PCRE code. I have made some changes which ought
>>> to enhance the performance of the PosixRE code. Let us see the new
>>> benchmarks on 10^6 bytes:
>>>
>>> PosixRE
>>> (102363,["bcdcd","cdc"],["bbccd","bcc"])
>>>
>>> real 1m35.429s
>>> user 1m17.862s
>>> sys 0m1.455s
>>>
>>> total is 79.317s
>>>
>>> PCRE
>>> (102363,["bcdcd","cdc"],["bbccd","bcc"])
>>>
>>> real 0m2.570s
>>> user 0m1.702s
>>> sys 0m0.219s
>>>
>>> total is 1.921s
>> So I still don't understand why PCRE should be 40 times faster than
>> PosixRE. Surely this can't be just due to differences in the underlying C
>> library?
>
> It could be. The C regex.h is pretty slow.
>
> http://shootout.alioth.debian.org/gp4/benchmark.php?test=regexdna&lang=all
>
> -- Don
And I notice C++ (g++) gets away with a third-party library from Boost:
> // This implementation of regexdna does not use the POSIX regex
> // included with the GNU libc. Instead it uses the Boost C++ libraries
> //
> // http://www.boost.org/libs/regex/doc/index.html
> //
> // (On Debian: apt-get install libboost-regex-dev before compiling,
> // and then "g++ -O3 -lboost_regex regexdna.cc -o regexdna"
> // Gentoo seems to package boost as, well, 'boost')
Which is a strange precedent.
--
Chris