Proposal: Make a very slight change to the semantics of Data.List.isSuffixOf
David Feuer
david.feuer at gmail.com
Mon Oct 13 18:17:01 UTC 2014
I've done the benchmarks and the results are clear: the implementation I
gave is faster than the one you gave and the one in Data.List in all cases.
Yours is usually faster than the one in Data.List, but slower for very
short lists.
Test code:
needles =
[ ("shortNeedle", ".exe")
,("longNeedle", "And who by fire? Who by water? Who in the sunshine? Who
in the nighttime?")]
haystacks =
[ ("shortHaystack", "C:\\Program Files\\Windows\\Dunno.exe")
,("longHaystack", take 10000 $ map toEnum $ iterate (\x -> (x*131) .&.
0xFFFF) 7)]
impls = [("lib", L.isSuffixOf), ("Feuer", isSuffixOf), ("Abel",
isSuffixOfA)]
tests = [("simple", (\impl (needle,haystack) -> fromEnum $ needle `impl`
haystack)),
("use", (\impl (needle,haystack) ->
if needle `impl` haystack
then sum (map fromEnum haystack)
else product (map fromEnum haystack)))]
main = defaultMain
[
bgroup (unwords [fst test, fst needle, "in", fst haystack])
[bench (fst impl) $ whnf (snd test $ snd impl) (snd needle, snd
haystack)
| impl <- impls] | test <- tests, needle <- needles, haystack <-
haystacks]
Results:
benchmarking simple shortNeedle in shortHaystack/lib
time 244.6 ns (243.9 ns .. 245.6 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 245.2 ns (244.8 ns .. 246.5 ns)
std dev 2.030 ns (950.4 ps .. 4.378 ns)
benchmarking simple shortNeedle in shortHaystack/Feuer
time 234.5 ns (234.1 ns .. 235.1 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 234.6 ns (234.2 ns .. 235.3 ns)
std dev 1.628 ns (678.1 ps .. 2.758 ns)
benchmarking simple shortNeedle in shortHaystack/Abel
time 268.1 ns (267.8 ns .. 268.4 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 268.0 ns (267.8 ns .. 268.4 ns)
std dev 947.8 ps (538.1 ps .. 1.668 ns)
benchmarking simple shortNeedle in longHaystack/lib
time 100.5 μs (100.2 μs .. 100.9 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 100.8 μs (100.5 μs .. 101.4 μs)
std dev 1.511 μs (1.035 μs .. 2.095 μs)
benchmarking simple shortNeedle in longHaystack/Feuer
time 55.66 μs (55.61 μs .. 55.73 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 55.68 μs (55.63 μs .. 55.79 μs)
std dev 222.0 ns (128.7 ns .. 415.0 ns)
benchmarking simple shortNeedle in longHaystack/Abel
time 94.60 μs (94.39 μs .. 94.94 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 94.51 μs (94.44 μs .. 94.68 μs)
std dev 332.8 ns (183.6 ns .. 617.7 ns)
benchmarking simple longNeedle in shortHaystack/lib
time 480.6 ns (480.2 ns .. 481.1 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 480.6 ns (480.1 ns .. 481.6 ns)
std dev 2.217 ns (1.025 ns .. 4.032 ns)
benchmarking simple longNeedle in shortHaystack/Feuer
time 187.0 ns (186.8 ns .. 187.2 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 187.0 ns (186.8 ns .. 187.4 ns)
std dev 922.0 ps (427.3 ps .. 1.598 ns)
benchmarking simple longNeedle in shortHaystack/Abel
time 340.9 ns (339.4 ns .. 343.3 ns)
1.000 R² (0.999 R² .. 1.000 R²)
mean 345.0 ns (341.6 ns .. 351.0 ns)
std dev 14.40 ns (9.622 ns .. 20.77 ns)
variance introduced by outliers: 60% (severely inflated)
benchmarking simple longNeedle in longHaystack/lib
time 100.3 μs (100.2 μs .. 100.5 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 100.4 μs (100.3 μs .. 100.9 μs)
std dev 811.3 ns (333.5 ns .. 1.575 μs)
benchmarking simple longNeedle in longHaystack/Feuer
time 56.29 μs (56.19 μs .. 56.41 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 56.23 μs (56.19 μs .. 56.32 μs)
std dev 197.1 ns (116.3 ns .. 342.1 ns)
benchmarking simple longNeedle in longHaystack/Abel
time 94.46 μs (94.32 μs .. 94.66 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 94.49 μs (94.39 μs .. 94.70 μs)
std dev 459.8 ns (277.2 ns .. 763.7 ns)
benchmarking use shortNeedle in shortHaystack/lib
time 831.8 ns (828.4 ns .. 836.1 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 831.2 ns (829.3 ns .. 834.6 ns)
std dev 8.468 ns (5.220 ns .. 13.26 ns)
benchmarking use shortNeedle in shortHaystack/Feuer
time 819.4 ns (816.7 ns .. 822.7 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 821.2 ns (818.0 ns .. 828.2 ns)
std dev 15.89 ns (7.988 ns .. 25.90 ns)
variance introduced by outliers: 23% (moderately inflated)
benchmarking use shortNeedle in shortHaystack/Abel
time 853.5 ns (851.8 ns .. 855.4 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 852.6 ns (851.5 ns .. 855.6 ns)
std dev 5.462 ns (2.422 ns .. 10.51 ns)
benchmarking use shortNeedle in longHaystack/lib
time 261.8 μs (259.2 μs .. 264.7 μs)
1.000 R² (0.999 R² .. 1.000 R²)
mean 260.2 μs (259.5 μs .. 261.4 μs)
std dev 2.854 μs (1.438 μs .. 4.475 μs)
benchmarking use shortNeedle in longHaystack/Feuer
time 225.0 μs (221.9 μs .. 227.1 μs)
0.999 R² (0.999 R² .. 1.000 R²)
mean 221.0 μs (220.2 μs .. 222.4 μs)
std dev 3.487 μs (2.598 μs .. 4.385 μs)
benchmarking use shortNeedle in longHaystack/Abel
time 244.5 μs (232.5 μs .. 254.3 μs)
0.992 R² (0.990 R² .. 0.997 R²)
mean 232.1 μs (229.4 μs .. 237.7 μs)
std dev 11.45 μs (6.248 μs .. 16.89 μs)
variance introduced by outliers: 47% (moderately inflated)
benchmarking use longNeedle in shortHaystack/lib
time 1.088 μs (1.085 μs .. 1.091 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 1.087 μs (1.086 μs .. 1.090 μs)
std dev 6.936 ns (4.270 ns .. 9.691 ns)
benchmarking use longNeedle in shortHaystack/Feuer
time 787.4 ns (785.8 ns .. 789.9 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 786.7 ns (785.9 ns .. 788.3 ns)
std dev 3.761 ns (1.408 ns .. 6.358 ns)
benchmarking use longNeedle in shortHaystack/Abel
time 930.7 ns (927.7 ns .. 934.0 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 928.3 ns (926.7 ns .. 931.3 ns)
std dev 7.241 ns (3.785 ns .. 13.45 ns)
benchmarking use longNeedle in longHaystack/lib
time 262.3 μs (257.8 μs .. 266.1 μs)
0.999 R² (0.999 R² .. 1.000 R²)
mean 258.6 μs (257.7 μs .. 260.0 μs)
std dev 3.979 μs (2.350 μs .. 5.888 μs)
benchmarking use longNeedle in longHaystack/Feuer
time 224.9 μs (222.1 μs .. 226.9 μs)
0.999 R² (0.999 R² .. 1.000 R²)
mean 221.2 μs (220.4 μs .. 222.3 μs)
std dev 3.168 μs (2.143 μs .. 4.010 μs)
benchmarking use longNeedle in longHaystack/Abel
time 233.1 μs (231.4 μs .. 234.5 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 231.2 μs (230.7 μs .. 232.0 μs)
std dev 2.099 μs (1.419 μs .. 2.627 μs)
On Sun, Oct 12, 2014 at 12:02 PM, Andreas Abel <abela at chalmers.se> wrote:
> If you talk about "additional memory" you assume that after `xs isSuffix
> ys`, ys is no longer needed. Is this the typical use case?
> At least not in the Agda codebase, according to my quick grep...
>
> ./Packaging/Database.hs:
> dbEntries = filter (".conf" `isSuffixOf`) filePaths
>
> ./TypeChecking/Monad/Open.hs:
> unless (ctx `isSuffixOf` ctx') $ fail $ "thing out of context (" ++ show
> ctx ++ " is not a sub context of " ++ show ctx' ++ ")"
>
> ./Interaction/Options.hs:
> isLiterate file = ".lagda" `isSuffixOf` file
>
> ./Syntax/Parser.hs:
> if "lagda" `isSuffixOf` filePath file then
>
> As I said, optimizations should be backed by benchmarks, especially when
> trading a perspicuous implementation for a more complicated one...
>
>
> On 12.10.2014 17:40, David Feuer wrote:
>
>> The haystack and the shared copy are only a needle's-length apart. The
>> first stage traverses H and N until one of them runs out, holding a copy
>> of the beginning of each. This requires at most O(min{|H|, |N|})
>> additional memory (the original needle and the needle-length prefix of
>> the haystack). Assuming we didn't run out of haystack before we ran out
>> of needle (which assumption seems generally likely), we proceed to the
>> next step, traversing H and (drop |N| H) together. During this phase we
>> hold O(|N|) additional memory: the needle itself and the needle-length
>> portion of the longer of the two haystack remnants we hold. Note that in
>> the second phase, we *don't* hold on to the beginning of the haystack,
>> because we will never need it again! Then in the final phase, when the
>> shorter haystack remnant has run out, we still have the same O(|N|)
>> memory, which is now the needle itself and the needle-length suffix of
>> the haystack, and (==) traverses them both together. Thus the total
>> additional memory residency for this algorithm is O(min {|H|, |N|}).
>> Using the length approach requires that you hold the beginning of the
>> haystack for traversal while running length. To put it another way, you
>> force the entire spine of the haystack to run length, and then start
>> from the beginning. If the haystack is produced lazily, this potentially
>> requires O(|H|) extra memory. Since you also check the length of the
>> needle, your total additional memory comes to O(max {|H|, |N|}). I
>> apologize if my horrid variable names obscured what goes on in the
>> algorithm I described.
>>
>> On Sun, Oct 12, 2014 at 10:14 AM, Andreas Abel <abela at chalmers.se
>> <mailto:abela at chalmers.se>> wrote:
>>
>> Well, you also traverse the haystack twice, in your getEndChunk
>> function you simultaneously traverse the haystack and a (shared)
>> copy of it. Why is this so much better?
>>
>> I guess without data (timings, heap-profiles) we do not get further
>> here.
>>
>> On 11.10.2014 14:47, David Feuer wrote:
>>
>> It can be, but your way traverses the entire haystack *twice*. The
>> memory needs are equivalent to the reversing version, which I
>> consider
>> unacceptable.
>>
>>
>> I do not understand this comment. reverse needs O(n) memory, length
>> O(1).
>>
>> Cheers,
>> Andreas
>>
>> On Sat, Oct 11, 2014 at 5:57 AM, Andreas Abel <abela at chalmers.se
>> <mailto:abela at chalmers.se>
>> <mailto:abela at chalmers.se <mailto:abela at chalmers.se>>> wrote:
>>
>> David, the essence of your idea seems mutually drop the
>> correct
>> number of elements from needle and hay and then compare for
>> equality. Here is a concise implementation of your idea in
>> terms of
>> drop:
>>
>> isSuffixOf :: forall a . (Eq a) => [a] -> [a] -> Bool
>> [] `isSuffixOf` _ = True
>> xs `isSuffixOf` ys = xs == drop (length (drop (length xs)
>> ys)) ys
>>
>> This can be further simplified to
>>
>> isSuffixOf :: forall a . (Eq a) => [a] -> [a] -> Bool
>> [] `isSuffixOf` _ = True
>> xs `isSuffixOf` ys = xs == drop (length ys - length xs) ys
>>
>> which is a very easy to understand program, I think,
>> without need to
>> reverse lists.
>>
>>
>>
>> _________________________________________________
>> Libraries mailing list
>> Libraries at haskell.org <mailto:Libraries at haskell.org>
>> http://www.haskell.org/__mailman/listinfo/libraries
>> <http://www.haskell.org/mailman/listinfo/libraries>
>>
>>
>>
>> --
>> Andreas Abel <>< Du bist der geliebte Mensch.
>>
>> Department of Computer Science and Engineering
>> Chalmers and Gothenburg University, Sweden
>>
>> andreas.abel at gu.se <mailto:andreas.abel at gu.se>
>> http://www2.tcs.ifi.lmu.de/~__abel/ <http://www2.tcs.ifi.lmu.de/~
>> abel/>
>>
>>
>>
>>
>> _______________________________________________
>> Libraries mailing list
>> Libraries at haskell.org
>> http://www.haskell.org/mailman/listinfo/libraries
>>
>>
>
> --
> Andreas Abel <>< Du bist der geliebte Mensch.
>
> Department of Computer Science and Engineering
> Chalmers and Gothenburg University, Sweden
>
> andreas.abel at gu.se
> http://www2.tcs.ifi.lmu.de/~abel/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/libraries/attachments/20141013/cffe5e63/attachment-0001.html>
More information about the Libraries
mailing list