Proposal: Make a very slight change to the semantics of Data.List.isSuffixOf
Kim-Ee Yeoh
ky3 at atamo.com
Tue Oct 14 13:05:14 UTC 2014
David,
To paraphrase Kernighan, ascertaining correctness is twice as hard as
writing a program in the first place. So if you're as clever as possible
with your code, how will you know it's even correct?
Case in point: the infinite lists wrinkle that you started the thread with.
/At a glance/ what do the general recursion versions (yours and Milan's)
evaluate to on infinite lists? What are the benchmarks on the time it takes
for a haskell programmer to figure them out? Would they not jump at the
greater affordance of reasoning with Andreas's compositions?
Meaning and reasonableness have always been the hallmarks of core libs.
Least of all, since it's so losering to benchmark the wrong thing, see
below on how Andreas-dropl no.1 beats out David-Milan by the tiniest sliver
in short-needle-long-haystack. Like checking file extensions in pathnames
that we saw in Andreas's Agda code. Here a concrete app brings real meaning
to the timings as opposed to optimizing for vaporware. Watch them Agdaists
fall off their chairs when they hit the speed boost.
Don't mean to preach to the choir. Only layin' down the facts for the sake
of the archive that every boiled haskeller knows to switch datatypes if
isSuffixOf becomes the bottleneck.
benchmarking simple shortNeedle in shortHaystack/lib
time 176.6 ns (176.4 ns .. 177.0 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 176.6 ns (176.5 ns .. 176.9 ns)
std dev 596.3 ps (318.4 ps .. 1.096 ns)
benchmarking simple shortNeedle in shortHaystack/Milan
time 135.6 ns (135.6 ns .. 135.6 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 135.6 ns (135.6 ns .. 135.6 ns)
std dev 45.49 ps (23.88 ps .. 79.14 ps)
benchmarking simple shortNeedle in shortHaystack/Abel
time 145.6 ns (142.5 ns .. 149.5 ns)
0.996 R² (0.996 R² .. 0.997 R²)
mean 147.3 ns (144.8 ns .. 150.0 ns)
std dev 8.616 ns (8.070 ns .. 9.037 ns)
variance introduced by outliers: 76% (severely inflated)
benchmarking simple shortNeedle in longHaystack/lib
time 71.12 μs (71.08 μs .. 71.15 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 71.12 μs (71.08 μs .. 71.16 μs)
std dev 142.7 ns (115.6 ns .. 186.3 ns)
benchmarking simple shortNeedle in longHaystack/Milan
time 45.00 μs (44.99 μs .. 45.00 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 45.00 μs (45.00 μs .. 45.01 μs)
std dev 21.04 ns (15.62 ns .. 28.75 ns)
benchmarking simple shortNeedle in longHaystack/Abel
time 43.68 μs (43.68 μs .. 43.70 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 43.69 μs (43.68 μs .. 43.69 μs)
std dev 18.29 ns (12.79 ns .. 27.76 ns)
benchmarking simple longNeedle in shortHaystack/lib
time 396.3 ns (396.2 ns .. 396.6 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 396.6 ns (396.3 ns .. 397.0 ns)
std dev 1.106 ns (707.9 ps .. 1.662 ns)
benchmarking simple longNeedle in shortHaystack/Milan
time 117.9 ns (117.9 ns .. 118.0 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 117.9 ns (117.9 ns .. 117.9 ns)
std dev 49.08 ps (30.66 ps .. 75.20 ps)
benchmarking simple longNeedle in shortHaystack/Abel
time 125.6 ns (125.6 ns .. 125.6 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 125.6 ns (125.6 ns .. 125.6 ns)
std dev 35.17 ps (19.08 ps .. 58.78 ps)
benchmarking simple longNeedle in longHaystack/lib
time 71.98 μs (71.92 μs .. 72.03 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 71.93 μs (71.85 μs .. 72.00 μs)
std dev 247.4 ns (199.9 ns .. 305.3 ns)
benchmarking simple longNeedle in longHaystack/Milan
time 47.26 μs (47.24 μs .. 47.29 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 47.28 μs (47.27 μs .. 47.29 μs)
std dev 35.43 ns (28.70 ns .. 45.80 ns)
benchmarking simple longNeedle in longHaystack/Abel
time 47.44 μs (47.43 μs .. 47.45 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 47.44 μs (47.44 μs .. 47.45 μs)
std dev 16.81 ns (10.63 ns .. 28.75 ns)
benchmarking use shortNeedle in shortHaystack/lib
time 617.9 ns (616.8 ns .. 618.7 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 617.9 ns (616.9 ns .. 618.4 ns)
std dev 2.295 ns (977.3 ps .. 3.747 ns)
benchmarking use shortNeedle in shortHaystack/Milan
time 570.7 ns (570.6 ns .. 570.8 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 570.7 ns (570.6 ns .. 570.7 ns)
std dev 194.8 ps (154.1 ps .. 262.9 ps)
benchmarking use shortNeedle in shortHaystack/Abel
time 576.8 ns (575.5 ns .. 579.5 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 576.5 ns (575.7 ns .. 578.9 ns)
std dev 5.133 ns (149.4 ps .. 9.882 ns)
benchmarking use shortNeedle in longHaystack/lib
time 194.4 μs (192.0 μs .. 195.9 μs)
0.999 R² (0.999 R² .. 1.000 R²)
mean 190.9 μs (190.2 μs .. 191.9 μs)
std dev 2.789 μs (1.938 μs .. 3.452 μs)
benchmarking use shortNeedle in longHaystack/Milan
time 169.3 μs (164.6 μs .. 171.8 μs)
0.997 R² (0.996 R² .. 0.998 R²)
mean 160.1 μs (158.1 μs .. 162.6 μs)
std dev 7.419 μs (5.720 μs .. 8.512 μs)
variance introduced by outliers: 46% (moderately inflated)
benchmarking use shortNeedle in longHaystack/Abel
time 166.4 μs (162.9 μs .. 168.4 μs)
0.998 R² (0.997 R² .. 0.999 R²)
mean 159.1 μs (157.5 μs .. 161.0 μs)
std dev 5.903 μs (4.471 μs .. 6.721 μs)
variance introduced by outliers: 35% (moderately inflated)
benchmarking use longNeedle in shortHaystack/lib
time 828.0 ns (827.8 ns .. 828.2 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 828.0 ns (827.9 ns .. 828.2 ns)
std dev 582.1 ps (339.9 ps .. 970.5 ps)
benchmarking use longNeedle in shortHaystack/Milan
time 550.0 ns (550.0 ns .. 550.1 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 550.0 ns (550.0 ns .. 550.1 ns)
std dev 223.1 ps (97.52 ps .. 438.8 ps)
benchmarking use longNeedle in shortHaystack/Abel
time 562.1 ns (562.1 ns .. 562.2 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 562.1 ns (562.1 ns .. 562.2 ns)
std dev 109.5 ps (74.48 ps .. 182.3 ps)
benchmarking use longNeedle in longHaystack/lib
time 195.7 μs (193.3 μs .. 197.3 μs)
0.999 R² (0.999 R² .. 1.000 R²)
mean 191.9 μs (191.1 μs .. 193.0 μs)
std dev 3.065 μs (2.169 μs .. 3.825 μs)
benchmarking use longNeedle in longHaystack/Milan
time 170.6 μs (165.7 μs .. 173.4 μs)
0.996 R² (0.995 R² .. 0.998 R²)
mean 160.8 μs (158.6 μs .. 163.5 μs)
std dev 7.928 μs (6.085 μs .. 9.063 μs)
variance introduced by outliers: 49% (moderately inflated)
benchmarking use longNeedle in longHaystack/Abel
time 170.4 μs (165.4 μs .. 173.2 μs)
0.996 R² (0.995 R² .. 0.997 R²)
mean 160.5 μs (158.3 μs .. 163.3 μs)
std dev 8.066 μs (6.200 μs .. 9.280 μs)
variance introduced by outliers: 50% (moderately inflated)
This is 64-bit on a recent i5.
I appreciate learning something new from this discussion, so to pay it
forward please find attached a self-contained lhs for your nano-tweaking
pleasure.
-- Kim-Ee
On Tue, Oct 14, 2014 at 6:02 AM, David Feuer <david.feuer at gmail.com> wrote:
> I've switched my allegiance from my own version to Milan's, because it's a
> tad faster, and also less verbose. One thing to watch out for: although I
> don't particularly like this, the current version in Data.List makes []
> `isSuffixOf` undefined = True. Unless there's a general consensus that
> this doesn't matter, we need to preserve it.
>
> I don't think Milan's version is too terribly verbose. I tested something
> very similar to your #1 that was proposed on IRC by Reid Barton, and the
> numbers just didn't look too wonderful. I don't think Milan's version is
> too terribly verbose, and I think it's clearer than your #2. As for the
> depth of caring about speed, I generally disagree with you: lists are
> actually very good for some purposes, and, moreover, even when they're
> *not*, people use them anyway and then other people wait for that code to
> run.
>
> On Mon, Oct 13, 2014 at 6:10 PM, Kim-Ee Yeoh <ky3 at atamo.com> wrote:
>
>>
>> On Tue, Oct 14, 2014 at 1:17 AM, David Feuer <david.feuer at gmail.com>
>> wrote:
>>
>>> I've done the benchmarks and the results are clear: the implementation I
>>> gave is faster than the one you gave and the one in Data.List in all cases.
>>> Yours is usually faster than the one in Data.List, but slower for very
>>> short lists.
>>
>>
>> The 2x factor you're seeing over Andreas's diminishes once we put
>> slightly more effort into an apples-to-apples comparison.
>>
>> 1. Instead of drop (length xs) ys, let's define the equivalent
>>
>> dropl :: [b] -> [a] -> [a]
>> dropl (_:xs) (_:ys) = dropl xs ys
>> dropl [] ys = ys
>> dropl xs [] = []
>>
>> i.e. dropl xs ys ~ drop (length xs) ys.
>>
>> Now with Andreas's original version:
>>
>> xs `isSuffixOfA` ys = xs == dropl (dropl xs ys) ys
>>
>> that 2x gap narrows down to 10% for most cases.
>>
>> 2. There's that fast path which you optimize for, where the needle is
>> patently too long to be in the haystack. To narrow the semantic gap, we can
>> write
>>
>> dropm :: [b] -> [a] -> Maybe [a]
>> dropm (_:xs) (_:ys) = dropm xs ys
>> dropm [] ys = Just ys
>> dropm _ [] = Nothing
>>
>> xs `isSuffixOfA` ys = maybe False id $ do
>> zs <- dropm xs ys
>> ws <- dropm zs ys -- ws = needle-sized slice of the end of haystack
>> return $ xs == ws
>>
>> Now, the long-needle-short-haystack case also becomes merely 10% off.
>>
>> I'm -1 on both your proposal and the no.2 code because it's too much
>> verbosity for uncertain gain. The benchmarks look good, but is the function
>> even the bottleneck? For folks that care deeply about speed, lists is
>> almost always the wrong datatype anyway.
>>
>> I'm weakly +1 on no.1 because it beats the current double reverse
>> definition!
>>
>> -- Kim-Ee
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/libraries/attachments/20141014/b0a9350d/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: T_isSuffixOf.lhs
Type: text/x-literate-haskell
Size: 2150 bytes
Desc: not available
URL: <http://www.haskell.org/pipermail/libraries/attachments/20141014/b0a9350d/attachment-0001.lhs>
More information about the Libraries
mailing list