Proposal: Make a very slight change to the semantics of Data.List.isSuffixOf
Andreas Abel
abela at chalmers.se
Sun Oct 12 16:02:54 UTC 2014
If you talk about "additional memory" you assume that after `xs isSuffix
ys`, ys is no longer needed. Is this the typical use case?
At least not in the Agda codebase, according to my quick grep...
./Packaging/Database.hs:
dbEntries = filter (".conf" `isSuffixOf`) filePaths
./TypeChecking/Monad/Open.hs:
unless (ctx `isSuffixOf` ctx') $ fail $ "thing out of context (" ++
show ctx ++ " is not a sub context of " ++ show ctx' ++ ")"
./Interaction/Options.hs:
isLiterate file = ".lagda" `isSuffixOf` file
./Syntax/Parser.hs:
if "lagda" `isSuffixOf` filePath file then
As I said, optimizations should be backed by benchmarks, especially when
trading a perspicuous implementation for a more complicated one...
On 12.10.2014 17:40, David Feuer wrote:
> The haystack and the shared copy are only a needle's-length apart. The
> first stage traverses H and N until one of them runs out, holding a copy
> of the beginning of each. This requires at most O(min{|H|, |N|})
> additional memory (the original needle and the needle-length prefix of
> the haystack). Assuming we didn't run out of haystack before we ran out
> of needle (which assumption seems generally likely), we proceed to the
> next step, traversing H and (drop |N| H) together. During this phase we
> hold O(|N|) additional memory: the needle itself and the needle-length
> portion of the longer of the two haystack remnants we hold. Note that in
> the second phase, we *don't* hold on to the beginning of the haystack,
> because we will never need it again! Then in the final phase, when the
> shorter haystack remnant has run out, we still have the same O(|N|)
> memory, which is now the needle itself and the needle-length suffix of
> the haystack, and (==) traverses them both together. Thus the total
> additional memory residency for this algorithm is O(min {|H|, |N|}).
> Using the length approach requires that you hold the beginning of the
> haystack for traversal while running length. To put it another way, you
> force the entire spine of the haystack to run length, and then start
> from the beginning. If the haystack is produced lazily, this potentially
> requires O(|H|) extra memory. Since you also check the length of the
> needle, your total additional memory comes to O(max {|H|, |N|}). I
> apologize if my horrid variable names obscured what goes on in the
> algorithm I described.
>
> On Sun, Oct 12, 2014 at 10:14 AM, Andreas Abel <abela at chalmers.se
> <mailto:abela at chalmers.se>> wrote:
>
> Well, you also traverse the haystack twice, in your getEndChunk
> function you simultaneously traverse the haystack and a (shared)
> copy of it. Why is this so much better?
>
> I guess without data (timings, heap-profiles) we do not get further
> here.
>
> On 11.10.2014 14:47, David Feuer wrote:
>
> It can be, but your way traverses the entire haystack *twice*. The
> memory needs are equivalent to the reversing version, which I
> consider
> unacceptable.
>
>
> I do not understand this comment. reverse needs O(n) memory, length
> O(1).
>
> Cheers,
> Andreas
>
> On Sat, Oct 11, 2014 at 5:57 AM, Andreas Abel <abela at chalmers.se
> <mailto:abela at chalmers.se>
> <mailto:abela at chalmers.se <mailto:abela at chalmers.se>>> wrote:
>
> David, the essence of your idea seems mutually drop the correct
> number of elements from needle and hay and then compare for
> equality. Here is a concise implementation of your idea in
> terms of
> drop:
>
> isSuffixOf :: forall a . (Eq a) => [a] -> [a] -> Bool
> [] `isSuffixOf` _ = True
> xs `isSuffixOf` ys = xs == drop (length (drop (length xs)
> ys)) ys
>
> This can be further simplified to
>
> isSuffixOf :: forall a . (Eq a) => [a] -> [a] -> Bool
> [] `isSuffixOf` _ = True
> xs `isSuffixOf` ys = xs == drop (length ys - length xs) ys
>
> which is a very easy to understand program, I think,
> without need to
> reverse lists.
>
>
>
> _________________________________________________
> Libraries mailing list
> Libraries at haskell.org <mailto:Libraries at haskell.org>
> http://www.haskell.org/__mailman/listinfo/libraries
> <http://www.haskell.org/mailman/listinfo/libraries>
>
>
>
> --
> Andreas Abel <>< Du bist der geliebte Mensch.
>
> Department of Computer Science and Engineering
> Chalmers and Gothenburg University, Sweden
>
> andreas.abel at gu.se <mailto:andreas.abel at gu.se>
> http://www2.tcs.ifi.lmu.de/~__abel/ <http://www2.tcs.ifi.lmu.de/~abel/>
>
>
>
>
> _______________________________________________
> Libraries mailing list
> Libraries at haskell.org
> http://www.haskell.org/mailman/listinfo/libraries
>
--
Andreas Abel <>< Du bist der geliebte Mensch.
Department of Computer Science and Engineering
Chalmers and Gothenburg University, Sweden
andreas.abel at gu.se
http://www2.tcs.ifi.lmu.de/~abel/
More information about the Libraries
mailing list