[Haskell-cafe] Is "take" behaving correctly?
ok
ok at cs.otago.ac.nz
Wed Sep 12 18:27:35 EDT 2007
On 12 Sep 2007, at 8:08 pm, rahn at ira.uka.de wrote:
>
>> take 1000 [1..3] still yields [1,2,3]
>
> You can think about take n as: Take as much as possible, but at
> most n elements. This behavior has some nice properties as turned
> out by others, but there are some pitfalls.
One of the very nice properties is this:
for all n >= 0, m >= 0
take m . take n = take (m+n)
drop m . drop n = drop (m+n)
And another is this:
for all xs, n >= 0
take n xs ++ drop n xs == xs
> length . take n /= const n
>
> in general, instead only
>
> length . take n `elem` map const [0..n]
What we have is
for all xs, n >= 0
n <= length xs => length (take n xs) == n
Suppose we had the "unsafe" take. Would we then have
length . take n == const n?
NO! For some xs, (length . take n) xs would be bottom,
while (const n) xs would be n. As far as I can see, the
strongest statement about length that unsafeTake would
satisfy is
for all xs, n >= 0
n <= length xs => length (unsafeTake n xs) == n
which is, mutatis mutandis, the SAME law that applies to the
`take` we already have.
> Therefore head . length n is unsafe for all n, even strict positive
> ones.
I'm assuming that should have been (head . take n).
The thing is that (head . take n) fails when and only when
(head . unsafeTake n) fails; the only difference is which
function reports the error.
> Moreover, it is easy to produce tricky code. Assume you want to
> know, whether the prefixes of length k are equal and write
>
> pref_eq k xs ys = take k xs == take k ys
>
> This seems to be a straightforward implementation with good
> properties.
Actually, no, at least not if implemented naively. (I wonder just how
clever GHC is with this code? I can't check at the moment. After
heroic efforts by my sysadmin, I was finally able to install the binary
release of GHC 6.6.1, but (a) it assumes that gcc accepts a -fwrapv
flag, which gcc 3.3.2 (TWW) does not accept, and (b) after patching
that, gcc still chokes on the output of GHC.)
The obvious question is what pref_eq k xs ys *should* do when xs
or ys is not at least k long. Does it mean "I believe xs and ys
are at least k long: if they are, check their prefixes, if they
aren't, raise an error" or does it mean "xs and ys are at least
k long and their prefixes of length k are equal"?
> Uhh, this is not what everybody expects at first glance. In
> particular, if
>
> pref_eq k xs ys == False
>
> then *not* necessarily
>
> pref_eq k xs (init ys) == False.
>
> As always, quickCheck is your friend to assure (or reject) such a
> property.
That is certainly so, but the fundamental issue in this example is
not what take does, but "don't leap into coding a function before
you know what it means". We've now seen three readings of what
pref_eq means. In particular,
pref_eq k xs ys = unsafeTake k xs == unsafeTake k ys
would NOT satisfy the law that
pref_eq k xs ys == False
=> pref_eq k xs (init ys) == False
because pref_eq 2 [1,2] [1,3] == False
but pref_eq 2 [1,2] [1] => undefined
So "fixing" take would not, in this case, produce the desired behaviour.
More information about the Haskell-Cafe
mailing list