[Haskell-cafe] Maybe won't let me count

Viktor Dukhovni ietf-dane at dukhovni.org
Mon Mar 29 04:26:29 UTC 2021


On Sun, Mar 28, 2021 at 08:37:13PM -0700, Jon Purdy wrote:

> fmap (1 +) (whereIsBM lx)
> -- or
> (1 +) <$> whereIsBM lx

Or with bounds checks:

    succ <$> whereIsBM lx

but all these variants run out of stack on very long lists due to
failure to be tail recursive.  A more robust elemIndex implementation
would be:

    elemIndex :: (Integral i, Eq a) => a -> [a] -> Maybe i
    elemIndex e = go 0
      where
        go !_   []                   = Nothing
        go !acc (x : xs) | x == e    = Just acc
                         | otherwise = go (succ acc) xs

This is quite pedantic in generalising the index type from `Int` to
`Integral`, and then using `succ` rather than (1 +) to ensure that
overflow is detected:

    λ> :set -XBangPatterns
    λ> import Data.Int (Int8)
    λ>
    λ> :{
    λ>| elemIndex :: (Integral i, Eq a) => a -> [a] -> Maybe i
    λ>| elemIndex e = go 0
    λ>|   where
    λ>|     go !_   []                   = Nothing
    λ>|     go !acc (x : xs) | x == e    = Just acc
    λ>|                      | otherwise = go (succ acc) xs
    λ>| :}
    λ>
    λ> elemIndex 42 [0..] :: Maybe Int8
    Just 42
    λ>
    λ> elemIndex 300 [0..] :: Maybe Int8
    *** Exception: Enum.succ{Int8}: tried to take `succ' of maxBound

More typically/sloppily one would just use "Int" and (+), in the
expectation that Ints are 64 bits or more, and no list one could search
is longer than 2^63-1 elements.  Often that assumption is justified,
but the strictly correct implementation is:

    elemIndex :: Eq a => a -> [a] -> Maybe Integer
    elemIndex e = go 0
      where
        go !_   []                   = Nothing
        go !acc (x : xs) | x == e    = Just acc
                         | otherwise = go (acc + 1) xs

and the user would need to check the value for overflow before
converting to some narrower integral type.  The function is
still however /partial/, in that given the infinite list [0..]
and a negative target value it would now search forever.

Which brings us to the more fundamental observation that if you're using
a (linked) list to search through more than a handful of items you're
very much in a state of sin.  That's simply not the right data structure
for the purpose.  Linked lists should be used primarily for one shot
iteration, rather than indexing, search or repeated traversal.

Any appearance of a list index is a strong signal that the wrong data
structure is in use.  One should probably be using arrays or vectors in
these cases, but in `base` we only have `array` in `GHC.Array`, and the
interface is not friendly to new users.

Thus I applaud Michael Snoyman's quest to address the absense of a basic
array type in the `base` library.  Perhaps more users would stop abusing
lists (memoisable iterators) as an indexed store.

-- 
    Viktor.


More information about the Haskell-Cafe mailing list