[Haskell-cafe] Performance question

Daniel Fischer daniel.is.fischer at web.de
Thu Mar 18 15:49:30 EDT 2010


Am Donnerstag 18 März 2010 19:59:33 schrieb Arnoldo Muller:
> Hello!
>
> I am trying to implement a binary search function that returns the index
> of an
> exact or the (index + 1) where the item should be inserted in an array
> if the item to be searched is not found (I am not trying to insert data
> in the array) .
>
> Right now, the bottleneck of my program is in binarySearch', the
> function must be called a few billion times.

If it's called often, and the arrays are 0-based and Int-indexed,

import Data.Array.Base (unsafeAt)

and replacing ! with `unsafeAt` should give a speed-up, though probably not 
terribly much. If you don't need the polymorphism and your array elements 
are unboxable, using UArray from Data.Array.Unboxed should be significantly 
faster.

>
> Do you have any ideas on how to improve the performance of this
> function?
>
> import Data.Array.IArray
>
> type IntArray a = Array Int a
>
> -- The array must be 0 indexed.
> binarySearch :: Ord a =>  a ->  IntArray a  -> Int
> binarySearch query array =
>     let (low, high) = bounds array
>     in
>        binarySearch' query array low high
>
>
> binarySearch' :: Ord a =>  a ->  IntArray a -> Int -> Int -> Int
> binarySearch' query array !low !high
>
>     | low <= high = let ! mid = low + ((high - low) `div` 2)
>
>                                  ! midVal = array !
> mid
>                                in next mid midVal
>
>     | otherwise = -(low + 1)
>
>     where next mid midVal
>
>                |  midVal < query = binarySearch' query array  (mid + 1)
>                | high midVal > query = binarySearch' query array  low 
>                | (mid - 1) otherwise = mid
>

No obvious performance killers, maybe the 'next' function costs a little 
and

let ...
in case compare midVal query of
    LT -> binarySearch' query array (mid+1) high
    EQ -> mid
    GT -> binarySearch' query array low (mid-1)

would be faster. Or moving binarySearch' from the top-level into 
binarySearch and eliminating the two static arguments may improve 
performance (I seem to remember that a static argument-transform for less 
than three or four non-function arguments can speed the code up or slow it 
down, so you'd have to test; for many arguments or function arguments it's 
pretty certain to give a speed-up, IIRC).

binarySearch query array = go low high
   where
      (low,high) = bounds array
      go !l !h
        | h < l        = -(l+1)
        | mv < query = go l (m-1)
        | mv == query = m
        | otherwise = go (m+1) h
          where
            m = l + (h-l) `quot` 2
            mv = array `unsafeAt` m

> Thank you!
>
> Arnoldo Muller



More information about the Haskell-Cafe mailing list