[Haskell-cafe] Performance question

Daniel Fischer daniel.is.fischer at web.de
Thu Mar 18 16:57:34 EDT 2010

Am Donnerstag 18 März 2010 20:49:30 schrieb Daniel Fischer:
> Am Donnerstag 18 März 2010 19:59:33 schrieb Arnoldo Muller:
> > Hello!
> >
> > I am trying to implement a binary search function that returns the
> > index of an
> > exact or the (index + 1) where the item should be inserted in an array
> > if the item to be searched is not found (I am not trying to insert
> > data in the array) .
> >
> > Right now, the bottleneck of my program is in binarySearch', the
> > function must be called a few billion times.
> If it's called often, and the arrays are 0-based and Int-indexed,
> import Data.Array.Base (unsafeAt)
> and replacing ! with `unsafeAt` should give a speed-up, though probably
> not terribly much. If you don't need the polymorphism and your array
> elements are unboxable, using UArray from Data.Array.Unboxed should be
> significantly faster.
> > Do you have any ideas on how to improve the performance of this
> > function?
> would be faster. Or moving binarySearch' from the top-level into
> binarySearch and eliminating the two static arguments may improve
> performance (I seem to remember that a static argument-transform for
> less than three or four non-function arguments can speed the code up or
> slow it down, so you'd have to test; for many arguments or function
> arguments it's pretty certain to give a speed-up, IIRC).

Yep, for me

{-# LANGUAGE BangPatterns #-}
module SATBinSearch (binarySearch) where

import Data.Array.IArray
import Data.Array.Base (unsafeAt)
import Data.Bits

binarySearch :: Ord a => a -> Array Int a -> Int
binarySearch q a = go l h
        (l,h) = bounds a
        go !lo !hi
            | hi < lo   = -(lo+1)
            | otherwise = case compare mv q of
                            LT -> go (m+1) hi
                            EQ -> m
                            GT -> go lo (m-1)
                m = (lo .&. hi) + (lo `xor` hi) `shiftR` 1
                mv = a `unsafeAt` m

chops ~40% off the time. 'unsafeAt' alone reduces time by ~10%, the local 
loop gives the biggest speedup, and the bit-fiddling instead of

m = lo + (hi-lo) `quot` 2

something like 4%. If you don't like bit-fiddling or want your code to be 
portable to machines that don't use two's complement, the last few percent 
can be left alone.

Contrary to my expectations, however, using unboxed arrays is slower than 
straight arrays (in my tests).

> > Thank you!
> >
> > Arnoldo Muller

More information about the Haskell-Cafe mailing list