[Haskell-cafe] Optimizing array operation?

Wed Mar 4 20:00:43 EST 2009

Dear list,

I decided to extract some common code I use in my data
analysis code and end up with some interesting pattern.

> -- | Confined array operation.  The type Arr here is actually used as
> -- a C-like array with index runs from 0 to n-1.
> type Arr = Array Int
> 
> -- | Element fetching function, no bounds checking.  Works with array
> -- with index [0..n-1]
> (!>) :: Arr e -> Int -> e
> (!>) = unsafeAt
> {-# INLINE (!>) #-}
> 
> -- | Length of the array.
> arrLength :: Arr e -> Int
> arrLength = numElements
> {-# INLINE arrLength #-}
> 
> -- | Sequentially loop over all elements in an array.
> loopArrM :: (Monad m) =>
>             (e -> m a)          -- ^ function to apply to elements
>          -> Arr e               -- ^ the array
>          -> m ()
> loopArrM = loopArrMp (\ _ -> True)
> {-# INLINE loopArrM #-}
> 
> -- | Same as loopArrM but only to those with index satisfies predicate.
> loopArrMp :: (Monad m) =>
>              (Int -> Bool)      -- ^ predicate
>           -> (e -> m a)         -- ^ function to apply to elements
>           -> Arr e              -- ^ the array
>           -> m ()
> loopArrMp p f arr = loopArrMG 0 end p f arr
>     where
>       end = arrLength arr
> {-# INLINE loopArrMp #-}
> 
> -- | Same as loopArrM but only to indices within a range.
> loopArrMr :: (Monad m) =>
>              Int                -- ^ start index
>           -> Int                -- ^ end index
>           -> (e -> m a)         -- ^ function to apply to elements
>           -> Arr e              -- ^ the array
>           -> m ()
> loopArrMr start end = loopArrMG start end (\ _ -> True)
> {-# INLINE loopArrMr #-}
> 
> -- | Generic loopArrM for indices with in a range [start, end-1] and
> -- satisfy predicate.
> loopArrMG :: (Monad m) =>
>              Int                -- ^ start index
>           -> Int                -- ^ end index
>           -> (Int -> Bool)      -- ^ predicate
>           -> (e -> m a)         -- ^ function to apply to elements
>           -> Arr e              -- ^ the array
>           -> m ()
> loopArrMG start end p f arr = go start
>     where
>       go !j | j == end  = return ()
>             | p j       = f (arr !> j) >> go (j + 1)
>             | otherwise = go (j + 1)
> {-# INLINE loopArrMG #-}

It is quite frequently to loop over Arr with various
conditions, so I wrote loopArrMG to do it.  I know a list
comprehension would just do the trick, but the extra time
and heap allocation is not what I want.

My question here is that whether there is a better data
structure available to accomplish such task?  If not, is
there a better way to write the above code?

Thanks,
Xiao-Yong
-- 
    c/*    __o/*
    <\     * (__
    */\      <