[Haskell-beginners] What's an idiomatic Haskell solution to solve the "Maximum Subarray Problem"?

Sat Jul 16 22:40:42 UTC 2016

Hi all,

I've recently been trying to implement the "maximum subarray problem"
from [1] in Haskell. My first, naive solution looked like this:

maxSubArray :: [Int] -> [Int]
maxSubArray []         = []
maxSubArray [x]        = [x]
maxSubArray xs@(_:_:_) = maxArr (maxArr maxHd maxTl) (maxCrossingArray hd tl)
  where
    (hd,tl) = splitAt (length xs `div` 2) xs
    maxHd   = maxSubArray hd
    maxTl   = maxSubArray tl

maxCrossingArray :: [Int] -> [Int] -> [Int]
maxCrossingArray hd tl
  | null hd || null tl = error "maxArrayBetween: hd/tl empty!"
maxCrossingArray hd tl = maxHd ++ maxTl
  where
    maxHd = reverse . foldr1 maxArr . tail $ inits (reverse hd)
      -- we need to go from the center leftwards, which is why we
      -- reverse the list `hd'.
    maxTl = foldr1 maxArr . tail $ inits tl

maxArr :: [Int] -> [Int] -> [Int]
maxArr xs ys
  | sum xs > sum ys = xs
  | otherwise       = ys

While I originally thought that this should run in O(n*log n), a closer
examination revealed that the (++) as well as maxHd and maxTl
computations inside function `maxCrossingArray` are O(n^2), which makes
solving one of the provided test cases in [1] infeasible.

Hence, I rewrote the above code using Data.Array into the following:

data ArraySum = ArraySum {
   from :: Int
 , to   :: Int
 , value :: Int
 } deriving (Eq, Show)

instance Ord ArraySum where
  ArraySum _ _ v1 <= ArraySum _ _ v2 = v1 <= v2

maxSubList :: [Int] -> [Int]
maxSubList xs = take (to-from+1) . drop (from-1) $ xs
  where
    arr = array (1, length xs) [(i,v) | (i,v) <- zip [1..] xs]
    ArraySum from to val = findMaxArr (1, length xs) arr

findMaxArr :: (Int, Int) -> Array Int Int -> ArraySum
findMaxArr (start, end) arr
  | start > end  = error "findMaxArr: start > end"
  | start == end = ArraySum start end (arr ! start)
  | otherwise    = max (max hd tl) (ArraySum leftIdx rightIdx (leftVal+rightVal))
  where
    mid                  = (start + end) `div` 2
    hd                   = findMaxArr (start, mid) arr
    tl                   = findMaxArr (mid+1, end) arr
    (leftIdx, leftVal)   = snd $ findMax mid     [mid-1,mid-2..start]
    (rightIdx, rightVal) = snd $ findMax (mid+1) [mid+2,mid+3..end]
    findMax pos          = foldl' go ((pos, arr ! pos), (pos, arr ! pos))
    go ((currIdx, currSum), (maxIdx, maxSum)) idx
      | newSum >= maxSum = ((idx, newSum), (idx, newSum))
      | otherwise        = ((idx, newSum), (maxIdx, maxSum))
      where newSum = currSum + (arr ! idx)

I believe this runs in O(n*log n) now and is fast enough for the purpose
of solving the Hackerrank challenge [1].

However, I feel this second solution is not very idiomatic Haskell code
and I would prefer the clarity of the first solution over the second, if
somehow I could make it more efficient.

Therefore my question: What would be an efficient, yet idiomatic
solution to solving the "maximum subarray problem" in Haskell? (Note:
I'm aware that this problem can be solved in O(n), but I'm also happy with
idiomatic Haskell solutions running in O(n*log n))

Thanks, Dominik.

[1] https://www.hackerrank.com/challenges/maxsubarray