[Haskell-cafe] Splitting a list

Thu Apr 22 16:02:19 EDT 2004

This is a classic greedy algorithm, much like the text-wrapping
problem.

My main suggestion would be that you're not making use of some standard
list functions that would simplify things.  For example, your
runningSum is just scanl1 (+) .  Similarly, splitAll should use
unfoldr.  Another thing is that I would reverse the order of
arguments of splitFirst and splitAll, since curried applications
are probably more useful that way:

splitAll :: (Real a) => a -> [a] -> [[a]]
splitAll = unfoldr . split
	   where split _ [] = Nothing
                 split n xs = let (ys,zs) = break ((> n) . snd)
                                                  (zip xs (scanl1 (+) xs))
                              in Just (map fst ys, map fst zs)

Now, if you're concerned about all that zipping and projecting,
you can instead define split via a straightforward recursion,
or you could use a different kind of unfold that preserves the
terminating value:

unfoldrG :: (b -> Either (a,b) b) -> b -> ([a],b)
unfoldrG f = unfold
             where unfold x = case f x of
			        Right y -> ([],y)
                                Left (a,y) -> let (bs,z) = unfold y
                                              in (a:bs,z)

Here, you will define split by unfolding a pair consisting of a
running sum and remaining list.

Cheers,
--Joe

On 2004.04.21 07:42, Steve Schafer wrote:
> I have a list of integers, e.g.:
> 
>  [1,5,3,17,8,9]
> 
> I want to split it into a pair of lists, with the criterion being that
> the sum of the elements in the first list is as large as possible, but
> not exceeding a threshold value. For example, if the threshold is 10,
> the result should be:
> 
>  ([1,5,3],[17,8,9])
> 
> and then I want to recursively apply this process to the remainder of
> the list, with the end result being a list of lists of integers. Using
> the same list along with a threshold of 18, I would get:
> 
>  [[1,5,3],[17],[8,9]]
> 
> I have devised a means of doing this:
> 
> 1) Create an auxiliary list of integers, where the n'th element is equal
> to the sum of the first n elements of the original list.
> 
> 2) Zip the auxiliary list with the original list.
> 
> 3) Use span to break the list in two according to the threshold.
> 
> 4) Unzip the two resulting lists and discard the auxiliary portions.
> 
> 5) Repeat from step 1, operating on the tail of the list, until there's
> nothing left.
> 
> Here's the code that implements this:
> 
> runningSum           ::  (Ord a, Num a) => [a] -> [a]
> runningSum []        =   []
> runningSum (i:[])    =   i : []
> runningSum (i:j:js)  =   i : runningSum (i+j : js)
> 
> zipWithSum           ::  (Ord a, Num a) => [a] -> [(a,a)]
> zipWithSum xs        =   zip (runningSum xs) xs
> 
> threshold            ::  (Ord a, Num a) => [a] -> a -> ([(a,a)],[(a,a)])
> threshold xs t       =   let test x = (t >= (fst x))
>                          in span test (zipWithSum xs)
> 
> splitFirst           ::  (Ord a, Num a) => [a] -> a -> ([a],[a])
> splitFirst xs t      =   let (ys,zs) = threshold xs t
>                          in (snd (unzip ys), snd (unzip zs))
> 
> splitAll             ::  (Ord a, Num a) => [a] -> a -> [[a]]
> splitAll [] _        =   []
> splitAll xs t        =   let (ys, zs) = splitFirst xs t
>                          in ys : (splitAll zs t)
> 
> (One thing that's missing from this code is a check to verify that no
> single element in the list is greater than the threshold, which should
> raise an error, rather than get stuck in an infinite loop.)
> 
> The algorithm as implemented works fine, but it seems overly complicated
> and not very elegant. I get the feeling that I'm missing some obvious
> simplification, but I can't find it. Any ideas?
> 
> Thanks,
> 
> -Steve Schafer
> 
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
> 
> 

Joseph H. Fasel, Ph.D.                  email: jhf at lanl.gov
Systems Planning and Analysis           phone: +1 505 667 7158
University of California                fax: +1 505 667 2960
Los Alamos National Laboratory          post: D-2 MS F609; Los Alamos, NM 87545