[Haskell-cafe] Splitting a list
Joe Fasel
jhf at lanl.gov
Thu Apr 22 16:02:19 EDT 2004
This is a classic greedy algorithm, much like the text-wrapping
problem.
My main suggestion would be that you're not making use of some standard
list functions that would simplify things. For example, your
runningSum is just scanl1 (+) . Similarly, splitAll should use
unfoldr. Another thing is that I would reverse the order of
arguments of splitFirst and splitAll, since curried applications
are probably more useful that way:
splitAll :: (Real a) => a -> [a] -> [[a]]
splitAll = unfoldr . split
where split _ [] = Nothing
split n xs = let (ys,zs) = break ((> n) . snd)
(zip xs (scanl1 (+) xs))
in Just (map fst ys, map fst zs)
Now, if you're concerned about all that zipping and projecting,
you can instead define split via a straightforward recursion,
or you could use a different kind of unfold that preserves the
terminating value:
unfoldrG :: (b -> Either (a,b) b) -> b -> ([a],b)
unfoldrG f = unfold
where unfold x = case f x of
Right y -> ([],y)
Left (a,y) -> let (bs,z) = unfold y
in (a:bs,z)
Here, you will define split by unfolding a pair consisting of a
running sum and remaining list.
Cheers,
--Joe
On 2004.04.21 07:42, Steve Schafer wrote:
> I have a list of integers, e.g.:
>
> [1,5,3,17,8,9]
>
> I want to split it into a pair of lists, with the criterion being that
> the sum of the elements in the first list is as large as possible, but
> not exceeding a threshold value. For example, if the threshold is 10,
> the result should be:
>
> ([1,5,3],[17,8,9])
>
> and then I want to recursively apply this process to the remainder of
> the list, with the end result being a list of lists of integers. Using
> the same list along with a threshold of 18, I would get:
>
> [[1,5,3],[17],[8,9]]
>
> I have devised a means of doing this:
>
> 1) Create an auxiliary list of integers, where the n'th element is equal
> to the sum of the first n elements of the original list.
>
> 2) Zip the auxiliary list with the original list.
>
> 3) Use span to break the list in two according to the threshold.
>
> 4) Unzip the two resulting lists and discard the auxiliary portions.
>
> 5) Repeat from step 1, operating on the tail of the list, until there's
> nothing left.
>
> Here's the code that implements this:
>
> runningSum :: (Ord a, Num a) => [a] -> [a]
> runningSum [] = []
> runningSum (i:[]) = i : []
> runningSum (i:j:js) = i : runningSum (i+j : js)
>
> zipWithSum :: (Ord a, Num a) => [a] -> [(a,a)]
> zipWithSum xs = zip (runningSum xs) xs
>
> threshold :: (Ord a, Num a) => [a] -> a -> ([(a,a)],[(a,a)])
> threshold xs t = let test x = (t >= (fst x))
> in span test (zipWithSum xs)
>
> splitFirst :: (Ord a, Num a) => [a] -> a -> ([a],[a])
> splitFirst xs t = let (ys,zs) = threshold xs t
> in (snd (unzip ys), snd (unzip zs))
>
> splitAll :: (Ord a, Num a) => [a] -> a -> [[a]]
> splitAll [] _ = []
> splitAll xs t = let (ys, zs) = splitFirst xs t
> in ys : (splitAll zs t)
>
> (One thing that's missing from this code is a check to verify that no
> single element in the list is greater than the threshold, which should
> raise an error, rather than get stuck in an infinite loop.)
>
> The algorithm as implemented works fine, but it seems overly complicated
> and not very elegant. I get the feeling that I'm missing some obvious
> simplification, but I can't find it. Any ideas?
>
> Thanks,
>
> -Steve Schafer
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>
Joseph H. Fasel, Ph.D. email: jhf at lanl.gov
Systems Planning and Analysis phone: +1 505 667 7158
University of California fax: +1 505 667 2960
Los Alamos National Laboratory post: D-2 MS F609; Los Alamos, NM 87545
More information about the Haskell-Cafe
mailing list