[Haskell-beginners] Combinations of choosing n items from a list
Daniel Fischer
daniel.is.fischer at googlemail.com
Mon Nov 14 01:13:11 CET 2011
On Sunday 13 November 2011, 23:42:25, Peter Hall wrote:
> Hi all,
> This function took me a long time to write, getting my head around the
> double recursion. It returns a list of all possible sub-sets of a
> given length from a list.
>
> I have a couple of questions.
>
> 1. Can it be improved for efficiency or style?
> 2. Is there a library that already has this and other related
> functions? I assumed there would be but I couldn't find it on hoogle.
>
>
> combinations :: Int -> [a] -> [[a]]
> combinations _ [] = []
> combinations 0 _ = []
That should be
combinations 0 _ = [[]]
since there is one sublist of length 0 of any list.
And that equation should come before
combinations _ [] = []
> combinations 1 x = map (:[]) x
This equation is then superfluous.
> combinations n (x:xs) = (map (x:) $ combinations (n-1) xs) ++
> combinations n xs
Some would prefer a list-comprehension here,
combinations n (x:xs)
= [x:comb | comb <- combinations (n-1) xs]
++ combinations n xs
(indentation to prevent wrapping in mail-clients),
but I would leave the decision to personal preference (well, I would use a
list-comprehension, but I've no pains with the map. However, I submit also
map (x:) (combinations (n-1) xs) ++ combinations n xs
for consideration).
As for efficiency, yes, there's something you can do to dramatically
improve performance, if you don't want to treat infinite lists (as it is,
you'd only get sublists with identical (n-1)-long initial segment anyway,
so support for infinite lists isn't ideal).
Consider for example
combinations 25 [1 .. 25]
~> [1:c | c <- ([2:d | d <- combinations 23 [3 .. 25]]
++ combinations 24 [3 .. 25])]
++ combinations 25 [2 .. 25]
For the second part, you go through all 2^24 sublists of [2 .. 25], only to
find that all of them are too short. In the second part of the first part,
you go through all 2^23 sublists of [3 .. 25] to find all of them too
short.
Continuing the splitting, you see that
combinations 25 [1 .. 25]
goes through all 2^25 sublists of [1 .. 25], finding all of them where an
element was discarded anywhere too short.
A lot of wasted work.
If n > length xs, you know that xs has no sublists of length n, so you can
abort the calculation immediately. Since calculating the length requires
traversing the entire list, we'd rather not do that at each step, but carry
the length as a parameter.
combinations 0 _ = [[]]
combinations _ [] = []
combinations n xs
| n < 0 = []
| n == 1 = map (:[]) xs -- not needed, but a wee bit more efficient
| otherwise = helper n (length xs) xs
where
helper k l ys@(z:zs)
| k < l = [z:ws | ws <- helper (k-1) (l-1) zs]
++ helper k (l-1) zs
| k == l = [ys]
| otherwise = [] -- can only occur at the start
Not so nice, but avoids a lot of futile work.
The call to length prohibits infinite lists. You could, at some cost, treat
infinite lists per
combinations 0 _ = [[]]
combinations _ [] = []
combinations n xs@(y:ys)
| n < 0 = []
| otherwise = case drop (n-1) xs of
[ ] -> []
[_] -> [xs]
_ -> [y:c | c <- combinations (n-1) ys]
++ combinations n ys
to get the same result as before for infinite lists. The finite case would
be slower than the previous with the helper and length, but still avoid
most of the futile work when n is close to length xs.
>
> e.g.
>
> > combinations 3 [1..5]
>
> [[1,2,3],[1,2,4],[1,2,5],[1,3,4],[1,3,5],[1,4,5],[2,3,4],[2,3,5],[2,4,5]
> ,[3,4,5]]
>
> Thanks,
> Peter
More information about the Beginners
mailing list