[Haskell-cafe] How to improve its performance ?

Daniel Fischer daniel.is.fischer at web.de
Thu Mar 18 13:16:15 EDT 2010


Am Donnerstag 18 März 2010 04:29:53 schrieb zaxis:
> The time is wasted to run combination even if use `combination (x:xs) =
> concat [(x:ys), ys] | ys <- combination xs] ' instead.
> in ghci
>
> >combination [1..20]
>
> will wait for a long time .......

Hm, really?

Prelude> :set +s                                        
Prelude> let combination [] = [[]]; combination (x:xs) = [x:ys | ys <- 
combination xs] ++ combination xs                                                              
(0.00 secs, 1818304 bytes)                                                              
Prelude> let combs [] = [[]]; combs (x:xs) = concat [[x:ys,ys] | ys <- 
combs xs]  
(0.00 secs, 2102280 bytes)                                                              
Prelude> length [1 .. 2^20]                                                       
1048576                                                                                 
(0.07 secs, 49006024 bytes)                                                             
Prelude> length $ combination [1 .. 20]
1048576                                       
(8.28 secs, 915712452 bytes)                  
Prelude> length $ combs [1 .. 20]
1048576                                
(0.78 secs, 146841964 bytes)

That's interpreted, so not optimised. Optimisation narrows the gap, speeds 
both up significantly, so put


combination :: [a] -> [[a]]
combination [] = [[]]
combination (x:xs) = [x:ys | ys <- combination xs] ++ combination xs

combs :: [a] -> [[a]]
combs [] = [[]]
combs (x:xs) = concat [[x:ys,ys] | ys <- combs xs]

mcombs :: [a] -> [[a]]
mcombs = foldr (flip (>>=) . f) [[]]
      where
        f x xs = [x:xs,xs]

in a file, compile with -O2 and load into ghci:


Prelude Parts> length $ combination [1 .. 20]
1048576
(0.16 secs, 43215220 bytes)
Prelude Parts> length $ combs [1 .. 20]
1048576
(0.05 secs, 55573436 bytes)
Prelude Parts> length $ mcombs [1 .. 20]
1048576
(0.06 secs, 55572692 bytes)
Prelude Parts> length $ combination [1 .. 24]
16777216
(3.06 secs, 674742880 bytes)
Prelude Parts> length $ combs [1 .. 24]
16777216
(0.62 secs, 881788880 bytes)
Prelude Parts> length $ mcombs [1 .. 24]
16777216
(0.62 secs, 881788956 bytes)
Prelude Parts> length [1 .. 2^24]
16777216
(0.64 secs, 675355184 bytes)

So combs and the pointfree combinator version mcombs are equally fast and 
significantly faster than combination. In fact they're as fast as a simple 
enumeration.

Now, if you actually let ghci print out the result, the printing takes a 
long time. So much that the difference in efficiency is hardly discernible 
or not at all.

>
> Daniel Fischer-4 wrote:
> > Am Donnerstag 18 März 2010 00:53:28 schrieb zaxis:
> >> import Data.List
> >>
> >> combination :: [a] -> [[a]]
> >> combination [] =  [[]]
> >> combination (x:xs) =  (map (x:) (combination xs) )++ (combination xs)
> >
> > That would normally be called sublists (or subsets, if one regards
> > lists as
> > representing a set), I think. And, apart from the order in which they
> > are generated, it's the same as Data.List.subsequences (only less
> > efficient).
> >
> >> samp = [1..100]
> >> allTwoGroup = [(x, samp\\x) | x <- combination samp]
> >>
> >> The above code is used to calculate all the two groups from sample
> >> data
> >
> > All partitions into two sublists/sets/samples.
> >
> >> ? It is very slow !
> >
> > I found it surprisingly not-slow (code compiled with -O2, as usual).
> > There are two points where you waste time.
> > First, in
> >
> > combination (x:xs)
> >
> > you calculate (combination xs) twice. If the order in which the
> > sublists come doesn't matter, it's better to do it only once:
> >
> > combination (x:xs) = concat [(x:ys), ys] | ys <- combination xs]
> >
> > Second, (\\) is slow, xs \\ ys is O(length xs * length ys).
> > Also, (\\) requires an Eq constraint. If you're willing to constrain
> > the type further, to (Ord a => [a] -> [([a],[a])]), and call it only
> > on ordered
> > lists, you can replace (\\) by the much faster difference of oredered
> > lists
> > (implementation left as an exercise for the reader).
> >
> > But you can work with unconstrained types, and faster, if you build
> > the two
> > complementary sublists at the same time.
> > The idea is,
> > -- An empty list has one partition into two complementary sublists:
> > partitions2 [] = [([],[])]
> > -- For a nonempty list (x:xs), the partitions into two complementary
> > -- sublists each have x either in the first sublist or in the second.
> > -- Each partition induces a corresponding partition of the tail, xs,
> > -- by removing x from the group in which it appears.
> > -- Conversely, every partition ox xs gives rise to two partitions
> > -- of (x:xs), by adding x to either the first or the second sublist.
> > So partitions2 (x:xs)
> >     = concat [ [(x:ys,zs),(ys,x:zs)] | (ys,zs) <- partitions2 xs ]
> >
> > We can also write the second case as
> >
> > partitions2 (x:xs) = concatMap (choice x) (partitions2 xs)
> >
> > where
> >
> > choice x (ys,zs) = [(x:ys,zs),(ys,x:zs)]
> >
> > Now it's very easy to recognise that partitions2 is a fold,
> >
> > partitions2 xs = foldr (concatMap . choice) [([],[])] xs



More information about the Haskell-Cafe mailing list