[Haskell-beginners] Again on List Manipulation

Sun Sep 12 08:39:51 EDT 2010

El dom, 12-09-2010 a las 13:57 +0200, Lorenzo Isella escribió:
> Dear All,
> First of all, thanks to the whole list for their help.
> I am still struggling with things I used to be able to do quite easily 
> (though maybe not efficiently) with other languages. I still have to get 
> used to the immutable nature of lists and the absence of for loops.
> Let us say you have two lists
> 
> l = [1,1,1,1,2,2,2,2,2,2,2,3,3,5,6,7] (already sorted)
> 
> and a list of corresponding values for every entry in l
> 
> m=  [2,2,2,2,4,4,4,4,6,6,4,4,4,10,12,14].
> Also consider the list of the unique values in l
> 
> l_un = nub l
> 
> Now, what I would like to do is to get a third list, let us call it q, 
> such that its length is equal to the length of l_un.
> On top of that, its i-th entry should be the sum of the entries of m 
> corresponding to the i-th values of l_un.
> To fix the ideas, q should be
> 
> q = [8,32, 8,10,12,14] .
> How would you do that?

First, we build a list that has the wanted correspondence between m and
l. The function 'zip' pairs the i-th element of l with the i-th element
of m:

Prelude Data.List Data.Function> zip l m
[(1,2),(1,2),(1,2),(1,2),(2,4),(2,4),(2,4),(2,4),(2,6),(2,6),(2,4),(3,4),(3,4),(5,10),(6,12),(7,14)]

then, we group by the value of the entry in l. The function 'groupBy'
works just like group, but it allows to specify the predicate by which
to group elements, instead of simply using equality (group === groupBy
(==)). Here, we want to group pairs if their first component is equal,
i.e., our grouping predicate is 

\(a,b) (c,d) -> a == c

or

\x y -> fst x == fst y

or

(==) `on` fst

this gives:

Prelude Data.List Data.Function> groupBy ((==) `on` fst) $ zip l m
[[(1,2),(1,2),(1,2),(1,2)],[(2,4),(2,4),(2,4),(2,4),(2,6),(2,6),(2,4)],[(3,4),(3,4)],[(5,10)],[(6,12)],[(7,14)]]

now, we can forget about the l-values. we are only interested in the
values of the m list. To do this, we extract the second component of
each pair, using the function snd. As these pairs are elements of list,
and we want to extract the second component of all of them, we have to 
'map' the function 'snd' over these lists. Now, these lists are
themselves elements of our list, so we have to 'map' the function 'map
snd' over it:

Prelude Data.List Data.Function> map (map snd) . groupBy ((==) `on` fst) $ zip l m
[[2,2,2,2],[4,4,4,4,6,6,4],[4,4],[10],[12],[14]]

Finally, we only have to sum up the values in these lists. The function
'sum' sums up the values in a list, and as we want to do this for all
the lists in our list, we simply map it:

Prelude Data.List Data.Function> map sum . map (map snd) . groupBy ((==) `on` fst) $ zip l m
[8,32,8,10,12,14]

The last line can be slightly simplified, because 

map f . map g === map (f . g)

to

map (sum . map snd) . groupBy ((==) `on` fst) $ zip l m

	Jürgen