[Haskell-cafe] Ultra-newbie Question

Mon Sep 20 18:11:30 EDT 2010

On Sep 18, 2010, at 7:51 PM, Christopher Tauss wrote:

> Hello Haskell Community -
>  
> I am a professional programmer with 11 years experience, yet I just do not seem to be able to get the hang of even simple things in Haskell.  I am trying to write a function that takes a list and returns the last n elements.
>  
> There may be a function which I can just call that does that, but I am trying to roll my own just to understand the concept.
>  
> Let's call the function n_lastn and, given a list  [1,2,3,4,5], I would like
> n_lastn 3 = [3,4,5]
>  
> Seems like it would be something like:
>  
> n_lastn:: [a]->Int->[a]
> n_lastn 1 (xs) = last(xs)
> n_lastn n (x:xs) = ????
>  
> The issue is I do not see how you can store the last elements of the list.

Part of the reason why you may be having trouble is that you are thinking
about things like "how you can store the last elements of the list".  I'm
having trouble imagining what that might mean.

Let's look at this in completely abstract terms.
A list is a sequence.
A (finite) sequence has a length, and there is a length function to find it.
We can split a sequence into pieces, and there are take and drop functions
to do it.
We have these building blocks:

	length :: [t] -> Int         --length of sequence
	take   :: Int -> [t] -> [t]  --first n elements
	drop   :: Int -> [t] -> [t]  --all BUT the first n elements

The only one of these that gives us a suffix of a list is 'drop'.
So we are going to need

	n_lastn :: Int -> [t] -> [t]
	n_lastn count list = drop ???? list

What is the first argument of drop going to be?
drop wants the number to DISCARD.
You have the number to KEEP.
The number to discard + the number to keep is the total length.
So you will discard (length list - count) elements.
And here we go with a complete answer:

	n_lastn count list = drop (length list - count) list

This is a mathematical definition about (finite) sequences.
We could write it and reason about it even if there had never been
any such thing as a computer.  It doesn't actually matter in the
least HOW a list is stored; this definition will be RIGHT.

Whether it is *efficient* does depend on how a list is stored.
There are questions like
 - how often does the sequence have to be traversed
   (with lists, determining the length involves walking along
   the whole list until the end is found, although it does not
   involve looking at the elements)
 - how much copying is done
   (with arrays, you'd have to make a new array of count elements
   and copy the last count elements of list to it, with lists you
   don't have to copy anything)

I can make this point another way.  n_lastn is a bad name, because
really, it's just the same as `take` except working from the other
end.  So we can define

	reverse_take n = reverse . take n . reverse
	reverse_drop n = reverse . drop n . reverse

"the last n items of a sequence are the first n items of its reversal, reversed"
"all but the last n items of a sequence are all but the first n
items of its reversal, reversed"

and your n_lastn is just reverse_take.  This definition does three list
traversals and two copies.

There's a trick I found very useful in Prolog, and that is to exploit
the homomorphism between lists and natural numbers, where a list can be
used to represent its own length.

Let's take `take` and `drop`.

	take (n+1) (x:xs) = x : take n xs
	take _     _      = []

	drop (n+1) (_:xs) = drop n xs
	drop _     xs     = xs

Instead of passing a natural number as first argument, we'll pass
a list, and the analogue of (n+1) is then (_:k).

	list_take, list_drop :: [any] -> [t] -> [t]

	list_take (_:k) (x:xs) = x : list_take k xs
	list_take _     _      = []

	list_drop (_:k) (_:xs) = list_drop k xs
	list_drop _     xs     = xs

(drop count list) is a list, whose length is the number of elements we
want to discard from the beginning of list.  So we can define

	reverse_take count list = list_drop (drop count list) list

and this does O(length list) work; ONE list traversal and NO copies.

	reverse_drop count list = list_take (drop count list) list

This code was tested thus:
*Main> [reverse_take n "abcd" | n <- [0..4]]
["","d","cd","bcd","abcd"]
*Main> [reverse_drop n "abcd" | n <- [0..4]]
["abcd","abc","ab","a",""]