Export lists in modules

John Meacham john at repetae.net
Thu Feb 23 20:35:40 EST 2006


On Thu, Feb 23, 2006 at 10:58:36AM +0000, Malcolm Wallace wrote:
> Some data-structures people (e.g. Chris Okasaki) are of the opinion that
> lists tend to be over-used (because they are built-in), when other
> datatypes might be much more appropriate.

While this may be true of lesser languages and compilers, for a whole
lot of stuff lists end up faster and more efficient than other data
structures. I know when data.set and data.map first appeared I started
converting things to use them and actually hurt performance in some
cases. unless you can amortize the cost of building a map/set over many
lookups with good coverage and a fair number of values then they tend to
not be worth it.

I think there are a couple reasons for this:

1. amortized bounds on cost only start mattering when your number of
elements gets big, most uses of data structures are for a small number
of elements. 

2. lazyness, when doing an 'elem' on a list, it only needs to evaluate
as much of the list as is needed to find the element. if the list is
built with a 'build' then the list never even needs to exist in memory.
complicated data structures often need to examine all valuse to build
their tree properly.

3. list lookups are O(n). building a order based structures is
O(n*log(n)). this means if you are doing less than log(n)* lookups then a
list is more efficient, perhaps signifigantly so compared to building a
set when combined with the other points. I am guilty of forgetting this
on occasion myself.

4. ghc's list deforestation is friggen awesome. lists are not so much a
data structure in haskell as a programming construct used to express
compositions of routines that work on sequences of items. This alone
makes them quite special and more useful than 'data structures'. 

incidntally, I don't think we have build/foldr rules for Set.from/toList
and Map.from/toList. I think they can help the performance of Set/Map
signifigantly.

* actually n*log(n)/(n - log(n)) which means lists are even faster for
  small numbers of lookups.

        John


-- 
John Meacham - ⑆repetae.net⑆john⑈


More information about the Haskell-prime mailing list