[Haskell-cafe] Advice needed on best way to simulate an STL vector

Wed Apr 19 14:57:45 EDT 2006

On Wednesday 19th April 2006 18:09PM Udo Stenzel wrote:

> Brian Hulley wrote:
>> In C++, STL provides a vector class which behaves as an array except you
>> can insert/delete elements from it.

> Though you shouldn't.  If you constantly insert and delete in the middle
> of a std::vector, you're not using the right data structure.  In fact,
> std::vector is almost always wrong and std::deque would probably serve
> you better.

std::deque only gives fast insert/delete at the ends so for insert/delete in 
the middle it is still slow, and any speedup relative to std::vector might 
be offset by extra slowness in subscripting if multiple physical blocks of 
memory are used to simulate a contiguous array. I could have used a 
std::list (which is doubly linked) but then I'd lose the constant time 
random element access, so in my particular case (which was a text buffer for 
an edit control implemented as a std::vector of lines where each line 
contains some book-keeping info plus a std::vector of character info) the 
std::vector seemed to work out to be the best one to use, since there are 
more read operations (rendering, parsing etc) than write operations (user 
typing a character).

>> I'm wondering what is the best Haskell
>> data structure to use to simulate this, either mutable or immutable.

> The obvious mutable data structure is an (STRef (STArray i a)).  You can
> implement std::vector in terms of that, almost literally translating
> from C++.  If you want Haskell code that looks as ugly as C++, you
> should do exactly that.

I'm keen to learn what the Haskell way is rather than just porting my old 
C++ code directly.

> Immutable array-like thing with insertion and deletion are an
> ill-conceived idea, imho.  Every write operation would require a
> complete copy and often a reallocation, too.

It depends how many write operations there are in practice, versus how many 
times you need to read from it using array access. A reallocation (amortized 
cost O(0)) and copy (a simple memcpy) might be very fast compared to the 
time it might take for generational garbage collection to deal with the 
problem of cells in a previous generation referencing new cells as happens 
in mutable data structures. But of course it's probably not optimal.

> Instead, use some functional sequence implementation, like Finger Trees.
> Operations in the middle of the sequence incur a logarithmic cost, but
> thats better than constantly copying the whole thing around.  Being
> immutable it also results in more idiomatic code where you don't need to
> drag the ST monad around everywhere.  You might also consider a
> Finger Tree of smallish Arrays, that's about the closest equivalent to
> std::deque you can get.

Thanks, I've downloaded a paper about them from 
http://www.informatik.uni-bonn.de/~ralf/publications/FingerTrees.pdf so I'll 
see if I can understand it! Looks interesting...

Best regards, Brian.