# [Haskell-cafe] Advice needed on best way to simulate an STL vector

Christophe Poucet christophe.poucet at gmail.com
Wed Apr 19 14:59:57 EDT 2006

```Either way,

Even an STL vector has O(N) insert because it needss to shift all the
items past the current element where insertion is taking place.  If your
application is insert intensive the most ideal structure is a map.
Concerning the suggestion regarding doubling the capacity, I don't
believe that STL actually doubles the allocated size.  Most applications
that use vector-like data structures (STL Vector, Ruby Array) typically
use the fibonacci sequence for the sizes because the doubling grows too
fast.

Cheers

Cale Gibbard wrote:

>Oops, replace Array there with DiffArray.
>
>On 19/04/06, Cale Gibbard <cgibbard at gmail.com> wrote:
>
>
>>Well, you could use something like C++ does for implementing STL
>>vectors in terms of arrays -- iirc, it generally doubles the allocated
>>size each time applying an operation to the vector would make it
>>exceed its capacity (the current allocation size), and keeps track of
>>the actual number of elements stored separately. It's important to do
>>allocation which is proportional to the existing capacity, or repeated
>>insertion will become quadratic time rather than linear. So
>>essentially, some data structure like
>>
>>data Vector a = Vector { store :: Array Int a, end :: Int }
>>
>>would be a sufficient minimal way to start. When the store needs to be
>>increased, you simply create a new array with twice as many elements,
>>copy the initial elements in from the old array which is then thrown
>>away, and only update end to the position of the actual last element.
>>This is analogous to what C++ implementations of the vector class do.
>>
>>What will bite you is if you try to generalise from indices of type
>>Int to instances of Ix -- the Ix operations assume that there are
>>lower and upper bounds on indices. The upper bound of course quickly
>>becomes a problem. You could however use Enum, which has toEnum and
>>fromEnum, which are sufficient to use with the implementation in terms
>>of Ints. It could also be claimed that Int isn't always the best
>>initial type to use, and indeed I'd still feel safer with Integer
>>somehow, but then, fromEnum and toEnum use Int, and if you have more
>>than 2 billion elements in your vector at the same time, maybe you
>>should be looking at your algorithm more carefully and/or doing your
>>own low level memory allocation via FFI. :)
>>
>> - Cale
>>
>>On 19/04/06, Brian Hulley <brianh at metamilk.com> wrote:
>>
>>
>>>Hi -
>>>In C++, STL provides a vector class which behaves as an array except you can
>>>insert/delete elements from it. I'm wondering what is the best Haskell data
>>>structure to use to simulate this, either mutable or immutable.
>>>
>>>I've looked at the Array interface, but although there is the // operation
>>>to replace some elements, there does not appear to be a simple way to
>>>delete/insert elements.
>>>
>>>Ideally I'd like functions like:
>>>
>>>-- Insert new elements starting at the specified index, moving the others up
>>>to make room
>>>insert:: Array i e -> i -> [e] -> Array i e
>>>
>>>-- Delete a range of elements, moving later elements back down
>>>delete:: Array i e -> i -> i -> Array e
>>>
>>>-- Append a new element to the end of an array
>>>push_back :: Array i e -> e -> Array i e
>>>
>>>Is there an efficient way of implementing these operations in Haskell, based
>>>on arrays, or should I be using some other data structure altogether eg a
>>>list?
>>>
>>>Also, for large arrays, am I right in thinking that it is still more
>>>efficient to use immutable arrays rather than mutable arrays (because it is
>>>easier for gc to always just deal with immutable values)?
>>>
>>>Thanks, Brian.
>>>
>>>_______________________________________________
>>>
>>>
>>>
>_______________________________________________
>
>
>

```