Dmitri O.Kondratiev dokondr at gmail.com
Fri Jul 11 12:07:50 EDT 2008

```I need extendable array to store and count unique vectors. I have a file
containing vectors presented as strings like:
10, 6, 80, 25, 6, 7
1, 2, 15, 17, 33, 22
21, 34, 56, 78, 91, 2
...
(BTW, what is the best library function to use to convert string of digits
into a list or array?)

There are two arrays:
1) Array of unique vectors.
The first vector red from file starts this array. Next vectors are added to
this array if they are sufficiently dissimilar from the already existing in
the array. Similarity is computed as euclidean distance between two
vectors.
2) Array of counts. Length of this array is equal to the length of array
with unique vectors. Thus every vector has a corresponding count. If new
vector is similar to some vector already existing in the first array then
corresponding count is incremented.

For example, array of unique vectors:
[[10, 6, 80, 25, 6, 7],
[1, 2, 15, 17, 33, 22],
[21, 34, 56, 78, 91, 2]]

may have a corresponding counts array:
[5,1,3]

where:
5 tells that my file has 5 vectors similar to vector [10, 6, 80, 25, 6, 7],
Only 1 vector [1, 2, 15, 17, 33, 22],
and 3 vectors similar to  [21, 34, 56, 78, 91, 2]

As long as I don't know a priory how many unique vectors my file has, as
well as how many similar to these unique others exists - I need to start
both arrays with size 0.
Next when I find a similar vector I need to increment count at corresponding
index in 'counts' array. I will also need to  access  vectors at different
index in unique array later.
That's why I need a collection both indexed and able to extend also.

I think that Data.Sequence will do for the task, don't you think?

On Fri, Jul 11, 2008 at 7:23 PM, Chaddaï Fouché <chaddai.fouche at gmail.com>
wrote:

> 2008/7/11 Dmitri O.Kondratiev <dokondr at gmail.com>:
> > How does Data.Sequence
> >
> > compares with ArrayRef for appending and accessing arrays efficiently ?
>
> It doesn't since Data.Sequence doesn't use arrays, it uses a custom
> data structure that allows to do plenty of operations with pretty good
> asymptotic complexity. Be aware though that the constants aren't that
> good and that depending on your usage, lists, array or another
> specialized data structure could be much more efficient.
>
> By comparisons, extending an array is very likely to be much more
> expensive from time to time than adding some elements to your
> Data.Sequence. On the other hand the random access would be much
> faster in an array and even the amortized cost of extending the array
> may not be much worse than the amortized cost on the Data.Sequence in
> some cases.
>
> What exactly are you trying to do ?
>
> --
> Jedaï
>

--
Dmitri O. Kondratiev
dokondr at gmail.com
http://www.geocities.com/dkondr
-------------- next part --------------
An HTML attachment was scrubbed...