[Haskell-cafe] Boxed Mutable Arrays

Tue Dec 15 13:43:54 EST 2009

Serguey,

On Tue, Dec 15, 2009 at 1:07 PM, Serguey Zefirov <sergueyz at gmail.com> wrote:
>>>> now i see what you mean. no, i mean trivial transformation. #650 says
>>>> about slow GC. why it's slow? because once you made any update to the
>>>> array, the entire array is marked as updated and scanned on next minor GC
>>>> (which occurs after every 512 kbytes allocated, afaik). let's replace
>>>> big array (say, of 10,000 elements) with array of 100 arrays of 100
>>>> elements each. now, between minor GCs only some of arrays will be
>>>> changed and entire amount of memory to be scanned will become less
>>> Data.IntMap?
>> I want to implement a more-or-less traditional, mutable, imperative
>> hash table in the ST monad.  Hence, I'm not considering Data.IntMap
>> and other persistent tree structures for its implementation, although
>> I have thought about it.
>> The bug described in Ticket #650, AFAICS, prevents implementation of a
>> reasonable, generic hash table in Haskell.  :-(
>
> Data.IntMap is just a limit of what Bulat suggested.
>
> So what was you thoughts about Data.IntMap in mutable hashmap?
>

I have considered using Data.IntMap to implement a sort of faux hash
table, e.g., introduce a Hashable class, and then use an IntMap to
lists of Hashable.  The result would be a pure, persistent ``hash
table''.  In such a case, however, lookup/insert/delete operations
would have average complexity logarithmic in the number of buckets.

Similarly, one could use an IntMap of STRefs to lists of items, but
that is bizarre.  If the number of buckets is fixed, it would be
better to use an immutable array of STRefs to lists.

One problem with using an IntMap of STRefs to lists or an immutable
array of STRefs to lists as part of a hash table implementation is
that you get an added indirection compared to a boxed STArray of
lists, which would be bad for performance.

To reiterate my desire:  I want an efficient, generic, mutable hash
table data structure for instances where one is already working in ST.
 This data structure should be written in regular Haskell, without
resorting to the FFI.  In cases where one is already writing code
using mutable state, a good hash table implementation could perform
better in terms of time & memory than IntMap, Map, etc.

Sincerely,
Brad