utf8 strings: memory optimization and case-ignoring
comparison
Bulat Ziganshin
bulatz at HotPOP.com
Fri Dec 16 04:38:19 EST 2005
Hello Simon,
Thursday, December 15, 2005, 12:44:35 PM, you wrote:
>> 2. what is the most memory-efficient representation for such strings?
>>
>> data UArray i e = UArray !i !i ByteArray#
SM> I don't know why an extra 8/16 bytes per string is that worrying - if
SM> you have so many small strings perhaps you should be sharing them via a
SM> hash table?
I also use a hash table in another part of the program, but in this list
(the basenames of files on disk) 70% of the strings are unique, so sharing
would save little here.
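For reference, the sharing suggested above could look roughly like the sketch below. It is only an illustration under assumptions: it uses plain String and Data.Map from the containers package rather than a real hash table, and the name internAll is hypothetical, not something from the program discussed here.

```haskell
import Data.List (mapAccumL)
import qualified Data.Map.Strict as Map

-- | Replace every repeated name with the first copy that was seen, so
-- that equal names share one heap object. Also returns the number of
-- distinct names, which shows how much sharing is possible at all.
-- (Hypothetical helper, for illustration only.)
internAll :: [String] -> (Int, [String])
internAll names = (Map.size seen, shared)
  where
    (seen, shared) = mapAccumL intern Map.empty names
    intern table s =
      case Map.lookup s table of
        Just firstCopy -> (table, firstCopy)        -- reuse the earlier copy
        Nothing        -> (Map.insert s s table, s) -- first occurrence
```

With 70% of the names unique, at most about 30% of the strings can be replaced by a shared copy, while the table itself costs extra memory for as long as it is alive.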
SM> ForeignPtr and mallocForeignPtr are the way to go these days. In GHC
SM> 6.6 these will be much faster than before.
Can you please say what the representation of ForeignPtr is in 6.6? GHC
6.4.1 uses PinnedByteArray. As far as I understand, there are
only two alternatives: either use pinned arrays and have access to C
functions that need the address of the memory area, or use an unpinned
ByteArray and do all the processing in Haskell?
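To make the first alternative concrete, here is a minimal sketch of keeping string bytes in memory obtained from mallocForeignPtrBytes and passing its address to a C function. The helper names packPinned and pinnedLength are hypothetical, and the use of strlen with a trailing NUL byte is only an assumption chosen to keep the example small.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Data.Word (Word8)
import Foreign.C.Types (CChar, CSize (..))
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Marshal.Array (pokeArray0)
import Foreign.Ptr (Ptr, castPtr)

-- The C side needs a stable address, so the buffer must not be moved
-- by the garbage collector while the call is running.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: Ptr CChar -> IO CSize

-- | Copy the bytes into a buffer managed through a ForeignPtr and
-- terminate them with a NUL byte. (Hypothetical helper.)
packPinned :: [Word8] -> IO (ForeignPtr Word8)
packPinned bytes = do
  fp <- mallocForeignPtrBytes (length bytes + 1)
  withForeignPtr fp $ \p -> pokeArray0 0 p bytes
  return fp

-- | Ask C for the length of the stored string. (Hypothetical helper.)
pinnedLength :: ForeignPtr Word8 -> IO Int
pinnedLength fp =
  withForeignPtr fp $ \p -> fromIntegral <$> c_strlen (castPtr p)
```

The second alternative would keep the bytes in an unpinned byte array and never hand out an address, so everything such as comparison would have to be written in Haskell.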
How do pinned byte arrays work with the garbage collector? Are they allocated
in special memory blocks so that they are not mixed with movable data?
Can the memory used by these arrays be deallocated and then used
again?
--
Best regards,
Bulat mailto:bulatz at HotPOP.com
More information about the Glasgow-haskell-users
mailing list