utf8 strings: memory optimization and case-ignoring comparison
bulatz at HotPOP.com
Wed Dec 14 15:35:03 EST 2005
I use UTF-8 packed strings in my program and have two questions:
1. I need a function for case-insensitive comparison of such strings.
stricmp is not appropriate because it doesn't know about UTF-8. Can
the existing Unicode support in Data.Char be used for this, or could
appropriate support be added?
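
For illustration, here is a minimal sketch of how Data.Char could be used for question 1: decode the UTF-8 bytes to Chars, lower-case each Char with toLower, and compare the results. The names decodeUtf8 and compareCI are hypothetical, the decoder assumes well-formed input and does no validation, and per-character toLower is only an approximation of full Unicode case folding (it does not handle mappings such as German sharp s to "ss").

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr, toLower)
import Data.Word (Word8)

-- Decode a list of UTF-8 bytes into Chars.
-- Assumption: the input is well-formed UTF-8; nothing is validated here.
decodeUtf8 :: [Word8] -> String
decodeUtf8 [] = []
decodeUtf8 (b : bs)
  | b < 0x80  = chr (fromIntegral b) : decodeUtf8 bs   -- 1-byte (ASCII)
  | b < 0xE0  = multi 1 (fromIntegral b .&. 0x1F)      -- 2-byte sequence
  | b < 0xF0  = multi 2 (fromIntegral b .&. 0x0F)      -- 3-byte sequence
  | otherwise = multi 3 (fromIntegral b .&. 0x07)      -- 4-byte sequence
  where
    -- Fold n continuation bytes (6 payload bits each) into the code point.
    multi :: Int -> Int -> String
    multi n lead =
      let (cont, rest) = splitAt n bs
          code = foldl (\acc c -> (acc `shiftL` 6) .|. (fromIntegral c .&. 0x3F))
                       lead cont
      in chr code : decodeUtf8 rest

-- Case-ignoring comparison: lower-case every decoded Char, then compare.
compareCI :: [Word8] -> [Word8] -> Ordering
compareCI a b = compare (lower a) (lower b)
  where lower = map toLower . decodeUtf8
```

With this sketch, the byte sequences for "Üb" (C3 9C 62) and "üB" (C3 BC 42) should compare as equal, which a byte-wise stricmp would not report.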
2. What is the most memory-efficient representation for such strings?
Currently I use John Meacham's library
(http://repetae.net/john/repos/jhc/PackedString.hs), which declares:
newtype PackedString = PS (UArray Int Word8)
but this uses two Ints just to hold index bounds:
data UArray i e = UArray !i !i ByteArray#
I want to store just a memory pointer and put a NUL at the end of the
array (my strings never contain NUL chars). But what type should I use
for this pointer: ByteArray/ByteArray#, ForeignPtr, StablePtr, or Ptr?
And which function should I use to allocate the memory quickly? My
packed strings will only be unpacked and passed to "unsafe" C functions
(stricmp, strcpy, strcat); I don't plan to use any other operations.
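
To make question 2 concrete, here is a rough sketch of one of the options listed above: a ForeignPtr to a NUL-terminated buffer allocated with mallocForeignPtrBytes, with no bounds stored. The type PackedUtf8 and the functions packBytes, unpackBytes and byteLength are hypothetical names, and strlen stands in for the C functions mentioned above. This is only one possible layout, not necessarily the most compact one.

```haskell
import Data.Word (Word8)
import Foreign.C.String (CString)
import Foreign.C.Types (CSize (..))
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Marshal.Array (peekArray0, pokeArray0)
import Foreign.Ptr (castPtr)

-- Hypothetical representation: only a pointer to NUL-terminated bytes.
-- Assumption (from the post): the payload never contains a NUL byte.
newtype PackedUtf8 = PackedUtf8 (ForeignPtr Word8)

-- "unsafe" C call, used here to show that the buffer is a valid C string.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Allocate length + 1 bytes and write the payload followed by a NUL.
packBytes :: [Word8] -> IO PackedUtf8
packBytes bytes = do
  fp <- mallocForeignPtrBytes (length bytes + 1)
  withForeignPtr fp $ \p -> pokeArray0 0 p bytes
  return (PackedUtf8 fp)

-- Read bytes back up to (not including) the terminating NUL.
unpackBytes :: PackedUtf8 -> IO [Word8]
unpackBytes (PackedUtf8 fp) = withForeignPtr fp (peekArray0 0)

-- Pass the buffer to C; the length is recomputed rather than stored.
byteLength :: PackedUtf8 -> IO Int
byteLength (PackedUtf8 fp) =
  withForeignPtr fp $ \p -> fmap fromIntegral (c_strlen (castPtr p))
```

The trade-off in this sketch is that the length is no longer available in constant time; it has to be recovered by scanning for the NUL, which fits the stated use of only unpacking and calling C string functions.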
Bulat mailto:bulatz at HotPOP.com
More information about the Glasgow-haskell-users