[Haskell] ANNOUNCE: Data.CompactString 0.3 - Unicode ByteString with different encodings
Twan van Laarhoven
twanvl at gmail.com
Sun Mar 11 15:31:35 EDT 2007
Hello all,
I would like to announce version 0.3 of my Data.CompactString library.
Data.CompactString is a wrapper around Data.ByteString that represents a
Unicode string. This new version supports different encodings, as can be
seen from the data type:
> data Encoding a => CompactString a
Currently the following encodings are supported:
- UTF-8, UTF-16 and UTF-32 (both big and little endian)
- ASCII
- ISO-8859-1 (latin1)
- A custom compact encoding
Conversion between different encodings, and between CompactStrings and
ByteStrings, is possible. There are also functions to automatically
detect the encoding of a file based on its byte order mark. This part of
CompactString alone could be used as an encoding library for ByteStrings.
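To illustrate the idea of byte order mark detection, here is a minimal sketch written directly against Data.ByteString. It is not the library's code: the names Bom, boms and detectBom are hypothetical and chosen only for this example. The byte sequences themselves are the standard Unicode byte order marks.

```haskell
import qualified Data.ByteString as B
import Data.ByteString (ByteString)

-- Hypothetical tag type for this sketch; not part of Data.CompactString.
data Bom = Utf8 | Utf16BE | Utf16LE | Utf32BE | Utf32LE
  deriving (Eq, Show)

-- Standard byte order marks. The UTF-32LE mark (FF FE 00 00) is listed
-- before the UTF-16LE mark (FF FE), because the latter is a prefix of it.
-- Note the inherent ambiguity: FF FE 00 00 could also be UTF-16LE text
-- that starts with a NUL character; this sketch prefers UTF-32LE.
boms :: [(Bom, ByteString)]
boms =
  [ (Utf32BE, B.pack [0x00, 0x00, 0xFE, 0xFF])
  , (Utf32LE, B.pack [0xFF, 0xFE, 0x00, 0x00])
  , (Utf8,    B.pack [0xEF, 0xBB, 0xBF])
  , (Utf16BE, B.pack [0xFE, 0xFF])
  , (Utf16LE, B.pack [0xFF, 0xFE])
  ]

-- Return the detected encoding and the input with the mark stripped,
-- or Nothing when the input does not start with a known mark.
detectBom :: ByteString -> Maybe (Bom, ByteString)
detectBom bs =
  case [ (e, B.drop (B.length m) bs) | (e, m) <- boms, m `B.isPrefixOf` bs ] of
    (hit : _) -> Just hit
    []        -> Nothing
```

A caller would typically read the file with B.readFile, pass the contents to detectBom, and fall back to a default encoding when the result is Nothing.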
In addition to overloaded functions like
> length :: Encoding a => CompactString a -> Int,
there are also modules Data.CompactString.UTF8,
Data.CompactString.UTF16, etc. which are restricted to a single encoding:
> length :: CompactString UTF8 -> Int
I expect that these will be more useful in most cases.
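The relation between the overloaded and the single-encoding functions can be sketched with a phantom type parameter. This is only an illustration under my own assumptions, not the actual implementation: the class method lengthBytes, the constructor CS and the names lengthCS and lengthUtf8 are hypothetical, and only two encodings are shown.

```haskell
import qualified Data.ByteString as B
import Data.Bits ((.&.))

-- Empty types used purely as encoding tags (hypothetical for this sketch).
data UTF8
data Latin1

-- The encoding appears only in the type, not in the stored bytes.
newtype CompactString a = CS B.ByteString

-- Hypothetical class: each encoding knows how to count its characters.
class Encoding a where
  lengthBytes :: proxy a -> B.ByteString -> Int

instance Encoding Latin1 where
  -- One byte per character.
  lengthBytes _ = B.length

instance Encoding UTF8 where
  -- Count every byte that is not a continuation byte (10xxxxxx).
  -- Assumes the bytes are valid UTF-8.
  lengthBytes _ = B.length . B.filter (\w -> w .&. 0xC0 /= 0x80)

-- Overloaded version, usable with any encoding.
lengthCS :: Encoding a => CompactString a -> Int
lengthCS s@(CS bs) = lengthBytes s bs

-- Version restricted to a single encoding, as a per-encoding module
-- might export it.
lengthUtf8 :: CompactString UTF8 -> Int
lengthUtf8 = lengthCS
```

With this layout the restricted function is just the overloaded one at a fixed type, so the compiler can resolve the class instance statically.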
The library is now feature complete, but it has not been optimized yet.
There are also some problems with I/O functions, since it is difficult
to determine what kind of encoding should be used given a Handle.
Homepage: http://twan.home.fmf.nl/compact-string/
Haddock: http://twan.home.fmf.nl/compact-string/doc/html/
Source: darcs get http://twan.home.fmf.nl/repos/compact-string
Twan van Laarhoven