[Haskell] ANNOUNCE: Data.CompactString 0.3 - Unicode ByteString with different encodings

Twan van Laarhoven twanvl at gmail.com
Sun Mar 11 15:31:35 EDT 2007


Hello all,

I would like to announce version 0.3 of my Data.CompactString library. 
Data.CompactString is a wrapper around Data.ByteString that represents a 
Unicode string. This new version supports different encodings, as can be 
seen from the data type:

 > data Encoding a => CompactString a

Currently the following encodings are supported:
  - UTF-8
  - UTF-16 and UTF-32 (each in big- and little-endian variants)
  - ASCII
  - ISO-8859-1 (latin1)
  - A custom compact encoding

It is possible to convert between different encodings, and between 
CompactStrings and ByteStrings. There are also functions that detect the 
encoding of a file automatically from its byte order mark. This part of 
CompactString alone could serve as an encoding library for ByteStrings.
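
As a rough illustration of detection by byte order mark, here is a minimal 
sketch that works on a plain ByteString. It is not the library's code: the 
names BomEncoding and detectBom are made up for this example, and the 
actual functions in Data.CompactString may look quite different.

```haskell
import qualified Data.ByteString as B
import Data.ByteString (ByteString)

-- Hypothetical type for this sketch; not part of Data.CompactString.
data BomEncoding = UTF8 | UTF16BE | UTF16LE | UTF32BE | UTF32LE
  deriving (Eq, Show)

-- | Guess the encoding from a leading byte order mark. On success, return
-- the encoding together with the input with the mark stripped off.
detectBom :: ByteString -> Maybe (BomEncoding, ByteString)
detectBom bs = go candidates
  where
    go [] = Nothing
    go ((enc, bom) : rest)
      | B.pack bom `B.isPrefixOf` bs = Just (enc, B.drop (length bom) bs)
      | otherwise                    = go rest
    -- UTF-32LE is tested before UTF-16LE because its mark (FF FE 00 00)
    -- starts with the UTF-16LE mark (FF FE).
    candidates =
      [ (UTF32BE, [0x00, 0x00, 0xFE, 0xFF])
      , (UTF32LE, [0xFF, 0xFE, 0x00, 0x00])
      , (UTF8,    [0xEF, 0xBB, 0xBF])
      , (UTF16BE, [0xFE, 0xFF])
      , (UTF16LE, [0xFF, 0xFE])
      ]
```

Note that this ordering has a known ambiguity: UTF-16LE text whose first 
character after the mark is U+0000 is reported as UTF-32LE. Input without 
a mark yields Nothing, so the caller has to choose a default.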

In addition to overloaded functions like
 > length :: Encoding a => CompactString a -> Int
there are also modules Data.CompactString.UTF8, 
Data.CompactString.UTF16, etc., which are restricted to a single encoding:
 > length :: CompactString UTF8 -> Int
I expect that these will be more useful in most cases.
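
To make the relation between the two kinds of signatures concrete, here is 
a small self-contained sketch using a phantom type parameter. Everything 
in it is an assumption made for illustration: the constructor CS, the 
Latin1 tag, and the names lengthCS and lengthUTF8 are hypothetical, and the 
real class and representation in the library may differ.

```haskell
import Data.Bits ((.&.))
import qualified Data.ByteString as B
import Data.ByteString (ByteString)

-- Hypothetical representation: a ByteString tagged with its encoding.
-- The type parameter is a phantom; it never appears on the right-hand side.
newtype CompactString a = CS ByteString

-- Empty types used only as encoding tags.
data UTF8
data Latin1

-- The overloaded interface: each encoding knows how to count characters.
class Encoding a where
  lengthCS :: CompactString a -> Int

-- Latin-1 uses exactly one byte per character.
instance Encoding Latin1 where
  lengthCS (CS bs) = B.length bs

-- In UTF-8 every character has exactly one byte that is not a
-- continuation byte (10xxxxxx), so counting those counts characters.
-- This assumes the buffer holds valid UTF-8.
instance Encoding UTF8 where
  lengthCS (CS bs) = B.length (B.filter (not . isContinuation) bs)
    where isContinuation w = w .&. 0xC0 == 0x80

-- What a single-encoding module would export: the same function with
-- the encoding fixed, so no class constraint remains in the type.
lengthUTF8 :: CompactString UTF8 -> Int
lengthUTF8 = lengthCS
```

With these definitions, the three bytes 61 C3 A9 count as two characters 
when tagged as UTF8 and as three when tagged as Latin1. The specialized 
version gives simpler types in error messages and lets the compiler pick 
the instance statically.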

The library is now feature complete, but it has not been optimized yet. 
There are also some problems with the I/O functions, since it is difficult 
to determine which encoding should be used for a given Handle.

Homepage:  http://twan.home.fmf.nl/compact-string/
Haddock:   http://twan.home.fmf.nl/compact-string/doc/html/
Source:    darcs get http://twan.home.fmf.nl/repos/compact-string

Twan van Laarhoven


