Text in Haskell: a second proposal

09 Aug 2002 10:13:47 +0200

Ken Shan <ken@digitas.harvard.edu> writes:

> I suggest that the following Haskell types be used for the five items
> above:
> 
>  1. Word8
>  2. CChar
>  3. CodePoint
>  4. Word16
>  5. Char
> 
> On most machines, Char will be a wrapper around Word8.  (This
> contradicts the present language standard.)

Can you point out any machine where this is not the case?  One with a
Haskell implementation, or likely to have one in the future?

If not, I don't see much point, and I agree with Ashley that "real" IO
should be restricted to [Word8].

I like the Encoding data structure, though. 

>    data Encoding text code
>        = Encoding { encode :: [text] -> Maybe [code]
>                   , decode :: [code] -> Maybe [text] }
>
>    utf8     :: Encoding CodePoint Word8
>    iso88591 :: Encoding CodePoint Word8

Perhaps it could be changed to

        data Encoding text code 
                = Encoding { encode :: text -> Maybe code, ...}

so that

        utf8 :: Encoding String [Word8]

but more importantly

        jpeg :: Encoding Image [Word8]
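
To make the generalization concrete, here is a rough sketch in which the
"text" side is not a list at all.  The `decimal` value, and its use of
`show` and `readMaybe`, are my own illustration and not part of Ken's
proposal:

```haskell
import Text.Read (readMaybe)

-- The generalized record: both sides are arbitrary types, not lists.
data Encoding text code = Encoding
  { encode :: text -> Maybe code
  , decode :: code -> Maybe text
  }

-- Hypothetical example: an Int rendered as its decimal digits.
-- Encoding always succeeds here; decoding fails on non-numeric input.
decimal :: Encoding Int String
decimal = Encoding
  { encode = Just . show
  , decode = readMaybe
  }
```

With this, `decode decimal "42"` gives `Just 42`, while
`decode decimal "x"` gives `Nothing`.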

Perhaps [Word8], if it is the basis for IO, should be the target for
*all* Encodings?  And can encoding really fail?  How about:

        data Encoding text -- or rather, 'data_item' or something?
                = Encoding {encode :: text -> [Word8],
                            decode :: [Word8] -> Maybe text}

?
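
A minimal sketch of how a value of this last type might look, assuming a
hypothetical `word16be` that writes 16-bit units as big-endian bytes (the
name and the byte order are just my choices for illustration).  Encoding
is total, and decoding fails only when a byte is left over:

```haskell
import Data.Bits (shiftL, shiftR, (.|.))
import Data.Word (Word16, Word8)

-- Every encoding targets [Word8]; only decoding can fail.
data Encoding a = Encoding
  { encode :: a -> [Word8]
  , decode :: [Word8] -> Maybe a
  }

-- Hypothetical example: 16-bit units serialized high byte first.
word16be :: Encoding [Word16]
word16be = Encoding { encode = concatMap split, decode = unsplit }
  where
    -- Split one unit into its high and low bytes.
    split :: Word16 -> [Word8]
    split w = [fromIntegral (w `shiftR` 8), fromIntegral w]

    -- Rebuild units from byte pairs; an odd trailing byte is an error.
    unsplit :: [Word8] -> Maybe [Word16]
    unsplit (hi : lo : rest) =
      ((fromIntegral hi `shiftL` 8 .|. fromIntegral lo) :) <$> unsplit rest
    unsplit []  = Just []
    unsplit [_] = Nothing
```

So `encode word16be [0x0102]` is `[1,2]`, and `decode word16be [1]` is
`Nothing`.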

-kzm
-- 
If I haven't seen further, it is by standing in the footprints of giants