[Haskell-cafe] invalid character encoding
ross at soi.city.ac.uk
Sat Mar 19 20:33:44 EST 2005
On Sat, Mar 19, 2005 at 07:14:25PM +0000, Ian Lynagh wrote:
> In the below, it looks like there is a bug in getDirectoryContents.
Yes, now fixed in CVS.
> Also, the error from w.hs is going to stdout, not stderr.
It's a nuisance, but no one has got around to changing it.
> Most importantly, though: is there any way to remove this file without
> doing something like an FFI import of unlink?
>
> Is there anything LC_CTYPE can be set to that will act like C/POSIX but
> accept 8-bit bytes as chars too?
en_GB.iso88591 (or indeed any .iso88591 locale) will match the old
behaviour (and the GHC behaviour).
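The reason an ISO-8859-1 locale behaves this way is that every byte value is the code point of exactly one Char, so decoding can never fail and the round trip is lossless. A minimal sketch of that mapping (latin1Decode and latin1Encode are names made up for illustration, not Hugs or GHC functions):

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper: decode bytes as ISO-8859-1.
-- Each byte is taken as the code point of one Char, so this never fails.
latin1Decode :: [Word8] -> String
latin1Decode = map (chr . fromIntegral)

-- Hypothetical helper: encode back to bytes.
-- Gives Nothing if some Char lies outside the 8-bit range.
latin1Encode :: String -> Maybe [Word8]
latin1Encode = mapM enc
  where
    enc c
      | ord c < 256 = Just (fromIntegral (ord c))
      | otherwise   = Nothing

main :: IO ()
main = do
  -- The bytes of "fo\xC2o", which is not valid UTF-8.
  let name = [0x66, 0x6F, 0xC2, 0x6F]
  -- The round trip gives back exactly the original bytes.
  print (latin1Encode (latin1Decode name) == Just name)
```

This prints True: the stray 0xC2 byte survives the trip through String, which is what lets a program name the file again.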
Indeed it's possible to have filenames (under POSIX, anyway) that H98
programs can't touch (under Hugs). That pretty much follows from
the Haskell definition FilePath = String. The other thread under this
subject has touched on the need for an (additional) API using an abstract
FilePath type.
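To make the idea concrete, here is a rough sketch of what such an abstract type might look like. Everything in it (RawPath, fromBytes, toBytes, displayPath) is hypothetical and not an existing or proposed API; the only point is that the path keeps the bytes the OS supplied and is converted to String for display only.

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical abstract path type: it holds the raw bytes unchanged.
-- In a real library the constructor would not be exported.
newtype RawPath = RawPath [Word8]
  deriving (Eq)

-- Wrap bytes obtained from the OS (e.g. from a directory listing).
fromBytes :: [Word8] -> RawPath
fromBytes = RawPath

-- Recover the exact bytes to hand back to the OS.
toBytes :: RawPath -> [Word8]
toBytes (RawPath bs) = bs

-- For display only: bytes outside ASCII are shown as '?'.
-- This conversion is lossy and is never used to name a file.
displayPath :: RawPath -> String
displayPath (RawPath bs) = map shown bs
  where
    shown b
      | b < 0x80  = chr (fromIntegral b)
      | otherwise = '?'

main :: IO ()
main = do
  let p = fromBytes [0x66, 0x6F, 0xC2, 0x6F]
  putStrLn (displayPath p)
  print (toBytes p == [0x66, 0x6F, 0xC2, 0x6F])
```

Because no decoding happens on the way in or out, a name with an invalid byte sequence can still be passed to something like a remove operation, whatever the locale.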
> Now consider this e.hs:
>
> --------------------
> import IO
>
> main = do hWaitForInput stdin 10000
>           putStrLn "Input is ready"
>           r <- hReady stdin
>           print r
>           c <- hGetChar stdin
>           print c
>           putStrLn "Done!"
> --------------------
>
> $ { printf "\xC2\xC2\xC2\xC2\xC2\xC2\xC2"; sleep 30; } | runhugs e.hs
> Input is ready
> True
>
> Program error: <stdin>: IO.hGetChar: protocol error (invalid character encoding)
> $
>
> It takes 30 seconds for this error to be printed. This shows two issues:
> First of all, I think you should be giving an error as soon as you have
> a prefix that is the start of no character. Second, hReady now only
> guarantees hGetChar won't block on a binary mode handle, but I guess
> there is not much we can do except document that (short of some hideous
> hacks).
Yes, I don't see how to avoid this when using mbtowc() to do the
conversion: it makes no distinction between a bad byte sequence and an
incomplete one.
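For comparison, a decoder that knows the encoding can tell the two cases apart. The sketch below assumes UTF-8 input and is not how Hugs does it (Hugs goes through mbtowc(), which is locale-dependent); classify and the Prefix type are made-up names, and the check is simplified: it ignores overlong forms and surrogates.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8)

-- Result of looking at the bytes read so far (hypothetical type).
data Prefix
  = Complete Int   -- a whole character occupying this many bytes
  | Incomplete     -- could still become a character with more bytes
  | Invalid        -- cannot be the start of any character
  deriving (Eq, Show)

-- Length of a UTF-8 sequence, judged by its lead byte alone.
seqLength :: Word8 -> Maybe Int
seqLength b
  | b < 0x80               = Just 1
  | b >= 0xC2 && b <= 0xDF = Just 2
  | b >= 0xE0 && b <= 0xEF = Just 3
  | b >= 0xF0 && b <= 0xF4 = Just 4
  | otherwise              = Nothing

-- Continuation bytes have the bit pattern 10xxxxxx.
isContinuation :: Word8 -> Bool
isContinuation b = b .&. 0xC0 == 0x80

-- Classify the bytes available so far as the start of one character.
classify :: [Word8] -> Prefix
classify []       = Incomplete
classify (b : bs) =
  case seqLength b of
    Nothing -> Invalid
    Just n
      | not (all isContinuation have) -> Invalid
      | length have == n - 1          -> Complete n
      | otherwise                     -> Incomplete
      where have = take (n - 1) bs

main :: IO ()
main = do
  print (classify [0xC2])        -- Incomplete
  print (classify [0xC2, 0xC2])  -- Invalid
  print (classify [0xC2, 0xA9])  -- Complete 2
```

With the input from e.hs, the second 0xC2 byte is already enough to report Invalid, so under these assumptions the error would not have to wait for the pipe to close.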