[Haskell-cafe] Filename encoding error (was: Perform a research a
la Unix 'find')
Daniel Fischer
daniel.is.fischer at web.de
Sun Aug 22 15:00:29 EDT 2010
On Sunday 22 August 2010 19:23:03, Yves Parès wrote:
> In fact the encoding problem is more general.
>
> When I simply do 'readFile "bar/fooé"', then I'm told:
> *** Exception: bar/fooé: openFile: does not exist (No such file or
> directory)
Try
ghci> readFile (Data.ByteString.Char8.unpack
(Data.ByteString.UTF8.fromString "fooé"))
(same trick for find).
The problem is probably that readFile filePath truncates the characters in
filePath to 8 bits while the filepath on your system is UTF-8 encoded, so
you have to give a pseudo-UTF-8 encoded filepath to readFile.
At least, that's how it works here, inconvenient though it is.
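
To make the trick concrete, here is a minimal sketch of the same re-encoding using only base, assuming the GHC 6.12 behaviour described above (each Char of the path cut down to one byte). The names encodeChar and pseudoUtf8 are made up for illustration and come from no library:

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- | UTF-8 encode a single character and return every resulting byte
-- as its own Char (all of them below '\256').
-- Hypothetical helper, not a library function.
encodeChar :: Char -> String
encodeChar c
  | n < 0x80    = [chr n]
  | n < 0x800   = map chr [0xC0 .|. (n `shiftR` 6), cont n]
  | n < 0x10000 = map chr [0xE0 .|. (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  | otherwise   = map chr [ 0xF0 .|. (n `shiftR` 18), cont (n `shiftR` 12)
                          , cont (n `shiftR` 6), cont n ]
  where
    n = ord c
    -- continuation byte: 10xxxxxx carrying the low six bits of x
    cont x = 0x80 .|. (x .&. 0x3F)

-- | Turn a FilePath into the "pseudo-UTF-8" form: one Char per UTF-8 byte,
-- so that truncating each Char to 8 bits yields the real UTF-8 bytes.
pseudoUtf8 :: FilePath -> FilePath
pseudoUtf8 = concatMap encodeChar
```

With that, readFile (pseudoUtf8 "bar/fooé") should hand the bytes 'f' 'o' 'o' 195 169 to the system, since 'é' (code point 233) encodes to the two bytes 195 and 169. If your readFile already encodes file paths according to the locale, this would encode twice and must not be used.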
>
> How am I supposed to read files whose names contain non-ASCII
> characters? (I use GHC 6.12.3 under Ubuntu 10.04 32bits)
While the inconvenience lasts (people are thinking about how to handle the
situation correctly), avoid non-ASCII characters in filepaths if possible.
> My locale is fr_FR.utf8
> For instance, with HSH:
> I have a 'bar' directory, containing a file 'fooé'
>
> run $ "find bar" :: IO [String]
> returns me : ["bar", "bar/foo*\233*"]
That one is okay, 'é' is '\233' and the Show instance for Char escapes all
characters > '\127'.
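
A small sketch of the difference between the String itself and what its Show instance prints (the names path and shown are only for illustration):

```haskell
-- The path as a Haskell String: the last character is a single Char, 'é'.
path :: String
path = "bar/foo\233"

-- What 'show' (and hence ghci, or printing a [String]) produces:
-- the quotes are added and 'é' is written as the four characters \233.
shown :: String
shown = show path
```

So length path is 7, while shown contains a literal backslash followed by the digits 2, 3, 3. Nothing is wrong with the string, only the display is escaped.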
>
> and run $ "find bar -name fooé"
> returns []
Maybe the same issue, try
run $ "find bar -name foo\195\169"
>
> When I provoke an error by running:
> run $ "find fooé"
> it says :
> find: "foo*\351*": No file or directory
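On the other hand, if it now says \351, which read as a decimal escape is
ş, there seems to be something else amiss. (Unless find prints the byte in
octal: octal 351 is decimal 233, which would again be 'é' truncated to a
single byte.)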
>
> So it is not the same encoding!