[GHC] #9114: Invalid UTF8 not round-tripped correctly
GHC
ghc-devs at haskell.org
Thu May 15 09:26:48 UTC 2014
#9114: Invalid UTF8 not round-tripped correctly
-------------------------------------+------------------------------------
Reporter: nomeata | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: libraries/base | Version: 7.6.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture: Unknown/Multiple
Type of failure: None/Unknown | Difficulty: Unknown
Test Case: | Blocked By:
Blocking: | Related Tickets:
-------------------------------------+------------------------------------
Description changed by tibbe:
Old description:
> As reported by Robert Bihlmeyer at http://bugs.debian.org/748125, the
> promised round-tripping of invalid UTF8 sequences in filenames through
> String does not work:
>
> ```
> $ mkdir foo
> $ touch foo/$(echo -e '\xC0\xB7.txt')
> $ ghc -e 'System.Directory.getDirectoryContents "foo" >>= print . last'
> "7.txt"
> ```
>
> The byte sequence 0xC0 0xB7 is an invalid (overlong) encoding of U+0037,
> i.e. `'7'`, so if it is decoded to `'7'`, no round-tripping is possible.
> (Other invalid byte sequences are round-tripped.)
New description:
As reported by Robert Bihlmeyer at http://bugs.debian.org/748125, the
promised round-tripping of invalid UTF8 sequences in filenames through
String does not work:
{{{
$ mkdir foo
$ touch foo/$(echo -e '\xC0\xB7.txt')
$ ghc -e 'System.Directory.getDirectoryContents "foo" >>= print . last'
"7.txt"
}}}
The byte sequence 0xC0 0xB7 is an invalid (overlong) encoding of U+0037,
i.e. `'7'`, so if it is decoded to `'7'`, no round-tripping is possible:
encoding `'7'` back yields the single byte 0x37, not the original two
bytes. (Other invalid byte sequences are round-tripped.)
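To illustrate why the overlong sequence collapses to `'7'`, here is a minimal sketch of a two-byte UTF-8 decoder in two variants. It is not the decoder used in base; the names `decode2Naive` and `decode2Strict` are hypothetical and exist only for this example. The naive variant combines the payload bits without checking the result, while the strict variant rejects any two-byte sequence whose code point would fit in one byte.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical helper: decode a two-byte UTF-8 sequence by combining the
-- 5 payload bits of the lead byte with the 6 payload bits of the
-- continuation byte, without checking for overlong encodings.
decode2Naive :: Word8 -> Word8 -> Maybe Char
decode2Naive b1 b2
  | b1 .&. 0xE0 == 0xC0 && b2 .&. 0xC0 == 0x80 =
      Just (chr (((fromIntegral b1 .&. 0x1F) `shiftL` 6)
                 .|. (fromIntegral b2 .&. 0x3F)))
  | otherwise = Nothing

-- Hypothetical helper: same as above, but reject overlong sequences,
-- i.e. code points below 0x80 that must be encoded as a single byte.
decode2Strict :: Word8 -> Word8 -> Maybe Char
decode2Strict b1 b2 =
  case decode2Naive b1 b2 of
    Just c | fromEnum c >= 0x80 -> Just c
    _                           -> Nothing

main :: IO ()
main = do
  print (decode2Naive  0xC0 0xB7)  -- Just '7': the overlong form is accepted
  print (decode2Strict 0xC0 0xB7)  -- Nothing: the overlong form is rejected
  print (decode2Strict 0xC3 0xA9)  -- Just '\233': a valid two-byte sequence
```

A decoder that rejects the sequence, as the strict variant does, leaves the two bytes available for whatever escaping mechanism handles other invalid input, which is what round-tripping requires.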
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/9114#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler