[Haskell-cafe] Escaping of string literals

Michael Snoyman michael at snoyman.com
Sun May 29 05:19:10 CEST 2011


On Sun, May 29, 2011 at 4:06 AM, Yitzchak Gale <gale at sefer.org> wrote:
> Michael Snoyman wrote:
>>    main = do
>>        fromAddr <- unsafePackAddressLen 7
>>            $(return $ LitE $ StringPrimL "123\0\&456")
>>        print fromAddr
>>        let fromStr = S.pack $ map (toEnum . fromEnum)
>>                $(return $ LitE $ StringL "123\0\&456")
>>        print fromStr
>>
>> I get the result:
>>
>>    "123\192\128\&45"
>>    "123\NUL456"
>
> Well, the haddocks for StringPrimL say:
>  A primitive C-style string, type Addr#
>
> You obviously can't have a null byte in the middle
> of a C-style string. So GHC is replacing it with an
> invalid UTF-8 representation of a null byte, the
> best it can do under the circumstances.
> Then you just get those bytes back
> when you read them as a byte string.
>

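That makes sense, thanks. I hadn't noticed that GHC was encoding the
literal with UTF-8 rules: \192\128 is 0xC0 0x80, the overlong two-byte
form of NUL, and the length of 7 then cuts off the final '6'.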
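
For anyone following along, here is a minimal sketch of the encoding that
seems to be in play, written against base only. It is not GHC's actual
code: encodeChar and encodeString are hypothetical names made up for this
illustration, and the assumption is that the literal is encoded as ordinary
UTF-8 except that NUL becomes the overlong pair 0xC0 0x80.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical helper: encode one character as UTF-8, except that NUL is
-- written as the overlong pair 0xC0 0x80 so the output has no zero byte.
encodeChar :: Char -> [Word8]
encodeChar c
  | n == 0      = [0xC0, 0x80]
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading-byte payload: the bits above the given shift.
    hi s = fromIntegral (n `shiftR` s)
    -- Continuation byte: 10xxxxxx carrying six bits of the code point.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Hypothetical helper: encode a whole string.
encodeString :: String -> [Word8]
encodeString = concatMap encodeChar

main :: IO ()
main = do
  let bytes = encodeString "123\0\&456"
  print bytes           -- [49,50,51,192,128,52,53,54]
  print (take 7 bytes)  -- [49,50,51,192,128,52,53]
```

The encoded literal is 8 bytes long, so reading 7 bytes yields
"123\192\128\&45", which matches the first result above.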

Michael


