[Haskell-cafe] Escaping of string literals
Michael Snoyman
michael at snoyman.com
Sun May 29 05:19:10 CEST 2011
On Sun, May 29, 2011 at 4:06 AM, Yitzchak Gale <gale at sefer.org> wrote:
> Michael Snoyman wrote:
>> main = do
>>     fromAddr <- unsafePackAddressLen 7 $(return $ LitE $
>>         StringPrimL "123\0\&456")
>>     print fromAddr
>>     let fromStr = S.pack $ map (toEnum . fromEnum) $(return $ LitE
>>             $ StringL "123\0\&456")
>>     print fromStr
>>
>> I get the result:
>>
>> "123\192\128\&45"
>> "123\NUL456"
>
> Well, the haddocks for StringPrimL say:
> A primitive C-style string, type Addr#
>
> You obviously can't have a null byte in the middle
> of a C-style string. So GHC is replacing it with an
> invalid UTF-8 representation of a null byte, the
> best it can do under the circumstances.
> Then you just get those bytes back
> when you read them as a byte string.
>
That makes sense, thanks. I hadn't noticed that the literal was being
encoded following UTF-8 rules.
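
For the archives, here is a rough sketch of why the first output comes out
as "123\192\128\&45". It assumes the literal is stored as Modified UTF-8,
where NUL becomes the overlong two-byte sequence 0xC0 0x80 (192 128); that
is consistent with the output above, but I have not checked it against the
GHC source. The names encodeChar and encodeString are made up for this
example. The 7-character string then occupies 8 bytes, so reading only 7
bytes drops the final '6'.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical helper: encode one character as Modified UTF-8.
-- The only difference from standard UTF-8 is the first guard,
-- which gives NUL the overlong two-byte form 0xC0 0x80.
encodeChar :: Char -> [Word8]
encodeChar c
  | n == 0      = [0xC0, 0x80]
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n = ord c

    -- Leading bits of the code point, placed in the first byte.
    hi :: Int -> Word8
    hi s = fromIntegral (n `shiftR` s)

    -- Continuation byte: 10xxxxxx carrying six bits of the code point.
    cont :: Int -> Word8
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Hypothetical helper: encode a whole string.
encodeString :: String -> [Word8]
encodeString = concatMap encodeChar

main :: IO ()
main = do
  let bytes = encodeString "123\0\&456"
  print bytes           -- [49,50,51,192,128,52,53,54]
  print (take 7 bytes)  -- [49,50,51,192,128,52,53]
```

The second line printed corresponds to what unsafePackAddressLen 7 read
back: the two-byte NUL uses up one of the seven bytes, so the length passed
would have to be 8 to recover all of the data.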
Michael