[commit: ghc] ghc-7.10: base: fix #10298 & #7695 (25b8478)
Thomas Miedema
thomasmiedema at gmail.com
Fri May 29 22:28:33 UTC 2015
I ran 'make accept' on those two tests (they were failing for me locally),
and now the build is green again. This was the change:
https://phabricator.haskell.org/rGHCa138fa1aa9fe2b6499d023ebff4e0fd2f0f1cac8
On Fri, May 29, 2015 at 5:35 PM, Sylvain Henry <hsyl20 at gmail.com> wrote:
> It depends on your locale, this explains the different behavior on your
> local machine.
>
> The build machine seems to be using an ASCII locale and GHC wants to print
> some Unicode characters (the tick and back-tick surrounding types in error
> messages). Before this patch, Iconv was used to do the conversion from
> Unicode to ASCII. It seems that it replaces the Unicode ticks with ASCII
> ticks (i.e. 0xE28098 in UTF-8 into 0x60 and 0xE28099 into 0x27).
>
> If you enter:
> cd ghc/testsuite/tests
> grep -rn "match expected type" **/*.stderr
>
> You can see that some .stderr have been generated with a ASCII locale and
> some others with a UTF-8 locale by looking at the ticks. Are the tests
> expecting ASCII output (e.g. driver/T2507) passing on platforms with UTF-8
> locales? The output is not equal to the expected one except if it is
> converted to ASCII before the comparison.
>
> With this patch, we don't use Iconv to convert from Unicode to ASCII
> because it may not be available in some contexts (docker containers,
> initramdisk, etc.). Instead we use the UTF-8 encoder to encode ASCII (ASCII
> characters are encoded in the same way in ASCII and in UTF-8) and we don't
> try to match Unicode only characters to ASCII ones.
>
> Solutions:
> 1) change our patch to use our method only when Iconv cannot be used.
> 2) implement the Unicode to ASCII conversion as performed by Iconv
> 3) change the locale to a UTF-8 one on the build machine ;-)
>
> Sylvain
>
>
>
>
> 2015-05-29 3:23 GMT+02:00 Edward Z. Yang <ezyang at mit.edu>:
>
>> This commit broke the HM builds: https://phabricator.haskell.org/B4147
>>
>> When I validate locally, though, it works fine.
>>
>> Edward
>>
>> Excerpts from git's message of 2015-05-28 18:11:07 -0700:
>> > Repository : ssh://git@git.haskell.org/ghc
>> >
>> > On branch : ghc-7.10
>> > Link :
>> http://ghc.haskell.org/trac/ghc/changeset/25b84781ed950d59c7bffb77a576d3c43a883ca9/ghc
>> >
>> > >---------------------------------------------------------------
>> >
>> > commit 25b84781ed950d59c7bffb77a576d3c43a883ca9
>> > Author: Austin Seipp <austin at well-typed.com>
>> > Date: Tue May 19 04:56:40 2015 -0500
>> >
>> > base: fix #10298 & #7695
>> >
>> > Summary:
>> > This applies a patch from Reid Barton and Sylvain Henry, which fix a
>> > disasterous infinite loop when iconv fails to load locale files, as
>> > specified in #10298.
>> >
>> > The fix is a bit of a hack but should be fine - for the actual
>> reasoning
>> > behind it, see `Note [Disaster and iconv]` for more info.
>> >
>> > In addition to this fix, we also patch up the IO Encoding utilities
>> to
>> > recognize several variations of the 'ASCII' encoding (including its
>> > aliases) directly so that GHC can do conversions without iconv. This
>> > allows a static binary to sit in an initramfs.
>> >
>> > Authored-by: Reid Barton <rwbarton at gmail.com>
>> > Authored-by: Sylvain Henry <hsyl20 at gmail.com>
>> > Signed-off-by: Austin Seipp <austin at well-typed.com>
>> >
>> > Test Plan: Eyeballed it.
>> >
>> > Reviewers: rwbarton, hvr
>> >
>> > Subscribers: bgamari, thomie
>> >
>> > Differential Revision: https://phabricator.haskell.org/D898
>> >
>> > GHC Trac Issues: #10298, #7695
>> >
>> > (cherry picked from commit e28462de700240288519a016d0fe44d4360d9ffd)
>> >
>> > >---------------------------------------------------------------
>> >
>> > 25b84781ed950d59c7bffb77a576d3c43a883ca9
>> > libraries/base/GHC/IO/Encoding.hs | 14 +++++++++++++-
>> > libraries/base/GHC/TopHandler.hs | 29 ++++++++++++++++++++++++++++-
>> > 2 files changed, 41 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/libraries/base/GHC/IO/Encoding.hs
>> b/libraries/base/GHC/IO/Encoding.hs
>> > index 31683b4..014b61b 100644
>> > --- a/libraries/base/GHC/IO/Encoding.hs
>> > +++ b/libraries/base/GHC/IO/Encoding.hs
>> > @@ -235,7 +235,14 @@ mkTextEncoding e = case mb_coding_failure_mode of
>> > _ -> Nothing
>> >
>> > mkTextEncoding' :: CodingFailureMode -> String -> IO TextEncoding
>> > -mkTextEncoding' cfm enc = case [toUpper c | c <- enc, c /= '-'] of
>> > +mkTextEncoding' cfm enc
>> > + -- First, specifically match on ASCII encodings directly using
>> > + -- several possible aliases (specified by RFC 1345 & co), which
>> > + -- allows us to handle ASCII conversions without iconv at all (see
>> > + -- trac #10298).
>> > + | any (== enc) ansiEncNames = return (UTF8.mkUTF8 cfm)
>> > + -- Otherwise, handle other encoding needs via iconv.
>> > + | otherwise = case [toUpper c | c <- enc, c /= '-'] of
>> > "UTF8" -> return $ UTF8.mkUTF8 cfm
>> > "UTF16" -> return $ UTF16.mkUTF16 cfm
>> > "UTF16LE" -> return $ UTF16.mkUTF16le cfm
>> > @@ -249,6 +256,11 @@ mkTextEncoding' cfm enc = case [toUpper c | c <-
>> enc, c /= '-'] of
>> > #else
>> > _ -> Iconv.mkIconvEncoding cfm enc
>> > #endif
>> > + where
>> > + ansiEncNames = -- ASCII aliases
>> > + [ "ANSI_X3.4-1968", "iso-ir-6", "ANSI_X3.4-1986",
>> "ISO_646.irv:1991"
>> > + , "US-ASCII", "us", "IBM367", "cp367", "csASCII", "ASCII",
>> "ISO646-US"
>> > + ]
>> >
>> > latin1_encode :: CharBuffer -> Buffer Word8 -> IO (CharBuffer, Buffer
>> Word8)
>> > latin1_encode input output = fmap (\(_why,input',output') ->
>> (input',output')) $ Latin1.latin1_encode input output -- unchecked, used
>> for char8
>> > diff --git a/libraries/base/GHC/TopHandler.hs
>> b/libraries/base/GHC/TopHandler.hs
>> > index d7c0038..e725196 100644
>> > --- a/libraries/base/GHC/TopHandler.hs
>> > +++ b/libraries/base/GHC/TopHandler.hs
>> > @@ -157,13 +157,40 @@ real_handler exit se = do
>> > Just (ExitFailure n) -> exit n
>> >
>> > -- EPIPE errors received for stdout are ignored (#2699)
>> > - _ -> case fromException se of
>> > + _ -> catch (case fromException se of
>> > Just IOError{ ioe_type = ResourceVanished,
>> > ioe_errno = Just ioe,
>> > ioe_handle = Just hdl }
>> > | Errno ioe == ePIPE, hdl == stdout -> exit 0
>> > _ -> do reportError se
>> > exit 1
>> > + ) (disasterHandler exit) -- See Note [Disaster with
>> iconv]
>> > +
>> > +-- don't use errorBelch() directly, because we cannot call varargs
>> functions
>> > +-- using the FFI.
>> > +foreign import ccall unsafe "HsBase.h errorBelch2"
>> > + errorBelch :: CString -> CString -> IO ()
>> > +
>> > +disasterHandler :: (Int -> IO a) -> IOError -> IO a
>> > +disasterHandler exit _ =
>> > + withCAString "%s" $ \fmt ->
>> > + withCAString msgStr $ \msg ->
>> > + errorBelch fmt msg >> exit 1
>> > + where msgStr = "encountered an exception while trying to report an
>> exception"
>> > +
>> > +{- Note [Disaster with iconv]
>> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> > +
>> > +When using iconv, it's possible for things like iconv_open to fail in
>> > +restricted environments (like an initram or restricted container), but
>> > +when this happens the error raised inevitably calls `peekCString`,
>> > +which depends on the users locale, which depends on using
>> > +`iconv_open`... which causes an infinite loop.
>> > +
>> > +This occurrence is also known as tickets #10298 and #7695. So to work
>> > +around it we just set _another_ error handler and bail directly by
>> > +calling the RTS, without iconv at all.
>> > +-}
>> >
>> >
>> > -- try to flush stdout/stderr, but don't worry if we fail
>> >
>> --- End forwarded message ---
>> _______________________________________________
>> ghc-devs mailing list
>> ghc-devs at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>
>
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-devs/attachments/20150530/d529ddf7/attachment.html>
More information about the ghc-devs
mailing list