Fwd: Re: [commit: ghc] ghc-7.10: base: fix #10298 & #7695 (25b8478)

Edward Z. Yang ezyang at mit.edu
Fri May 29 01:23:57 UTC 2015


This commit broke the HM builds: https://phabricator.haskell.org/B4147

When I validate locally, though, it works fine.

Edward

Excerpts from git's message of 2015-05-28 18:11:07 -0700:
> Repository : ssh://git@git.haskell.org/ghc
> 
> On branch  : ghc-7.10
> Link       : http://ghc.haskell.org/trac/ghc/changeset/25b84781ed950d59c7bffb77a576d3c43a883ca9/ghc
> 
> >---------------------------------------------------------------
> 
> commit 25b84781ed950d59c7bffb77a576d3c43a883ca9
> Author: Austin Seipp <austin at well-typed.com>
> Date:   Tue May 19 04:56:40 2015 -0500
> 
>     base: fix #10298 & #7695
>     
>     Summary:
>     This applies a patch from Reid Barton and Sylvain Henry, which fix a
>     disasterous infinite loop when iconv fails to load locale files, as
>     specified in #10298.
>     
>     The fix is a bit of a hack but should be fine - for the actual reasoning
>     behind it, see `Note [Disaster and iconv]` for more info.
>     
>     In addition to this fix, we also patch up the IO Encoding utilities to
>     recognize several variations of the 'ASCII' encoding (including its
>     aliases) directly so that GHC can do conversions without iconv. This
>     allows a static binary to sit in an initramfs.
>     
>     Authored-by: Reid Barton <rwbarton at gmail.com>
>     Authored-by: Sylvain Henry <hsyl20 at gmail.com>
>     Signed-off-by: Austin Seipp <austin at well-typed.com>
>     
>     Test Plan: Eyeballed it.
>     
>     Reviewers: rwbarton, hvr
>     
>     Subscribers: bgamari, thomie
>     
>     Differential Revision: https://phabricator.haskell.org/D898
>     
>     GHC Trac Issues: #10298, #7695
>     
>     (cherry picked from commit e28462de700240288519a016d0fe44d4360d9ffd)
> 
> >---------------------------------------------------------------
> 
> 25b84781ed950d59c7bffb77a576d3c43a883ca9
>  libraries/base/GHC/IO/Encoding.hs | 14 +++++++++++++-
>  libraries/base/GHC/TopHandler.hs  | 29 ++++++++++++++++++++++++++++-
>  2 files changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/libraries/base/GHC/IO/Encoding.hs b/libraries/base/GHC/IO/Encoding.hs
> index 31683b4..014b61b 100644
> --- a/libraries/base/GHC/IO/Encoding.hs
> +++ b/libraries/base/GHC/IO/Encoding.hs
> @@ -235,7 +235,14 @@ mkTextEncoding e = case mb_coding_failure_mode of
>          _             -> Nothing
>  
>  mkTextEncoding' :: CodingFailureMode -> String -> IO TextEncoding
> -mkTextEncoding' cfm enc = case [toUpper c | c <- enc, c /= '-'] of
> +mkTextEncoding' cfm enc
> +  -- First, specifically match on ASCII encodings directly using
> +  -- several possible aliases (specified by RFC 1345 & co), which
> +  -- allows us to handle ASCII conversions without iconv at all (see
> +  -- trac #10298).
> +  | any (== enc) ansiEncNames = return (UTF8.mkUTF8 cfm)
> +  -- Otherwise, handle other encoding needs via iconv.
> +  | otherwise = case [toUpper c | c <- enc, c /= '-'] of
>      "UTF8"    -> return $ UTF8.mkUTF8 cfm
>      "UTF16"   -> return $ UTF16.mkUTF16 cfm
>      "UTF16LE" -> return $ UTF16.mkUTF16le cfm
> @@ -249,6 +256,11 @@ mkTextEncoding' cfm enc = case [toUpper c | c <- enc, c /= '-'] of
>  #else
>      _ -> Iconv.mkIconvEncoding cfm enc
>  #endif
> +  where
> +    ansiEncNames = -- ASCII aliases
> +      [ "ANSI_X3.4-1968", "iso-ir-6", "ANSI_X3.4-1986", "ISO_646.irv:1991"
> +      , "US-ASCII", "us", "IBM367", "cp367", "csASCII", "ASCII", "ISO646-US"
> +      ]
>  
>  latin1_encode :: CharBuffer -> Buffer Word8 -> IO (CharBuffer, Buffer Word8)
>  latin1_encode input output = fmap (\(_why,input',output') -> (input',output')) $ Latin1.latin1_encode input output -- unchecked, used for char8
> diff --git a/libraries/base/GHC/TopHandler.hs b/libraries/base/GHC/TopHandler.hs
> index d7c0038..e725196 100644
> --- a/libraries/base/GHC/TopHandler.hs
> +++ b/libraries/base/GHC/TopHandler.hs
> @@ -157,13 +157,40 @@ real_handler exit se = do
>             Just (ExitFailure n) -> exit n
>  
>             -- EPIPE errors received for stdout are ignored (#2699)
> -           _ -> case fromException se of
> +           _ -> catch (case fromException se of
>                  Just IOError{ ioe_type = ResourceVanished,
>                                ioe_errno = Just ioe,
>                                ioe_handle = Just hdl }
>                     | Errno ioe == ePIPE, hdl == stdout -> exit 0
>                  _ -> do reportError se
>                          exit 1
> +                ) (disasterHandler exit) -- See Note [Disaster with iconv]
> +
> +-- don't use errorBelch() directly, because we cannot call varargs functions
> +-- using the FFI.
> +foreign import ccall unsafe "HsBase.h errorBelch2"
> +   errorBelch :: CString -> CString -> IO ()
> +
> +disasterHandler :: (Int -> IO a) -> IOError -> IO a
> +disasterHandler exit _ =
> +  withCAString "%s" $ \fmt ->
> +    withCAString msgStr $ \msg ->
> +      errorBelch fmt msg >> exit 1
> +  where msgStr = "encountered an exception while trying to report an exception"
> +
> +{- Note [Disaster with iconv]
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +When using iconv, it's possible for things like iconv_open to fail in
> +restricted environments (like an initram or restricted container), but
> +when this happens the error raised inevitably calls `peekCString`,
> +which depends on the users locale, which depends on using
> +`iconv_open`... which causes an infinite loop.
> +
> +This occurrence is also known as tickets #10298 and #7695. So to work
> +around it we just set _another_ error handler and bail directly by
> +calling the RTS, without iconv at all.
> +-}
>  
>  
>  -- try to flush stdout/stderr, but don't worry if we fail
> 
--- End forwarded message ---


More information about the ghc-devs mailing list