[GHC] #15300: Unboxed Sums Crash

GHC ghc-devs at haskell.org
Fri Jun 22 00:47:06 UTC 2018


#15300: Unboxed Sums Crash
-------------------------------------+-------------------------------------
           Reporter:  andrewthad     |             Owner:  (none)
               Type:  task           |            Status:  new
           Priority:  normal         |         Milestone:  8.6.1
          Component:  Compiler       |           Version:  8.4.3
           Keywords:  UnboxedSums    |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  None/Unknown
  Unknown/Multiple                   |
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 I've made it a little further in my experiments with unboxed tuples in the
 `packed` library. However, I've run into another issue that I strongly
 suspect is the result of bad behavior of unboxed tuples. To replicate this
 issue (with GHC 8.4.3), do the following:

 {{{
 git clone https://github.com/andrewthad/packed
 cd packed
 cabal new-build
 }}}

 We use `cabal new-build` for its side effect of creating a
 `.ghc.environment.xyz` file. Now, create a minimal example in the
 directory called `eol.hs` with the following contents:

 {{{
 import Packed.Bytes.Parser (Parser)
 import Data.Word
 import Packed.Bytes (Bytes)
 import GHC.Exts (RealWorld)
 import Packed.Bytes.Stream.IO (ByteStream)
 import qualified Packed.Bytes as B
 import qualified Data.Char
 import qualified Packed.Bytes.Parser as P
 import qualified Packed.Bytes.Stream.IO as Stream

 main :: IO ()
 main = do
   r <- runExampleParser
     ( do P.takeBytesUntilEndOfLineConsume
          P.takeBytesUntilEndOfLineConsume
          P.takeBytesUntilEndOfLineConsume
     ) (foldMap Stream.singleton (map charToWord8
 "the\nemporium\rhas\narrived"))
   print r

 runExampleParser :: Parser e () a -> ByteStream RealWorld -> IO (Maybe a,
 Maybe String)
 runExampleParser parser stream = do
   P.Result mleftovers r _ <- P.parseStreamIO stream () parser
   mextra <- case mleftovers of
     Nothing -> return Nothing
     Just (P.Leftovers chunk remainingStream) -> do
       bs <- Stream.unpack remainingStream
       return (Just (map word8ToChar (B.unpack chunk ++ bs)))
   return (either (const Nothing) Just r,mextra)

 word8ToChar :: Word8 -> Char
 word8ToChar = Data.Char.chr . fromIntegral

 charToWord8 :: Char -> Word8
 charToWord8 = fromIntegral . Data.Char.ord

 s2b :: String -> Bytes
 s2b = B.pack . map charToWord8

 c2w :: Char -> Word8
 c2w = charToWord8
 }}}

 Finally, build this with `ghc -O2 eol.hs`, and then run the executable
 this produces to get the following:

 {{{
 (Nothing,Just "\rhas\narrived")
 eol: internal error: stg_ap_n_ret
     (GHC version 8.4.3 for x86_64_unknown_linux)
     Please report this as a GHC bug:
 http://www.haskell.org/ghc/reportabug
 Aborted (core dumped)
 }}}

 Things worth noting:

 1. I think the program fails in the final GC that runs right before the
 program terminates. We can see that it produces a correct result of
 `(Nothing,Just "\rhas\narrived")`, but something on the heap has
 definitely been corrupted.
 2. This only happens with `-O2` turned on.
 3. This only happens when the parser does not successfully parse its
 input.

 Here's some more context around this. I've been working on a parser that
 uses unboxed sums instead of continuations. After #15038 was fixed,
 everything had been going well. Then, I took the parser type and added two
 things to it: (1) context and (2) typed errors. Context is basically like
 throwing `StateT` on top and errors are like throwing `ExceptT` on top.
 After this, everything in my test suite kept working except for a single
 test, which now consistently crashes my test suite. So, I originally had
 this:

 {{{
 type Bytes# = (# ByteArray#, Int#, Int# #)
 type Maybe# (a :: TYPE r) = (# (# #) | a #)
 type Leftovers# s = (# Bytes# , ByteStream s #)
 type Result# s (r :: RuntimeRep) (a :: TYPE r) =
   (# Maybe# (Leftovers# s), Maybe# a #)
 newtype ParserLevity (r :: RuntimeRep) (a :: TYPE r) = ParserLevity
   { getParserLevity :: forall s.
        Maybe# (Leftovers# s)
     -> State# s
     -> (# State# s, Result# s r a #)
   }
 }}}

 But I changed it to this:

 {{{
 type Bytes# = (# ByteArray#, Int#, Int# #)
 type Maybe# (a :: TYPE r) = (# (# #) | a #)
 type Either# a (b :: TYPE r) = (# a | b #)
 type Leftovers# s = (# Bytes# , ByteStream s #)
 type Result# e c s (r :: RuntimeRep) (a :: TYPE r) =
   (# Maybe# (Leftovers# s), Either# (Maybe e) a, c #)

 newtype ParserLevity e c (r :: RuntimeRep) (a :: TYPE r) = ParserLevity
   { getParserLevity :: forall s.
        c
     -> Maybe# (Leftovers# s)
     -> State# s
     -> (# State# s, Result# e c s r a #)
   }
 }}}

 Specifically, the function causing trouble is (as currently defined):

 {{{
 {-# NOINLINE takeBytesUntilEndOfLineConsumeUnboxed #-}
 takeBytesUntilEndOfLineConsumeUnboxed :: ParserLevity e c BytesRep Bytes#
 takeBytesUntilEndOfLineConsumeUnboxed = ParserLevity (go (# (# #) | #))
 where
   go :: Maybe# Bytes# -> c -> Maybe# (Leftovers# s) -> State# s -> (#
 State# s, Result# e c s BytesRep Bytes# #)
   go !_ c (# (# #) | #) s0 = (# s0, (# (# (# #) | #), (# Nothing | #), c
 #) #)
   go !mbytes c (# | (# bytes0@(# arr0, off0, len0 #), !stream0@(ByteStream
 streamFunc) #) #) s0 = case BAW.findAnyByte2 (I# off0) (I# len0) 10 13
 (ByteArray arr0) of
     Nothing -> case streamFunc s0 of
       (# s1, r #) -> go (# | appendMaybeBytes mbytes bytes0 #) c r s1
     Just (I# ix, W8# theByte) -> case theByte of
       10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 1# ) bytes0,
 stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #)
 #), c #) #)
       -- second case means it was 13
       _ -> case ix <# (off0 +# len0 -# 1#) of
         1# -> case indexWord8Array# arr0 (ix +# 1# ) of
           10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 2# )
 bytes0, stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -#
 off0 #) #), c #) #)
           _ -> (# s0, (# (# | (# unsafeDrop# (ix -# off0) bytes0, stream0
 #) #), (# Nothing | #), c #) #)
         _ -> case nextNonEmpty stream0 s0 of
           (# s1, m #) -> case m of
             (# (# #) | #) -> (# s1, (# (# | (# unboxBytes (B.singleton
 13), Stream.empty #) #), (# Nothing | #), c #) #)
             (# | (# bytes1@(# arr1, _, _ #), stream1 #) #) -> case
 indexWord8Array# arr1 0# of
               10## -> (# s1, (# (# | (# unsafeDrop# 1# bytes1, stream1 #)
 #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #)
               _ -> (# s1, (# (# | (# unboxBytes (B.cons 13 (boxBytes
 bytes1)), stream1 #) #), (# Nothing | #), c #) #)
 }}}

 That's all I've got for now. If no one's able to make headway, I'll
 probably come back to this and try to make a more minimal example at some
 point. I won't have time to do this soon though.

-- 
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15300>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler


More information about the ghc-tickets mailing list