[GHC] #15300: Unboxed Sums Crash
GHC
ghc-devs at haskell.org
Fri Jun 22 00:47:06 UTC 2018
#15300: Unboxed Sums Crash
-------------------------------------+-------------------------------------
Reporter: andrewthad | Owner: (none)
Type: task | Status: new
Priority: normal | Milestone: 8.6.1
Component: Compiler | Version: 8.4.3
Keywords: UnboxedSums | Operating System: Unknown/Multiple
Architecture: | Type of failure: None/Unknown
Unknown/Multiple |
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
I've made it a little further in my experiments with unboxed tuples in the
`packed` library. However, I've run into another issue that I strongly
suspect is the result of bad behavior of unboxed tuples. To replicate this
issue (with GHC 8.4.3), do the following:
{{{
git clone https://github.com/andrewthad/packed
cd packed
cabal new-build
}}}
We use `cabal new-build` for its side effect of creating a
`.ghc.environment.xyz` file. Now, create a minimal example in the
directory called `eol.hs` with the following contents:
{{{
import Packed.Bytes.Parser (Parser)
import Data.Word
import Packed.Bytes (Bytes)
import GHC.Exts (RealWorld)
import Packed.Bytes.Stream.IO (ByteStream)
import qualified Packed.Bytes as B
import qualified Data.Char
import qualified Packed.Bytes.Parser as P
import qualified Packed.Bytes.Stream.IO as Stream
main :: IO ()
main = do
r <- runExampleParser
( do P.takeBytesUntilEndOfLineConsume
P.takeBytesUntilEndOfLineConsume
P.takeBytesUntilEndOfLineConsume
) (foldMap Stream.singleton (map charToWord8
"the\nemporium\rhas\narrived"))
print r
runExampleParser :: Parser e () a -> ByteStream RealWorld -> IO (Maybe a,
Maybe String)
runExampleParser parser stream = do
P.Result mleftovers r _ <- P.parseStreamIO stream () parser
mextra <- case mleftovers of
Nothing -> return Nothing
Just (P.Leftovers chunk remainingStream) -> do
bs <- Stream.unpack remainingStream
return (Just (map word8ToChar (B.unpack chunk ++ bs)))
return (either (const Nothing) Just r,mextra)
word8ToChar :: Word8 -> Char
word8ToChar = Data.Char.chr . fromIntegral
charToWord8 :: Char -> Word8
charToWord8 = fromIntegral . Data.Char.ord
s2b :: String -> Bytes
s2b = B.pack . map charToWord8
c2w :: Char -> Word8
c2w = charToWord8
}}}
Finally, build this with `ghc -O2 eol.hs`, and then run the executable
this produces to get the following:
{{{
(Nothing,Just "\rhas\narrived")
eol: internal error: stg_ap_n_ret
(GHC version 8.4.3 for x86_64_unknown_linux)
Please report this as a GHC bug:
http://www.haskell.org/ghc/reportabug
Aborted (core dumped)
}}}
Things worth noting:
1. I think the program fails in the final GC that runs right before the
program terminates. We can see that it produces a correct result of
`(Nothing,Just "\rhas\narrived")`, but something on the heap has
definitely been corrupted.
2. This only happens with `-O2` turned on.
3. This only happens when the parser does not successfully parse its
input.
Here's some more context around this. I've been working on a parser that
uses unboxed sums instead of continuations. After #15038 was fixed,
everything had been going well. Then, I took the parser type and added two
things to it: (1) context and (2) typed errors. Context is basically like
throwing `StateT` on top and errors are like throwing `ExceptT` on top.
After this, everything in my test suite kept working except for a single
test, which now consistently crashes my test suite. So, I originally had
this:
{{{
type Bytes# = (# ByteArray#, Int#, Int# #)
type Maybe# (a :: TYPE r) = (# (# #) | a #)
type Leftovers# s = (# Bytes# , ByteStream s #)
type Result# s (r :: RuntimeRep) (a :: TYPE r) =
(# Maybe# (Leftovers# s), Maybe# a #)
newtype ParserLevity (r :: RuntimeRep) (a :: TYPE r) = ParserLevity
{ getParserLevity :: forall s.
Maybe# (Leftovers# s)
-> State# s
-> (# State# s, Result# s r a #)
}
}}}
But I changed it to this:
{{{
type Bytes# = (# ByteArray#, Int#, Int# #)
type Maybe# (a :: TYPE r) = (# (# #) | a #)
type Either# a (b :: TYPE r) = (# a | b #)
type Leftovers# s = (# Bytes# , ByteStream s #)
type Result# e c s (r :: RuntimeRep) (a :: TYPE r) =
(# Maybe# (Leftovers# s), Either# (Maybe e) a, c #)
newtype ParserLevity e c (r :: RuntimeRep) (a :: TYPE r) = ParserLevity
{ getParserLevity :: forall s.
c
-> Maybe# (Leftovers# s)
-> State# s
-> (# State# s, Result# e c s r a #)
}
}}}
Specifically, the function causing trouble is (as currently defined):
{{{
{-# NOINLINE takeBytesUntilEndOfLineConsumeUnboxed #-}
takeBytesUntilEndOfLineConsumeUnboxed :: ParserLevity e c BytesRep Bytes#
takeBytesUntilEndOfLineConsumeUnboxed = ParserLevity (go (# (# #) | #))
where
go :: Maybe# Bytes# -> c -> Maybe# (Leftovers# s) -> State# s -> (#
State# s, Result# e c s BytesRep Bytes# #)
go !_ c (# (# #) | #) s0 = (# s0, (# (# (# #) | #), (# Nothing | #), c
#) #)
go !mbytes c (# | (# bytes0@(# arr0, off0, len0 #), !stream0@(ByteStream
streamFunc) #) #) s0 = case BAW.findAnyByte2 (I# off0) (I# len0) 10 13
(ByteArray arr0) of
Nothing -> case streamFunc s0 of
(# s1, r #) -> go (# | appendMaybeBytes mbytes bytes0 #) c r s1
Just (I# ix, W8# theByte) -> case theByte of
10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 1# ) bytes0,
stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #)
#), c #) #)
-- second case means it was 13
_ -> case ix <# (off0 +# len0 -# 1#) of
1# -> case indexWord8Array# arr0 (ix +# 1# ) of
10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 2# )
bytes0, stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -#
off0 #) #), c #) #)
_ -> (# s0, (# (# | (# unsafeDrop# (ix -# off0) bytes0, stream0
#) #), (# Nothing | #), c #) #)
_ -> case nextNonEmpty stream0 s0 of
(# s1, m #) -> case m of
(# (# #) | #) -> (# s1, (# (# | (# unboxBytes (B.singleton
13), Stream.empty #) #), (# Nothing | #), c #) #)
(# | (# bytes1@(# arr1, _, _ #), stream1 #) #) -> case
indexWord8Array# arr1 0# of
10## -> (# s1, (# (# | (# unsafeDrop# 1# bytes1, stream1 #)
#), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #)
_ -> (# s1, (# (# | (# unboxBytes (B.cons 13 (boxBytes
bytes1)), stream1 #) #), (# Nothing | #), c #) #)
}}}
That's all I've got for now. If no one's able to make headway, I'll
probably come back to this and try to make a more minimal example at some
point. I won't have time to do this soon though.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15300>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list