[GHC] #15300: Unboxed Sums Crash
GHC
ghc-devs at haskell.org
Tue Jul 3 15:21:57 UTC 2018
#15300: Unboxed Sums Crash
-------------------------------------+-------------------------------------
Reporter: andrewthad | Owner: (none)
Type: bug | Status: new
Priority: high | Milestone: 8.6.1
Component: Compiler | Version: 8.5
Resolution: | Keywords: UnboxedSums
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Description changed by bgamari:
Old description:
> I've made it a little further in my experiments with unboxed tuples in
> the `packed` library. However, I've run into another issue that I
> strongly suspect is the result of bad behavior of unboxed tuples. To
> replicate this issue (with GHC 8.4.3), do the following:
>
> {{{
> git clone https://github.com/andrewthad/packed
> cd packed
> cabal new-build
> }}}
>
> We use `cabal new-build` for its side effect of creating a
> `.ghc.environment.xyz` file. Now, create a minimal example in the
> directory called `eol.hs` with the following contents:
>
> {{{
> import Packed.Bytes.Parser (Parser)
> import Data.Word
> import Packed.Bytes (Bytes)
> import GHC.Exts (RealWorld)
> import Packed.Bytes.Stream.IO (ByteStream)
> import qualified Packed.Bytes as B
> import qualified Data.Char
> import qualified Packed.Bytes.Parser as P
> import qualified Packed.Bytes.Stream.IO as Stream
>
> main :: IO ()
> main = do
> r <- runExampleParser
> ( do P.takeBytesUntilEndOfLineConsume
> P.takeBytesUntilEndOfLineConsume
> P.takeBytesUntilEndOfLineConsume
> ) (foldMap Stream.singleton (map charToWord8
> "the\nemporium\rhas\narrived"))
> print r
>
> runExampleParser :: Parser e () a -> ByteStream RealWorld -> IO (Maybe a,
> Maybe String)
> runExampleParser parser stream = do
> P.Result mleftovers r _ <- P.parseStreamIO stream () parser
> mextra <- case mleftovers of
> Nothing -> return Nothing
> Just (P.Leftovers chunk remainingStream) -> do
> bs <- Stream.unpack remainingStream
> return (Just (map word8ToChar (B.unpack chunk ++ bs)))
> return (either (const Nothing) Just r,mextra)
>
> word8ToChar :: Word8 -> Char
> word8ToChar = Data.Char.chr . fromIntegral
>
> charToWord8 :: Char -> Word8
> charToWord8 = fromIntegral . Data.Char.ord
>
> s2b :: String -> Bytes
> s2b = B.pack . map charToWord8
>
> c2w :: Char -> Word8
> c2w = charToWord8
> }}}
>
> Finally, build this with `ghc -O2 eol.hs`, and then run the executable
> this produces to get the following:
>
> {{{
> (Nothing,Just "\rhas\narrived")
> eol: internal error: stg_ap_n_ret
> (GHC version 8.4.3 for x86_64_unknown_linux)
> Please report this as a GHC bug:
> http://www.haskell.org/ghc/reportabug
> Aborted (core dumped)
> }}}
>
> Things worth noting:
>
> 1. I think the program fails in the final GC that runs right before the
> program terminates. We can see that it produces a correct result of
> `(Nothing,Just "\rhas\narrived")`, but something on the heap has
> definitely been corrupted.
> 2. This only happens with `-O2` turned on.
> 3. This only happens when the parser does not successfully parse its
> input.
>
> Here's some more context around this. I've been working on a parser that
> uses unboxed sums instead of continuations. After #15038 was fixed,
> everything had been going well. Then, I took the parser type and added
> two things to it: (1) context and (2) typed errors. Context is basically
> like throwing `StateT` on top and errors are like throwing `ExceptT` on
> top. After this, everything in my test suite kept working except for a
> single test, which now consistently crashes my test suite. So, I
> originally had this:
>
> {{{
> type Bytes# = (# ByteArray#, Int#, Int# #)
> type Maybe# (a :: TYPE r) = (# (# #) | a #)
> type Leftovers# s = (# Bytes# , ByteStream s #)
> type Result# s (r :: RuntimeRep) (a :: TYPE r) =
> (# Maybe# (Leftovers# s), Maybe# a #)
> newtype ParserLevity (r :: RuntimeRep) (a :: TYPE r) = ParserLevity
> { getParserLevity :: forall s.
> Maybe# (Leftovers# s)
> -> State# s
> -> (# State# s, Result# s r a #)
> }
> }}}
>
> But I changed it to this:
>
> {{{
> type Bytes# = (# ByteArray#, Int#, Int# #)
> type Maybe# (a :: TYPE r) = (# (# #) | a #)
> type Either# a (b :: TYPE r) = (# a | b #)
> type Leftovers# s = (# Bytes# , ByteStream s #)
> type Result# e c s (r :: RuntimeRep) (a :: TYPE r) =
> (# Maybe# (Leftovers# s), Either# (Maybe e) a, c #)
>
> newtype ParserLevity e c (r :: RuntimeRep) (a :: TYPE r) = ParserLevity
> { getParserLevity :: forall s.
> c
> -> Maybe# (Leftovers# s)
> -> State# s
> -> (# State# s, Result# e c s r a #)
> }
> }}}
>
> Specifically, the function causing trouble is (as currently defined):
>
> {{{
> {-# NOINLINE takeBytesUntilEndOfLineConsumeUnboxed #-}
> takeBytesUntilEndOfLineConsumeUnboxed :: ParserLevity e c BytesRep Bytes#
> takeBytesUntilEndOfLineConsumeUnboxed = ParserLevity (go (# (# #) | #))
> where
> go :: Maybe# Bytes# -> c -> Maybe# (Leftovers# s) -> State# s -> (#
> State# s, Result# e c s BytesRep Bytes# #)
> go !_ c (# (# #) | #) s0 = (# s0, (# (# (# #) | #), (# Nothing | #), c
> #) #)
> go !mbytes c (# | (# bytes0@(# arr0, off0, len0 #),
> !stream0@(ByteStream streamFunc) #) #) s0 = case BAW.findAnyByte2 (I#
> off0) (I# len0) 10 13 (ByteArray arr0) of
> Nothing -> case streamFunc s0 of
> (# s1, r #) -> go (# | appendMaybeBytes mbytes bytes0 #) c r s1
> Just (I# ix, W8# theByte) -> case theByte of
> 10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 1# ) bytes0,
> stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #)
> #), c #) #)
> -- second case means it was 13
> _ -> case ix <# (off0 +# len0 -# 1#) of
> 1# -> case indexWord8Array# arr0 (ix +# 1# ) of
> 10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 2# )
> bytes0, stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -#
> off0 #) #), c #) #)
> _ -> (# s0, (# (# | (# unsafeDrop# (ix -# off0) bytes0, stream0
> #) #), (# Nothing | #), c #) #)
> _ -> case nextNonEmpty stream0 s0 of
> (# s1, m #) -> case m of
> (# (# #) | #) -> (# s1, (# (# | (# unboxBytes (B.singleton
> 13), Stream.empty #) #), (# Nothing | #), c #) #)
> (# | (# bytes1@(# arr1, _, _ #), stream1 #) #) -> case
> indexWord8Array# arr1 0# of
> 10## -> (# s1, (# (# | (# unsafeDrop# 1# bytes1, stream1 #)
> #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #)
> _ -> (# s1, (# (# | (# unboxBytes (B.cons 13 (boxBytes
> bytes1)), stream1 #) #), (# Nothing | #), c #) #)
> }}}
>
> That's all I've got for now. If no one's able to make headway, I'll
> probably come back to this and try to make a more minimal example at some
> point. I won't have time to do this soon though.
New description:
I've made it a little further in my experiments with unboxed tuples in the
`packed` library. However, I've run into another issue that I strongly
suspect is the result of bad behavior of unboxed tuples. To replicate this
issue (with GHC 8.4.3), do the following:
{{{
git clone https://github.com/andrewthad/packed
cd packed
cabal new-build
}}}
We use `cabal new-build` for its side effect of creating a
`.ghc.environment.xyz` file. Now, create a minimal example in the
directory called `eol.hs` with the following contents:
{{{#!hs
import Packed.Bytes.Parser (Parser)
import Data.Word
import Packed.Bytes (Bytes)
import GHC.Exts (RealWorld)
import Packed.Bytes.Stream.IO (ByteStream)
import qualified Packed.Bytes as B
import qualified Data.Char
import qualified Packed.Bytes.Parser as P
import qualified Packed.Bytes.Stream.IO as Stream
main :: IO ()
main = do
r <- runExampleParser
( do P.takeBytesUntilEndOfLineConsume
P.takeBytesUntilEndOfLineConsume
P.takeBytesUntilEndOfLineConsume
) (foldMap Stream.singleton (map charToWord8
"the\nemporium\rhas\narrived"))
print r
runExampleParser :: Parser e () a -> ByteStream RealWorld -> IO (Maybe a,
Maybe String)
runExampleParser parser stream = do
P.Result mleftovers r _ <- P.parseStreamIO stream () parser
mextra <- case mleftovers of
Nothing -> return Nothing
Just (P.Leftovers chunk remainingStream) -> do
bs <- Stream.unpack remainingStream
return (Just (map word8ToChar (B.unpack chunk ++ bs)))
return (either (const Nothing) Just r,mextra)
word8ToChar :: Word8 -> Char
word8ToChar = Data.Char.chr . fromIntegral
charToWord8 :: Char -> Word8
charToWord8 = fromIntegral . Data.Char.ord
s2b :: String -> Bytes
s2b = B.pack . map charToWord8
c2w :: Char -> Word8
c2w = charToWord8
}}}
Finally, build this with `ghc -O2 eol.hs`, and then run the executable
this produces to get the following:
{{{
(Nothing,Just "\rhas\narrived")
eol: internal error: stg_ap_n_ret
(GHC version 8.4.3 for x86_64_unknown_linux)
Please report this as a GHC bug:
http://www.haskell.org/ghc/reportabug
Aborted (core dumped)
}}}
Things worth noting:
1. I think the program fails in the final GC that runs right before the
program terminates. We can see that it produces a correct result of
`(Nothing,Just "\rhas\narrived")`, but something on the heap has
definitely been corrupted.
2. This only happens with `-O2` turned on.
3. This only happens when the parser does not successfully parse its
input.
Here's some more context around this. I've been working on a parser that
uses unboxed sums instead of continuations. After #15038 was fixed,
everything had been going well. Then, I took the parser type and added two
things to it: (1) context and (2) typed errors. Context is basically like
throwing `StateT` on top and errors are like throwing `ExceptT` on top.
After this, everything in my test suite kept working except for a single
test, which now consistently crashes my test suite. So, I originally had
this:
{{{#!hs
type Bytes# = (# ByteArray#, Int#, Int# #)
type Maybe# (a :: TYPE r) = (# (# #) | a #)
type Leftovers# s = (# Bytes# , ByteStream s #)
type Result# s (r :: RuntimeRep) (a :: TYPE r) =
(# Maybe# (Leftovers# s), Maybe# a #)
newtype ParserLevity (r :: RuntimeRep) (a :: TYPE r) = ParserLevity
{ getParserLevity :: forall s.
Maybe# (Leftovers# s)
-> State# s
-> (# State# s, Result# s r a #)
}
}}}
But I changed it to this:
{{{#!hs
type Bytes# = (# ByteArray#, Int#, Int# #)
type Maybe# (a :: TYPE r) = (# (# #) | a #)
type Either# a (b :: TYPE r) = (# a | b #)
type Leftovers# s = (# Bytes# , ByteStream s #)
type Result# e c s (r :: RuntimeRep) (a :: TYPE r) =
(# Maybe# (Leftovers# s), Either# (Maybe e) a, c #)
newtype ParserLevity e c (r :: RuntimeRep) (a :: TYPE r) = ParserLevity
{ getParserLevity :: forall s.
c
-> Maybe# (Leftovers# s)
-> State# s
-> (# State# s, Result# e c s r a #)
}
}}}
Specifically, the function causing trouble is (as currently defined):
{{{#!hs
{-# NOINLINE takeBytesUntilEndOfLineConsumeUnboxed #-}
takeBytesUntilEndOfLineConsumeUnboxed :: ParserLevity e c BytesRep Bytes#
takeBytesUntilEndOfLineConsumeUnboxed = ParserLevity (go (# (# #) | #))
where
go :: Maybe# Bytes# -> c -> Maybe# (Leftovers# s) -> State# s -> (#
State# s, Result# e c s BytesRep Bytes# #)
go !_ c (# (# #) | #) s0 = (# s0, (# (# (# #) | #), (# Nothing | #), c
#) #)
go !mbytes c (# | (# bytes0@(# arr0, off0, len0 #), !stream0@(ByteStream
streamFunc) #) #) s0 = case BAW.findAnyByte2 (I# off0) (I# len0) 10 13
(ByteArray arr0) of
Nothing -> case streamFunc s0 of
(# s1, r #) -> go (# | appendMaybeBytes mbytes bytes0 #) c r s1
Just (I# ix, W8# theByte) -> case theByte of
10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 1# ) bytes0,
stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #)
#), c #) #)
-- second case means it was 13
_ -> case ix <# (off0 +# len0 -# 1#) of
1# -> case indexWord8Array# arr0 (ix +# 1# ) of
10## -> (# s0, (# (# | (# unsafeDrop# ((ix -# off0) +# 2# )
bytes0, stream0 #) #), (# | appendMaybeBytes mbytes (# arr0, off0, ix -#
off0 #) #), c #) #)
_ -> (# s0, (# (# | (# unsafeDrop# (ix -# off0) bytes0, stream0
#) #), (# Nothing | #), c #) #)
_ -> case nextNonEmpty stream0 s0 of
(# s1, m #) -> case m of
(# (# #) | #) -> (# s1, (# (# | (# unboxBytes (B.singleton
13), Stream.empty #) #), (# Nothing | #), c #) #)
(# | (# bytes1@(# arr1, _, _ #), stream1 #) #) -> case
indexWord8Array# arr1 0# of
10## -> (# s1, (# (# | (# unsafeDrop# 1# bytes1, stream1 #)
#), (# | appendMaybeBytes mbytes (# arr0, off0, ix -# off0 #) #), c #) #)
_ -> (# s1, (# (# | (# unboxBytes (B.cons 13 (boxBytes
bytes1)), stream1 #) #), (# Nothing | #), c #) #)
}}}
That's all I've got for now. If no one's able to make headway, I'll
probably come back to this and try to make a more minimal example at some
point. I won't have time to do this soon though.
--
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15300#comment:6>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list