stream fusion, concatMap, exisential seed unboxing
Simon Peyton Jones
simonpj at microsoft.com
Mon Feb 2 17:08:45 UTC 2015
Ah, well, if it's really the concat/concatMap problem then I'm really not sure how to crack it.
But there are lots of smart people on this list, so maybe someone else can.
The fewer dependencies your test case has the better. eg Don't use criterion; this stuff is huge: you get 10G of allocation in your test run instead of 10M. Or something.
Simon
| -----Original Message-----
| From: Christian Höner zu Siederdissen
| [mailto:choener at tbi.univie.ac.at]
| Sent: 02 February 2015 16:02
| To: Simon Peyton Jones
| Cc: Glasgow-Haskell-Users
| Subject: Re: stream fusion, concatMap, exisential seed unboxing
|
| Sure, no problem!
|
| Btw. this is not a 'bug' in the usual sense. It is the (neverending)
| concatMap + stream fusion story.
| https://ghc.haskell.org/trac/ghc/ticket/915
| I'm playing a bit with trying to get GHC to look through the
| existential seed elements and have it constructor-specialize them.
|
| Unfortunately, unbox/spec fails with more complex seeds for now. For
| the more simple cases like the one below, the extra strictness pass
| works (cool, thanks!). These are sometimes enough if you have just
| concatMap (\i -> [i .. j]) stuff.
|
| However, if the internal stream state is, say, a pair (i,j), then one
| of those is not completely unboxed. I guess that *if* we could get the
| passes to continue unbox'ing and ctor-spec'ing, we could end up with a
| fully fused concatMap.
|
| I'll put a complete git repository with criterion + quickcheck modules
| up (soonishly ;-).
|
| Viele Gruesse,
| Christian
|
| * Simon Peyton Jones <simonpj at microsoft.com> [02.02.2015 15:49]:
| > I think it'd help you to open a Trac ticket, give a fully-
| reproducible test case, including instructions for how to reproduce,
| and say what isn't happening that should happen.
| >
| > What's odd is that loop_s29q looks strict in its Int arg, yet isn't
| unboxed. There is a way to get the strictness analysis to run twice -
| flate-dmd-anal. You could try that.
| >
| > Simon
| >
| > | -----Original Message-----
| > | From: Glasgow-haskell-users [mailto:glasgow-haskell-users-
| > | bounces at haskell.org] On Behalf Of Christian Höner zu Siederdissen
| > | Sent: 01 February 2015 12:18
| > | To: Glasgow-Haskell-Users
| > | Subject: stream fusion, concatMap, exisential seed unboxing
| > |
| > | Hi everybody,
| > |
| > | I'm playing around with concatMap in stream fusion (the vector
| > | package to be exact).
| > |
| > | concatMapM :: Monad m => (a->m (Stream m b)) -> Stream m a ->
| > | Stream m b concatMapM f (Stream ...) = ...
| > |
| > | I can get my concatMap to behave nicely and erase all Stream and
| > | Step constructors but due to the existential nature of the Stream
| > | seeds, they are re-boxed for the inner stream (which is kind-of
| > | annoying given that the seed is immediately unboxed again ;-).
| seq
| > | doesn't help here.
| > |
| > | Otherwise, fusion happens for streams and vectors, so that is ok.
| > | But boxing kills performance, criterion says.
| > |
| > | Do we have s.th. in place that could help here? Currently I could
| > | use the vector-concatMap which creates intermediate arrays, my
| > | version which has boxed seeds, or hermit but that is too
| > | inconvenient for non- ghc savy users.
| > |
| > | Viele Gruesse,
| > | Christian
| > |
| > |
| > |
| > | Fusing concatMapM:
| > |
| > | concatMapM f (SM.Stream ostep t _) = SM.Stream step (Left t)
| Unknown
| > | where step (Left t) = do r <- ostep t
| > | case r of
| > | SM.Done -> return $ SM.Done
| > | SM.Skip t' -> return $ SM.Skip
| (Left
| > | t')
| > | SM.Yield a t' -> do s <- f a
| > | return $ SM.Skip
| > | (Right (s,t'))
| > | step (Right (SM.Stream istep s _,t)) = do r <- istep s
| > | case r of
| > | SM.Done
| ->
| > | return $ SM.Skip (Left t)
| > | SM.Skip s'
| ->
| > | return $ SM.Skip (Right (SM.Stream istep s' Unknown,t))
| > | SM.Yield x s'
| > | -> return $ SM.Yield x (Right (SM.Stream istep s' Unknown,t))
| > | {-# INLINE [0] step #-}
| > | {-# INLINE [1] concatMapM #-}
| > |
| > | testConcatMapM :: Int -> Int
| > | testConcatMapM k = seq k $ U.unId
| > | . SM.foldl' (+) 0
| > | . concatMap (\i -> SM.enumFromTo 5 k)
| > | $ SM.enumFromTo 3 k {-# NOINLINE testConcatMapM
| > | #-}
| > |
| > | CORE:
| > |
| > | testConcatMapM
| > | testConcatMapM =
| > | \ k_aCA ->
| > | let! { I# ipv_s1xv ~ _ <- k_aCA } in ### inner loop
| > | letrec {
| > | $s$wfoldlM'_loop_s29q
| > | $s$wfoldlM'_loop_s29q =
| > | \ sc_s29i sc1_s29j sc2_s29k -> ### unboxing
| > | let! { I# x_a1LA ~ _ <- sc1_s29j } in
| > | case tagToEnum# (<=# x_a1LA ipv_s1xv) of _ {
| > | False -> $s$wfoldlM'_loop1_s29c sc_s29i sc2_s29k;
| > | True ->
| > | $s$wfoldlM'_loop_s29q ### reboxing
| > | (+# sc_s29i x_a1LA) (I# (+# x_a1LA 1)) sc2_s29k
| > | };
| > | ### outer loop
| > | $s$wfoldlM'_loop1_s29c
| > | $s$wfoldlM'_loop1_s29c =
| > | \ sc_s29a sc1_s29b ->
| > | case tagToEnum# (<=# sc1_s29b ipv_s1xv) of _ {
| > | False -> sc_s29a;
| > | True ->
| > | case tagToEnum# (<=# 5 ipv_s1xv) of _ {
| > | False -> $s$wfoldlM'_loop1_s29c sc_s29a (+#
| > | sc1_s29b 1); ### boxed seed (I# 6)
| > | True -> $s$wfoldlM'_loop_s29q (+# sc_s29a 5) (I#
| 6)
| > | (+# sc1_s29b 1)
| > | }
| > | }; } in
| > | let! { __DEFAULT ~ ww_s20G <- $s$wfoldlM'_loop1_s29c 0 3 } in
| > | I# ww_s20G
| > |
| > | _______________________________________________
| > | Glasgow-haskell-users mailing list
| > | Glasgow-haskell-users at haskell.org
| > | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
More information about the Glasgow-haskell-users
mailing list