stream fusion, concatMap, exisential seed unboxing

Christian Höner zu Siederdissen choener at tbi.univie.ac.at
Mon Feb 2 16:02:09 UTC 2015


Sure, no problem!

Btw. this is not a 'bug' in the usual sense. It is the (neverending)
concatMap + stream fusion story.
https://ghc.haskell.org/trac/ghc/ticket/915
I'm playing a bit with trying to get GHC to look through the existential
seed elements and have it constructor-specialize them.

Unfortunately, unbox/spec fails with more complex seeds for now. For the more
simple cases like the one below, the extra strictness pass works (cool,
thanks!). These are sometimes enough if you have just
concatMap (\i -> [i .. j]) stuff.

However, if the internal stream state is, say, a pair (i,j), then one of
those is not completely unboxed. I guess that *if* we could get the
passes to continue unbox'ing and ctor-spec'ing, we could end up with a
fully fused concatMap.

I'll put a complete git repository with criterion + quickcheck modules
up (soonishly ;-).

Viele Gruesse,
Christian

* Simon Peyton Jones <simonpj at microsoft.com> [02.02.2015 15:49]:
> I think it'd help you to open a Trac ticket, give a fully-reproducible test case, including instructions for how to reproduce, and say what isn't happening that should happen. 
> 
> What's odd is that loop_s29q looks strict in its Int arg, yet isn't unboxed.  There is a way to get the strictness analysis to run twice -flate-dmd-anal.  You could try that.
> 
> Simon
> 
> |  -----Original Message-----
> |  From: Glasgow-haskell-users [mailto:glasgow-haskell-users-
> |  bounces at haskell.org] On Behalf Of Christian Höner zu Siederdissen
> |  Sent: 01 February 2015 12:18
> |  To: Glasgow-Haskell-Users
> |  Subject: stream fusion, concatMap, exisential seed unboxing
> |  
> |  Hi everybody,
> |  
> |  I'm playing around with concatMap in stream fusion (the vector package
> |  to be exact).
> |  
> |  concatMapM :: Monad m => (a->m (Stream m b)) -> Stream m a -> Stream m
> |  b concatMapM f (Stream ...) = ...
> |  
> |  I can get my concatMap to behave nicely and erase all Stream and Step
> |  constructors but due to the existential nature of the Stream seeds,
> |  they are re-boxed for the inner stream (which is kind-of annoying
> |  given that the seed is immediately unboxed again ;-). seq doesn't help
> |  here.
> |  
> |  Otherwise, fusion happens for streams and vectors, so that is ok. But
> |  boxing kills performance, criterion says.
> |  
> |  Do we have s.th. in place that could help here? Currently I could use
> |  the vector-concatMap which creates intermediate arrays, my version
> |  which has boxed seeds, or hermit but that is too inconvenient for non-
> |  ghc savy users.
> |  
> |  Viele Gruesse,
> |  Christian
> |  
> |  
> |  
> |  Fusing concatMapM:
> |  
> |  concatMapM f (SM.Stream ostep t _) = SM.Stream step (Left t) Unknown
> |    where step (Left t) = do r <- ostep t
> |                             case r of
> |                              SM.Done       -> return $ SM.Done
> |                              SM.Skip    t' -> return $ SM.Skip (Left
> |  t')
> |                              SM.Yield a t' -> do s <- f a
> |                                                  return $ SM.Skip
> |  (Right (s,t'))
> |          step (Right (SM.Stream istep s _,t)) = do r <- istep s
> |                                                    case r of
> |                                                      SM.Done       ->
> |  return $ SM.Skip    (Left t)
> |                                                      SM.Skip    s' ->
> |  return $ SM.Skip    (Right (SM.Stream istep s' Unknown,t))
> |                                                      SM.Yield x s' ->
> |  return $ SM.Yield x (Right (SM.Stream istep s' Unknown,t))
> |          {-# INLINE [0] step #-}
> |  {-# INLINE [1] concatMapM #-}
> |  
> |  testConcatMapM :: Int -> Int
> |  testConcatMapM k = seq k $ U.unId
> |                   . SM.foldl' (+) 0
> |                   . concatMap (\i -> SM.enumFromTo 5 k)
> |                   $ SM.enumFromTo 3 k
> |  {-# NOINLINE testConcatMapM #-}
> |  
> |  CORE:
> |  
> |  testConcatMapM
> |  testConcatMapM =
> |    \ k_aCA ->
> |      let! { I# ipv_s1xv ~ _ <- k_aCA } in ### inner loop
> |      letrec {
> |        $s$wfoldlM'_loop_s29q
> |        $s$wfoldlM'_loop_s29q =
> |          \ sc_s29i sc1_s29j sc2_s29k ->
> |  ### unboxing
> |            let! { I# x_a1LA ~ _ <- sc1_s29j } in
> |            case tagToEnum# (<=# x_a1LA ipv_s1xv) of _ {
> |              False -> $s$wfoldlM'_loop1_s29c sc_s29i sc2_s29k;
> |              True ->
> |                $s$wfoldlM'_loop_s29q
> |  ### reboxing
> |                  (+# sc_s29i x_a1LA) (I# (+# x_a1LA 1)) sc2_s29k
> |            };
> |  ### outer loop
> |        $s$wfoldlM'_loop1_s29c
> |        $s$wfoldlM'_loop1_s29c =
> |          \ sc_s29a sc1_s29b ->
> |            case tagToEnum# (<=# sc1_s29b ipv_s1xv) of _ {
> |              False -> sc_s29a;
> |              True ->
> |                case tagToEnum# (<=# 5 ipv_s1xv) of _ {
> |                  False -> $s$wfoldlM'_loop1_s29c sc_s29a (+# sc1_s29b
> |  1); ### boxed seed (I# 6)
> |                  True -> $s$wfoldlM'_loop_s29q (+# sc_s29a 5) (I# 6)
> |  (+# sc1_s29b 1)
> |                }
> |            }; } in
> |      let! { __DEFAULT ~ ww_s20G <- $s$wfoldlM'_loop1_s29c 0 3 } in
> |      I# ww_s20G
> |  
> |  _______________________________________________
> |  Glasgow-haskell-users mailing list
> |  Glasgow-haskell-users at haskell.org
> |  http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


More information about the Glasgow-haskell-users mailing list