[Haskell-cafe] Question on optimization
Viktor Dukhovni
ietf-dane at dukhovni.org
Thu Dec 17 06:48:34 UTC 2020
On Thu, Dec 17, 2020 at 12:15:19AM -0500, Viktor Dukhovni wrote:
> I don't know whether indirection via "foldr = go" with `go`
> recursive and `foldr` "INLINABLE" and ditto for `traverse` would
> create further opportunities for optimisation when the instances
> are in a separate module. Someone else might clear that up, or
> you could benchmark and see. IIRC, recursive functions don't
> directly inline without such tweaks, but the tweaks might not
> help much.
Naive experiments seem to suggest that inlining via a `go` indirection
that captures `f` is somewhat helpful, but the real win is using a right
fold, vs. your original flatten implementation. Here's a comparison
with a somewhat branchy deep structure.
Flatten via Foldable:
---------------------
2,281,024 bytes allocated in the heap
3,312 bytes copied during GC
44,408 bytes maximum residency (1 sample(s))
25,224 bytes maximum slop
17 MiB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 0 colls, 0 par 0.000s 0.000s 0.0000s 0.0000s
Gen 1 1 colls, 0 par 0.000s 0.000s 0.0005s 0.0005s
INIT time 0.000s ( 0.000s elapsed)
MUT time 0.002s ( 0.002s elapsed)
GC time 0.000s ( 0.000s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 0.003s ( 0.003s elapsed)
%GC time 0.0% (0.0% elapsed)
Alloc rate 1,234,320,346 bytes per MUT second
Productivity 68.0% of total user, 68.0% of total elapsed
The posted flatten (1000x more allocations! 200x runtime):
--------------------
2,651,027,056 bytes allocated in the heap
11,386,968 bytes copied during GC
164,424 bytes maximum residency (2 sample(s))
29,320 bytes maximum slop
27 MiB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 104 colls, 0 par 0.015s 0.015s 0.0001s 0.0004s
Gen 1 2 colls, 0 par 0.001s 0.001s 0.0007s 0.0010s
INIT time 0.001s ( 0.001s elapsed)
MUT time 0.715s ( 0.715s elapsed)
GC time 0.016s ( 0.016s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 0.732s ( 0.732s elapsed)
%GC time 0.0% (0.0% elapsed)
Alloc rate 3,705,952,327 bytes per MUT second
Productivity 97.7% of total user, 97.7% of total elapsed
--
Viktor.
{-# LANGUAGE DeriveFunctor #-}
module Nest
( Nest(..)
) where
data Nest a = Nil
| Cons a (Nest a)
| Nest (Nest a) (Nest a)
deriving (Show, Functor)
instance Foldable Nest where
{-# INLINE foldr #-}
foldr f = go
where
go z Nil = z
go z (Cons x n) = f x (go z n)
go z (Nest x y) = go (go z y) x
instance Traversable Nest where
{-# INLINE traverse #-}
traverse f = go
where
go Nil = pure Nil
go (Cons x n) = Cons <$> f x <*> go n
go (Nest x y) = Nest <$> go x <*> go y
More information about the Haskell-Cafe
mailing list