[Haskell-beginners] pattern match vs. pure function
Daniel Fischer
daniel.is.fischer at web.de
Tue Oct 19 20:23:16 EDT 2010
On Wednesday 20 October 2010 01:23:49, Bastian Erdnüß wrote:
> Suppose I would want to write a function rmap that acts like
>
> rmap [f,g,h] x == [f x, g x, h x]
>
> Do I gain any advantage or disadvantage (beside readability) from using
>
> rmap = flip $ map . flip id
>
> over
>
> rmap [] _ = []
> rmap (f:fs) x = f x : rmap fs x
>
> ?
>
> I could imagine Haskell can optimize the first version better since it
> refers to the built in map function. But beside that, does Haskell
> struggle with the combinatory stuff in the first version? Or does it
> get optimized away?
Well, you can ask GHC what it does with the definitions.
Let's compile
module RMap where
rmap :: [a -> b] -> a -> [b]
rmap = flip $ map . flip id
recmap :: [a -> b] -> a -> [b]
recmap [] _ = []
recmap (f:fs) x = f x : recmap fs x
and look at the generated Core (obtained via -ddump-simpl).
The @ x_yz things are type annotations, otherwise Core is pretty close to
Haskell. It takes a bit to get used to reading Core, but it's not that
difficult (as long as the functions are short).
First, without optimisations:
==================== Tidy Core ====================
Rec {
RMap.recmap :: forall a_adg b_adh.
[a_adg -> b_adh] -> a_adg -> [b_adh]
GblId
[Arity 2
NoCafRefs]
RMap.recmap =
\ (@ a_adA)
(@ b_adB)
(ds_dee :: [a_adA -> b_adB])
(ds1_def :: a_adA) ->
case ds_dee of _ {
[] -> GHC.Types.[] @ b_adB;
: f_adk fs_adl ->
GHC.Types.:
@ b_adB
(f_adk ds1_def)
(RMap.recmap @ a_adA @ b_adB fs_adl ds1_def)
}
end Rec }
RMap.rmap :: forall a_adi b_adj.
[a_adi -> b_adj] -> a_adi -> [b_adj]
GblId
[]
RMap.rmap =
\ (@ a_adE) (@ b_adF) ->
GHC.Base.$
@ (a_adE -> [a_adE -> b_adF] -> [b_adF])
@ ([a_adE -> b_adF] -> a_adE -> [b_adF])
(GHC.Base.flip @ a_adE @ [a_adE -> b_adF] @ [b_adF])
(GHC.Base..
@ ((a_adE -> b_adF) -> b_adF)
@ ([a_adE -> b_adF] -> [b_adF])
@ a_adE
(GHC.Base.map @ (a_adE -> b_adF) @ b_adF)
(GHC.Base.flip
@ (a_adE -> b_adF)
@ a_adE
@ b_adF
(GHC.Base.id @ (a_adE -> b_adF))))
-- both are pretty exactly the code written in Haskell.
The explicit recursion recmap is small and efficient, it looks at the list
whether there are still functions left, if so, apply the first of those to
the value and recur. No cruft here.
The other one, flip $ map . flip id is, well it's small too, but it's
horrible to look at (all those @s). Every time you invoke that function,
you call flip twice, ($), map, (.) and id. If you use the function on short
lists, those calls generate significant overhead, but for long lists that
doesn't matter.
Now with optimisations. It doesn't matter whether you use -O1 or -O2, both
produce exactly the same Core for both functions.
And, for recmap, they produce exactly the same Core as -O0 did.
So, recmap is pretty good code, GHC doesn't know how to improve it.
rmap on the other hand changed:
RMap.rmap :: forall a_adi b_adj.
[a_adi -> b_adj] -> a_adi -> [b_adj]
GblId
[Arity 2
NoCafRefs
Str: DmdType SL]
RMap.rmap =
\ (@ a_adE)
(@ b_adF)
(x_aeq :: [a_adE -> b_adF])
(y_aer :: a_adE) ->
GHC.Base.map
@ (a_adE -> b_adF)
@ b_adF
(\ (y1_XeE :: a_adE -> b_adF) -> y1_XeE y_aer)
x_aeq
There, that looks much better than before (and for short lists, it performs
much better, but for long lists, the difference should be negligible).
flip, ($) and (.) have been inlined and eliminated, what we get is
rmap fs x = map (\f -> f x) fs
, which is really nice. In fact, it's even nicer than what we got for
recmap, because the compiler knows map, there are rewrite rules, which can
produce much better code when the function is used, for example, it's
possible that the list of functions is completely eliminated and a use is
rewritten to a direct loop instead of allocating a list cell for each
function in the list. And it can profit from the rule
map f . map g = map (f . g)
which is much harder to detect for the direct recursion.
> And how "expensive" are pattern matches on the other side, anyway?
Depends on the pattern, of course. Matching "oh" against [] is much cheaper
than matching "This is an expensive car" against
'T':'h':'i':'s':' ':'a':'n':' ':'e':'x':'p':'e':'n':'s':'i':'v':'e':'
':'p':'a':'t':'t':'e':'r':'n':' ':'m':'a':'t':'c':'h':_
But generally, pattern matching is quite cheap. The (well, one) advantage
of higher order functions like map is that they're easier to optimise when
they're used, given that somebody has taught the compiler how.
>
> Thanks for reading,
> Bastian
More information about the Beginners
mailing list