[Haskell] Space behaviour & hyperseq
Colin Runciman
colin at cs.york.ac.uk
Fri Jun 18 07:21:50 EDT 2004
Arjan,
>I supervise a student who uses Haskell for simulating neural nets. A lot of
>computation goes on there and the program cannot handle as many iterations
>as we would like. We have improved performance by a factor of more than ten
>by putting strictness annotations in all data types. But the problem is that
>there are lists within these data structures and they are not strict. By
>using a trick I have now found that making the whole program state strict
>the program gets a much better looking heap profile. The trick I use is by
>writing a function hyperseq and then applying it before the next iterations
>begins:
>
>hyperseq x y = if x==x then y else error "this is very unlikely"
>
>This is very expensive and I chose to apply the function only once every 100
>iterations. This gives reasonable performance.
>
>...
>1) Is there a more efficient definition of hyperseq that does not traverse
>the data structure twice? The "show" function traverses the structure once
>but I found it to be much slower.
>
>
I hit a similar problem with my very first Haskell application! As you
say, the brute solution of comparing structures for equality with
themselves is very expensive, and may be wrong if some components really
should be evaluated lazily. My solution was to introduce a single-method
class
class Norm a where normal :: a -> Bool
The intention is that 'normal' functions are defined in such a way that
the result of an application 'normal x' is always true, but computing
this truth guarantees that x is evaluated to at least a desired minimal
extent. For example, supppose we have
data Tree a b = Branch (Tree a) a (Tree a) | Leaf b
we might then define
instance Norm (Tree a b) where
normal (Leaf _) = True
normal (Branch _ _ _) = True
hiding the Leaf and Branch constructors and providing instead
leaf :: Norm b => b -> Tree a b
leaf x | normal x = Leaf x
branch :: Tree a b -> b -> Tree a b -> Tree a b
branch lt x rt | normal lt && normal rt = Branch lt x rt
we obtain trees that are path-strict in the branch and leaf structure,
do not interfere at all with the lazy evaluation of internal labels at
branches but ensure 'normal' evaluation of leaf labels.
Of course there are many other possible definitions of 'normal' for
trees. All sorts of clever tricks could be programmed into the
normalising evaluations. But there should be no need for repeated
traversals (let alone repeated double traversals, using ==) of
already-evaluated structure.
With a suitable instance of Norm, your hyperseq function can be redefined
hyperseq :: Norm a => a -> b -> b
hyperseq x y | normal x = y
Regards
Colin R
Reference: Colin Runciman, "Tip in Haskell -- another exercise in
functional programming", pp 278-292 in Proc. 1991 Glasgow Workshop on
Functional Programming, Springer-Verlag, 1992.
More information about the Haskell
mailing list