[Haskell] Space behaviour & hyperseq

Fri Jun 18 07:21:50 EDT 2004

Arjan,

>I supervise a student who uses Haskell for simulating neural nets. A lot of
>computation goes on there and the program cannot handle as many iterations
>as we would like. We have improved performance by a factor of more than ten
>by putting strictness annotations in all data types. But the problem is that
>there are lists within these data structures and they are not strict. By
>using a trick I have now found that making the whole program state strict
>the program gets a much better looking heap profile. The trick I use is by
>writing a function hyperseq and then applying it before the next iterations
>begins:
>
>hyperseq x y = if x==x then y else error "this is very unlikely"
>
>This is very expensive and I chose to apply the function only once every 100
>iterations. This gives reasonable performance.
>
>...
>1) Is there a more efficient definition of hyperseq that does not traverse
>the data structure twice? The "show" function traverses the structure once
>but I found it to be much slower.
>  
>
I hit a similar problem with my very first Haskell application! As you 
say, the brute solution of comparing structures for equality with 
themselves is very expensive, and may be wrong if some components really 
should be evaluated lazily. My solution was to introduce a single-method 
class

class Norm a where normal :: a -> Bool

The intention  is that 'normal' functions are defined in such a way that 
the result of an application 'normal x' is always true, but computing 
this truth guarantees that x is evaluated to at least a desired minimal 
extent.  For example, supppose we have

data Tree a b = Branch (Tree a) a (Tree a) | Leaf b

we might then define

instance Norm (Tree a b) where
  normal  (Leaf _) = True
  normal  (Branch _ _ _) = True

hiding the Leaf and Branch constructors and providing instead

leaf :: Norm b => b  -> Tree a b
leaf x | normal x = Leaf x

branch :: Tree a b -> b -> Tree a b -> Tree a b
branch lt x rt | normal lt && normal rt = Branch lt x rt

we obtain trees that are path-strict in the branch and leaf structure, 
do not interfere at all with the lazy evaluation of internal labels at 
branches but ensure 'normal' evaluation of leaf labels.

Of course there are many other possible definitions of 'normal' for 
trees. All sorts of clever tricks could be programmed into the 
normalising evaluations. But there should be no need for repeated 
traversals (let alone repeated double traversals, using ==) of 
already-evaluated structure.

With a suitable instance of Norm, your hyperseq function can be redefined

hyperseq :: Norm a => a -> b -> b
hyperseq x y | normal x = y

Regards
Colin R

Reference:  Colin Runciman, "Tip in Haskell -- another exercise in 
functional programming", pp 278-292 in Proc. 1991 Glasgow Workshop on 
Functional Programming, Springer-Verlag, 1992.