cvs commit: hugs98/src prelude.h static.c storage.c storage.h subst.c

Sigbjorn Finne sof@glass.cse.ogi.edu
Wed, 22 Jan 2003 11:15:23 -0800


sof         2003/01/22 11:15:23 PST

  Modified files:
    src                  prelude.h static.c storage.c storage.h 
                         subst.c 
  Log:
  Stephen Milborrow's optimisations to whatIs() -- enabled by defining
  FAST_WHATIS to 1 (which it is now by default.)
  
  Here's the original e-mail describing the changes made (and their
  effect):
  
  From: "Stephen Milborrow" <milbo@icon.co.za>
  To: <hugs-users@haskell.org>
  Subject: A Faster whatIs
  Date: Sat, 30 Nov 2002 11:10:47 +0200
  
  Hello everyone:
  
  I made some changes to the Hugs sources which give small runtime
  speed gains.  The results are in the table below.  The gains are
  minor, but I thought some people in the group might be interested.
  
                                     WHATIS   WHATIS1    FHEAP
  
  Speed Increase (percent)
                gamteb                 8         9         5
                parser                 6         8       gcfail
                prolog                 9         9         2
                queens                10        13         4
  
  Extra BSS memory (kBytes)           24        24         0
  
  Nbr source lines changed            50       144        41
  
  ---Notes On The Table
  
  The table headings WHATIS, WHATIS1, and FHEAP refer to three
  different sets of changes.  WHATIS and WHATIS1 are described
  below.  FHEAP was an experiment that simply used a fixed size
  heap (an array instead of a malloc) and is impractical because
  it rules out use of the Hugs -h flag.
  
  The (motley) collection of test programs are from nofib.
  I timed runhugs for each program without command line flags,
  except that for queens I used -h1000000.  Different programs
  would probably give different results: the usual caveats apply.
  These programs were just the nofib ones I had at hand.
  
  I compiled with MSC 5.0 Service Pack 3 using modified Nov 2002
  Hugs sources. I ran the timing tests on a Windows 98 machine.
  
  ---Strategy
  
  I wanted to see what speed gains could be achieved by taking a
  worm's eye view of the code, with no changes to fundamental
  algorithms.  I also wanted to make changes that would be limited
  to just a few places in the code -- a minimal force approach.
  
  To start off, I ran the MSC profiler.  This confirmed that
  whatIs() is a candidate for optimization, as already noted in
  Mark's Gopher implementation document and in comments in
  storage.c.  But the profiler showed a few other hotspots too.
  (An interesting thing to do is sort the profiler results on
  execution line-count, which immediately tells you which are the
  most executed lines in the program.)
  
  ---WHATIS Change
  
  To reduce the time spent in whatIs, I created a byte array
  whatCode of whatIs codes.  I then created a whatIs macro which
  replaces the whatIs function:
  
    #define whatIs(c) (isPair(c)?                        \
                        (isTag(fst(c)) ? fst(c) : AP ) : \
                        whatCode[c])
  
  Negative indexing into whatCode is prevented by the isPair.  To
  keep the size of whatCode reasonable, I had to reduce the range
  of unboxed ints -- the bigger the range, the bigger the whatCode
  array.  I settled on a range of 2048 i.e. ints between -1023 and
  1024 are unboxed, all others are boxed. The extra boxing will
  actually slow down execution of certain Haskell programs, but as
  far as I can tell the majority of real Hugs programs would be
  unaffected.  Increasing this range to, say, 10 000 would increase
  memory usage by 10 000 bytes -- nothing really when you consider
  that the memory footprint of winhugs is about 10 MBytes.
  
  This change yields the speed improvements under the WHATIS
  column in the table.
  
  ---WHATIS1 Change
  
  IsTag is defined as
  
    #define isTag(c)   (TAGMIN<=(c) && (c)<SPECMIN)
  
  I reduced the cost of IsTag slightly by changing defines in
  storage.h so that only box-cell-tags are in the range 1 to
  0x7f. (I shifted special cell values down to start at
  0x80).  I then defined variants for IsTag and friends:
  
    #define isTag1(c)  (((c) & TAG_MASK) == 0)
  
    #define whatIs1(c) (isPair(c)?                         \
                          (isTag1(fst(c))? fst(c) : AP ) : \
                          whatCode[c])
  
    #define isAp1(c)   (isPair(c) && !isTag1(fst(c)))
  
  whatIs1 is a faster version of whatIs, with the "bug" that it
  will return the wrong value if fst(c) is NIL.  When used in
  several places as a replacement for whatIs, it gains us a few
  speed percentage points as shown in the WHATIS1 column above.
  WHATIS1 sits on top of WHATIS so the gains attributable to
  WHATIS1 alone are the differences between the percentages in the
  two columns.
  
  ---Another Candidate
  
  Another candidate for this kind of optimization is the line
  in eval():
  
     if (!isCfun(n) && (ar=name(n).arity)<=(sp-base)) {...
  
  This is one of the most executed lines in the entire program.
  
  If the stored arity of Cfun's was offset by a largish number,
  say 10000, then the above test against isCfun wouldn't be
  needed.  This change would introduce inefficiencies elsewhere
  (we would have to un-offset the stored arity before using it
  elsewhere) but the net effect would probably be a speed gain.  I
  shied away from this change because it would require ubiquitous
  (though easy) code changes.
  
  ---Final Comments
  
  Optimizing code written by Mark Jones is a challenge.  It
  becomes a little less daunting if we change the rules by
  allowing ourselves to waste some memory in the pursuit of speed,
  and to introduce some ugliness into the code. Even so, the speed
  improvements I got were small.
  
  If anyone is interested, I would be happy to send the sources.
  They are modified Nov 2002 sources with all the changes
  demarcated by #define's. Six files changed.
  
  I would be interested to know what results these changes give on
  other machines.
  
  ---
  Stephen Milborrow
  
  Revision  Changes    Path
  1.40      +7 -2      hugs98/src/prelude.h
  1.134     +7 -2      hugs98/src/static.c
  1.59      +38 -2     hugs98/src/storage.c
  1.48      +86 -2     hugs98/src/storage.h
  1.26      +6 -2      hugs98/src/subst.c