AW: What does "Compiled code too complex" error message of Hugs mean?

Mark P Jones mpj@cse.ogi.edu
Mon, 12 Feb 2001 11:07:03 -0800


I know this is more than a week old, but I've been away ... and
now that I'm back, I'd like to clear up a possible misunderstanding
that began with an exchange along the following lines:

| > > When loading some Haskell files with Hugs, I get the error
| > > message "Compiled code too complex". However, the compilation
| > > with GHC 4.08.1 succeeds.
| > > What does this message mean? What can I do about it?
| ...
| > You can grep for that sentence in "hugs98/src", it will point to the
| > file "machine.c". There you will see it says
| > "if nextLab>=NUM_FIXUPS) ...".
| > So grep for "NUM_FIXUPS" it will point to the file "prelude.h". I
| > think the default value is 400, you should increase it to 1000 or so.
| > I have it at 10000, but that's probably not necessary in your case
| > and if you increase constants too much starting up Hugs will become
| > slower.

First of all, an explanation.  Inside Hugs, the code for Haskell
functions is translated into a low-level, abstract machine language.
(In today's environment, the way that Java programs are translated
into JVM bytecode is probably a good analogy.)  As the machine
language code is generated, the compiler sometimes needs to insert
"jump" instructions to addresses that are not yet known.  In this
situation, it instead inserts a dummy address, but adds an entry to
a simple table of "fixups" that will later be filled in with the
correct address.  Once the complete section of code has been generated,
Hugs scans over it once again and replaces each unknown address with
the correct value from the fixups table.  This process is also quite
commonly described as "back-patching".
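
To make the mechanism concrete, here is a minimal sketch of back-patching
with a fixed-size fixups table, written in C. It is not the code from
machine.c: the opcodes, the code buffer, and every function name below
are made up for illustration, and only NUM_FIXUPS, nextLab, and the
value 400 come from the discussion above. In this sketch the "dummy
address" stored in a jump is simply the label number, which the second
pass swaps for the real address.

```c
#include <assert.h>

#define NUM_FIXUPS 400   /* table capacity; 400 is the figure quoted above */
#define CODE_SIZE  4096  /* hypothetical size of the code buffer */
#define UNKNOWN    (-1)  /* table entry for a label that is not yet placed */
#define NO_LABEL   (-1)  /* returned when the fixups table is full */

enum { OP_JUMP = 1, OP_NOP = 2, OP_HALT = 3 };  /* hypothetical opcodes */

static int code[CODE_SIZE];     /* generated opcodes and operands */
static int codeLen = 0;
static int fixups[NUM_FIXUPS];  /* fixups[label] = address, once known */
static int nextLab = 0;

/* Allocate a fresh label. A real compiler would report something like
 * "Compiled code too complex" when this returns NO_LABEL. */
int newLabel(void) {
    if (nextLab >= NUM_FIXUPS)
        return NO_LABEL;
    fixups[nextLab] = UNKNOWN;
    return nextLab++;
}

/* Append one word to the code buffer. */
void emit(int word) {
    assert(codeLen < CODE_SIZE);
    code[codeLen++] = word;
}

/* Emit a jump whose operand temporarily holds the label number. */
void emitJump(int label) {
    emit(OP_JUMP);
    emit(label);
}

/* Record that `label` refers to the next instruction to be emitted. */
void placeLabel(int label) {
    fixups[label] = codeLen;
}

/* Second pass: replace each label number with its recorded address. */
void backPatch(void) {
    int pc = 0;
    while (pc < codeLen) {
        if (code[pc] == OP_JUMP) {
            int label = code[pc + 1];
            assert(label >= 0 && label < nextLab);
            assert(fixups[label] != UNKNOWN);
            code[pc + 1] = fixups[label];
            pc += 2;            /* skip opcode and operand */
        } else {
            pc += 1;            /* other opcodes have no operand here */
        }
    }
    nextLab = 0;  /* assumption: labels are local to one block of code */
}
```

A caller would ask newLabel() for a label, call emitJump() wherever a
forward jump is needed, call placeLabel() once the target is reached,
and finish the block with backPatch(). The only point of the sketch is
that the table has a hard upper bound, so a block that needs more than
NUM_FIXUPS labels cannot be compiled.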

The fixups table can contain at most NUM_FIXUPS entries, which, in the
current distribution, is set to 400.  Programs that require more entries
than this in a single block of code will lead to the "Compiled code too
complex" error message that you have seen.  There is no particular
reason for the choice of 400; this is just a number that seemed to work
ok for most practical purposes.  If you get a "Compiled code too complex"
message, it is perhaps a sign that you would benefit from looking at
ways to break your code down into smaller, simpler, and more
understandable pieces.  More likely, however, you will see this error
with code that was generated automatically, in response to a "deriving"
request on a datatype, or by a tool like a parser generator.  In this
case, changing the Haskell code that is generated is not an option.

Increasing the standard setting for NUM_FIXUPS is certainly an option
here.  You would have to increase it a great deal for there to be any
impact on the speed with which Hugs starts.  In comparison to the
standard heap sizes that people use, the fixups table is a drop in the
ocean.  I think Johan has already increased the size for the Feb 2001
distribution that will be out in a couple of days.

The fixups table could be allocated dynamically, and expand on demand.
To understand why Hugs doesn't do it that way already, you need to go
back more than a decade to Gofer, the system from which Hugs was
derived, which was designed to work on an 8086 with segmented memory
and a maximum of 640K.  Back in those days, when loading the prelude
took 30 seconds (and it was much smaller then too!), a statically
allocated table made sense because it was slightly faster and because
there wasn't any spare memory for it to expand into anyway, even if
you went to the trouble of dynamically allocating it.
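
For comparison, a dynamically allocated table that expands on demand
might look roughly like the following C sketch. This is not something
Hugs contains: the FixupTable type, the function names, the initial
capacity of 16, and the doubling policy are all assumptions chosen for
illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable replacement for a fixed-size fixups array. */
typedef struct {
    int   *addr;      /* addr[label] = resolved address, or -1 if unknown */
    size_t used;      /* number of labels handed out so far */
    size_t capacity;  /* number of entries currently allocated */
} FixupTable;

/* Hand out a new label, doubling the table when it is full.
 * Returns the label number, or -1 if memory is exhausted. */
int fixupNewLabel(FixupTable *t) {
    if (t->used == t->capacity) {
        size_t newCap = t->capacity ? t->capacity * 2 : 16;
        int *p = realloc(t->addr, newCap * sizeof *p);
        if (p == NULL)
            return -1;          /* old table is still valid */
        t->addr = p;
        t->capacity = newCap;
    }
    t->addr[t->used] = -1;      /* address not known yet */
    return (int)t->used++;
}

/* Record the address that an existing label refers to. */
void fixupPlace(FixupTable *t, int label, int address) {
    assert(label >= 0 && (size_t)label < t->used);
    t->addr[label] = address;
}

/* Release the table and return it to the empty state. */
void fixupFree(FixupTable *t) {
    free(t->addr);
    t->addr = NULL;
    t->used = 0;
    t->capacity = 0;
}
```

A table initialised with FixupTable t = {0}; starts empty and grows as
labels are requested, so the only remaining limit is available memory.
The price is an occasional realloc call and a failure path for running
out of memory, which is the trade-off described above.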

Historical note: If you'd like to see the heart of the machine that
ran those original versions of Gofer, come visit me; it's sitting
here on the desk lamp in my office :-)

All the best,
Mark