[Haskell-cafe] some algorithm help with jhc

Tue Sep 13 18:30:01 EDT 2005

I have started working on jhc more recently and have come across some
places where I think my algorithms could be improved but was not sure
exactly where to start so thought I would ask the list since perhaps
someone here has some insight.

After a long time of trying various methods of speeding up the fixpoint
iteration of my points-to analysis (the current main bottleneck) I
decided to step back and look at the basic problem again. It turns out I
can express the problem as one of constraint satisfaction resulting in
much smaller code (600 lines vs 2000) and 10fold speedups with my
unoptimized first draft solver.

It is much faster but still not as fast as I'd like. I don't know a lot
about constraint problems, but my intuiton says this particular problem
is of a type that should be particularly easy to solve but am uncertain
where to start in my searching to find a fast algorithm. My constraints
come in two types of rules.

the rules are of the form (where a and b are variables to be solved for and x and y are values in my domain)

 * a >= f(b)  where x >= y implies f(x) >= f(y)
 * enforce g(a) as a new set of rules  where if x >= y then g(x) subsumes g(y)

f and g are typed thusly.
f :: domain -> domain
g :: domain -> set of rules

the following useful derived rules can be expressed by the above.
a >= b
a == b
if z(a) then b >= c

now, the reason my intuiton says there should be a fast solution is that
there is no way for the variables to decrease. every added rule can only
add to the least solution and not shrink any set.

so, my basic question is, is this a known form of constraint
satisfaction problem? does it admit a particulary fast solution as my
intuition tells me? does it have a name I can search for on
scholar.google.com?

my current first draft implementation represents each rule as an IO
action taking a difference set that propegates said difference set and
then adds it to the current set for each var.

The second problem I am facing is one of debugging. my code dealing with
jhc core and all the optimizations that are performed on it has gone
through a lot of evolution. over time, bugs have been introduced that
are very hard to track down, the symptom would be suddenly having core
that doesn't typecheck or has an unknown identifier or worse a segfault
in the generated program. backtracking to find the error is quite
tedious, mainly involving commenting out transformations until I find
the offending one, then rederiving the correct code and comparing it to
what I have written. 

I have decided to remedy the situation and start using QuickCheck to
verify all my transformations are meaning and type preserving. however,
the problem arrises in how to come up with a meaningful instance of
Arbitrary for expressions in jhc core. It is not clear at all how to
come up with code that generates random yet "interesting", well typed,
convergent, terms in the extended lambda calculus with things like
recursive definitons and primitives. even if I solved that, the problem
of deciding whether two expressions (which might be functions) do the
'same thing' is undecidable, so how do I even test if the
transformations are meaning preserving?  Ideally, someone would have
written a paper on this. I have seen several papers on generating
suitable random graphs for testing graph algorithms, but have not come
across one on creating typed lambda calculus terms. perhaps someone else
has come across this same problem and has some insights?

I am interested in ideas, brainstorming and pointers to papers or terms
to search for as much as ready made solutions. In any case, I think they
are interesting problems to begin with... hopefully someone out there
thinks so too :)

        John

-- 
John Meacham - ⑆repetae.net⑆john⑈