[C2hs] Re: support for 6.4

Sun May 22 03:23:59 EDT 2005

On Thursday, 19.05.2005 at 14:32 +0100, Duncan Coutts wrote:
> On Thu, 2005-05-19 at 16:26 +1000, André Pang wrote:
> > On 19/05/2005, at 12:15 AM, Gour wrote:
> > 
> > > btw, gtk2hs devs have a problem with space leaks in c2hs, i.e. one
> > > requires over 1GB of RAM to process gtk2 headers.

First of all, I am not convinced that we have a space leak in
c2hs.  Let's look at what c2hs does.  It runs cpp over a header, which
for GTK+ gives one enormous file with C declarations.  c2hs needs to
read the whole thing, as due to the nature of C, it is impossible to
judge a priori which declarations are relevant for the binding at hand.

This just needs a lot of space.  It is probably possible to come up with
a more efficient representation of the AST, but that would probably be
quite some work to implement.

> Our considered opinion (Axel and myself) is that c2hs's memory
> consumption is a very difficult thing to fix and any fix we might be
> able to come up with would likely be very invasive and so Manuel would
> not be very keen on the idea.

I don't mind it being invasive if

(1) it is not gtk2hs specific (ie, it must be generally useful) and
(2) it doesn't conflict with other features and/or the basic structure.

> One approach I haven't tried but might bear some fruit is to check that
> c2hs is actually using all the data it collects, or if in fact much of
> the AST goes unused in which case it could be eliminated. However I
> don't imagine that this would give any enormous savings (i.e. enough to
> process the Gtk 2.x headers on a machine with 256MB of RAM).

I am sure lots of the AST isn't used, but we won't know until after most
of the work is done.  To do its work, c2hs needs all declarations on which
any symbols bound from Haskell directly or indirectly depend.

> I do have another idea however which I would like to get some feedback
> upon...
> 
> Basically the idea is that we want to run c2hs only on the developer's
> machine and distribute the resulting .hs files. That way only the
> developers' machines need 1GB of RAM. But we also want the resulting .hs
> files to be portable. Portable both between architectures and between
> different versions of Gtk+. For the architecture independence all we
> have to do is make sure we are not using the c2hs {# get #} {# set #}
> features since they embed field offsets into the .hs file which is not
> portable. This is not a great hardship for us since we mostly use hsc2hs
> for doing structure access and c2hs for calling functions.
> 
> Having the .hs files work with different Gtk+ versions is more tricky.
> The idea here is to run cpp on the .hs files after running c2hs, so we
> distribute the .hs files output from c2hs and run cpp over them on the
> target machine when we know what version of Gtk+ we are targeting. For
> this to work we need to run c2hs with the latest version of Gtk+ (since
> it has to be a superset of all versions we intend the .hs files to
> support) and we need to have c2hs pass the preprocessor directives
> through to the .hs files.
> 
> In fact it is slightly more complicated than this. We can't just have
> c2hs ignore the cpp directives since then the problem would be that the
> foreign import declarations that c2hs adds at the end of the .hs file
> would not be in the context of the cpp directives where the {# call #}
> was used. So to fix this problem I have hacked up a patch such that the
> cpp context is output along with the foreign import declarations. This
> takes advantage of the existing feature in c2hs where it interprets the
> cpp directives, but this uses it for a different purpose. I don't think
> this approach is too invasive; it is certainly much less so than our
> existing precompiled headers patch or any proposed heap reduction
> strategy.
> 
> I do not yet know if this approach will work fully, I'm still assessing
> its feasibility. If it does turn out to be a workable approach, I'd be
> keen to discuss with Manuel whether he might accept such a feature
> (controlled by some command line flag) into the main c2hs.

I don't know what you mean by the cpp context.  Moreover, I would like a
clear story on what cpp directives are passed through and what are
interpreted.
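
To make the question concrete, here is my guess at what a shipped .hs
file might look like under this scheme.  It is only a sketch of how I
read the proposal, not actual c2hs output: DEMO_LIB_MINOR, cosine and
c_cos are names I made up, and cos from math.h merely stands in for a
library function that exists only in newer versions.

```haskell
{-# LANGUAGE CPP, ForeignFunctionInterface #-}
-- Sketch of a hypothetical generated file that is shipped to users and
-- run through cpp on the target machine.
module Main (main) where

import Foreign.C.Types (CDouble (..))

-- Hypothetical version macro; a real build would define it from the
-- installed library version.
#ifndef DEMO_LIB_MINOR
#define DEMO_LIB_MINOR 4
#endif

-- User-written binding code.  In the .chs source the call to c_cos
-- would be a {# call #} hook sitting inside this #if.
#if DEMO_LIB_MINOR >= 4
cosine :: Double -> Double
cosine = realToFrac . c_cos . realToFrac
#endif

main :: IO ()
main = do
#if DEMO_LIB_MINOR >= 4
  print (cosine 0)
#else
  putStrLn "cosine is not available in this version"
#endif

-- Import appended at the end of the file by the tool.  The #if around
-- it is what I assume "cpp context" means: the condition copied from
-- the place where the call hook was used.
#if DEMO_LIB_MINOR >= 4
foreign import ccall unsafe "math.h cos"
  c_cos :: CDouble -> CDouble
#endif
```

If that reading is right, the conditionals around user code pass through
untouched, while the tool has to track them well enough to repeat them
around each import it generates.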

What I don't like about this approach is that it is to an extent gtk2hs
specific.  Let me explain.  Not needing c2hs on user machines would be a
Good Thing.  Supporting this only for the subset of features used by
gtk2hs (ie, no set and get hooks) is bad.
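
For readers not familiar with the hooks: the portability problem is that
a get hook expands to a peek at a fixed byte offset, computed from the
headers on the machine that runs c2hs.  A rough sketch of such an
expansion follows; the struct, the name getWidth and the offset 4 are
assumptions for illustration, not taken from any real header.

```haskell
module Main (main) where

import Data.Int (Int32)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- Hypothetical C struct, for illustration only:
--   struct demo { char tag; int width; };
-- On many ABIs `width` sits at byte offset 4, but nothing guarantees
-- that for every target.
data Demo

-- Roughly what a get hook for `width` turns into: the offset is a
-- literal in the .hs file, fixed when the preprocessor was run.
getWidth :: Ptr Demo -> IO Int32
getWidth p = peekByteOff p 4

main :: IO ()
main = allocaBytes 8 $ \p -> do
  -- Emulate C code that filled in the field at the assumed offset.
  pokeByteOff p 4 (640 :: Int32)
  w <- getWidth p
  print w
```

A distributed .hs file containing such a literal is only correct on
targets whose layout matches the machine that generated it.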

Manuel