[C2hs] precompiled headers with c2hs
Axel Simon
A.Simon at kent.ac.uk
Sat Oct 2 10:50:02 EDT 2004
Good afternoon,
at
http://www.cs.kent.ac.uk/people/staff/as49/c2hsPatched.tgz
you will find a patched version 0.13.1 of c2hs that supports precompiled
headers. While parsing and name analysis take as much time as they did
before, writing and reading precompiled header information is quick.
Hence converting several files that are based on the same header file is
much quicker, because the header is precompiled once and the precompiled
header is then reused for each .chs module.
AFAIK Manuel's main concern was that precompilation doesn't fit with generating code
and conditionals in his newer versions of c2hs since these .chs modules
would output a bit of C which has to be translated together with the
header file. I took a pragmatic approach to solve this. Suppose you have
the header file calls.h and the binding file Calls.chs. The normal way of
translating is
c2hs calls.h Calls.chs
or, with the newer versions, which can peek into the .chs file to figure
out which header file to use, you can simply say
c2hs Calls.chs
Most binding modules (at least those modules I am interested in) do not
have conditionals or produce C declarations. For those you can now say
c2hs --precomp=calls.h.precomp calls.h
giving you a precompiled version of calls.h. As a second step you say
c2hs --precomp=calls.h.precomp Calls.chs
which translates Calls.chs using the precompiled header file
calls.h.precomp. You may also say
c2hs --precomp=calls.h.precomp calls.h Calls.chs
which generates the precompiled header and expands the module Calls.chs
using the precompiled header.
In case Calls.chs contains any C declarations that need to be parsed, the
whole header file is parsed as normal (like in the first example) and the
precompiled header information is written, but not used.
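
The two-step usage above extends naturally to several binding modules that
share one header. The following is only a sketch of how a build script might
assemble the argument lists; it assumes the --precomp flag and the
calls.h.precomp naming shown above, and that none of the modules contain C
declarations or conditionals. The function names are hypothetical and not
part of c2hs.

```haskell
-- Name of the precompiled file for a header, following the
-- calls.h -> calls.h.precomp convention used in the examples above.
precompFile :: FilePath -> FilePath
precompFile header = header ++ ".precomp"

-- Argument lists for successive c2hs runs (hypothetical helper):
-- first one run that precompiles the header, then one run per
-- binding module that reuses the precompiled file.
c2hsRuns :: FilePath -> [FilePath] -> [[String]]
c2hsRuns header modules =
  [flag, header] : [[flag, m] | m <- modules]
  where
    flag = "--precomp=" ++ precompFile header
```

Each inner list could then be handed to something like
System.Process.callProcess "c2hs" args, so the header is parsed only in the
first run.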
There are a couple of things that need to be addressed:
- the precompiled header files are quite large (about 20 MB for Gtk 2),
which is mostly due to the file names in identifiers, which are written
out for every identifier. Ideally a new string table should be added to
the global monad and the String in the file name attribute should be
replaced by a UName, which is an index into a string table. The string
table of file names can then be written separately and every file name
only appears once. I don't understand enough about the structure and the
monads in c2hs to estimate how difficult this change is, so I'd rather
leave this up to Manuel.
- I applied the ForeignPtr patch we use for gtk2hs to correct argument
passing of ForeignPtrs, which have to be unwrapped and passed as Ptrs.
This fix is still broken for newtypes of Ptrs and types without a newtype
wrapper. The fix should not be too hard.
- I hope the improvement that Duncan found in the token conversion is not
yet applied to Manuel's version 0.13.1, since that version is
incredibly slow. This has nothing to do with the precompilation stuff; it
was that slow before.
- To read files lazily I used memory-mapped read-only files (which are
like a big constant). They work great but are not very portable. I put the
non-portable bits in SysDepGHC6.hs but haven't bothered to write this code
for other compilers and versions.
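
The string table idea from the first point can be sketched on its own,
outside of c2hs. This is a minimal illustration, not the c2hs data
structures: FileId stands in for the UName index, and FileTable, internFile,
internAll and tableNames are hypothetical names. In c2hs the table would live
in the global monad; here it is simply threaded through by hand.

```haskell
import Data.List (mapAccumL, sortOn)
import qualified Data.Map.Strict as Map

-- Hypothetical index into the file-name table (plays the role of a UName).
newtype FileId = FileId Int
  deriving (Eq, Ord, Show)

-- Each distinct file name mapped to its index, plus the next free index.
data FileTable = FileTable
  { tableIds  :: Map.Map FilePath FileId
  , tableNext :: Int
  }

emptyTable :: FileTable
emptyTable = FileTable Map.empty 0

-- Return the index for a file name, adding the name on first use.
internFile :: FilePath -> FileTable -> (FileId, FileTable)
internFile name tbl@(FileTable ids next) =
  case Map.lookup name ids of
    Just fid -> (fid, tbl)
    Nothing  ->
      let fid = FileId next
      in  (fid, FileTable (Map.insert name fid ids) (next + 1))

-- Intern a list of file names, threading the table from left to right.
internAll :: FileTable -> [FilePath] -> (FileTable, [FileId])
internAll = mapAccumL step
  where
    step tbl name = let (fid, tbl') = internFile name tbl in (tbl', fid)

-- The distinct file names in index order: what would be written out once.
tableNames :: FileTable -> [FilePath]
tableNames tbl = map fst (sortOn snd (Map.toList (tableIds tbl)))
```

With this layout every identifier would carry only a small index, and the
list returned by tableNames would be serialised separately, so each file name
appears once in the precompiled file.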
Let me know what you think,
Axel.