[C2hs] precompiled headers with c2hs

Axel Simon A.Simon at kent.ac.uk
Sat Oct 2 10:50:02 EDT 2004


Good afternoon,

at

http://www.cs.kent.ac.uk/people/staff/as49/c2hsPatched.tgz

you find a patched version 0.13.1 of c2hs that supports precompiled
headers. While parsing and name analysis take as much time as they did
before, writing and reading precompiled header information is quick.
Hence converting several files that are based on the same header file is
much quicker, because the header is precompiled once and the precompiled
header is then reused for every .chs module.

AFAIK Manuel's main concern was that precompilation doesn't fit with the
code generation and conditionals in his newer versions of c2hs, since such
.chs modules output a bit of C which has to be translated together with
the header file. I took a pragmatic approach to solve this. Suppose you
have the header file calls.h and the binding file Calls.chs. The normal
way of translating is

c2hs calls.h Calls.chs

or, with the newer versions, which can peek into the .chs file to figure
out which header file to use, you can simply say

c2hs Calls.chs

Most binding modules (at least those modules I am interested in) do not
have conditionals or produce C declarations. For those you can now say

c2hs --precomp=calls.h.precomp calls.h

giving you a precompiled version of calls.h. As a second step you say

c2hs --precomp=calls.h.precomp Calls.chs

which translates Calls.chs using the precompiled header file 
calls.h.precomp. You may also say

c2hs --precomp=calls.h.precomp calls.h Calls.chs

which generates the precompiled header and expands the module Calls.chs 
using the precompiled header.  
In case Calls.chs contains any C declarations that need to be parsed, the
whole header file is parsed as normal (like in the first example) and the
precompiled header information is written, but not used.

There are a couple of things that need to be addressed:

- the precompiled header files are quite large (about 20 MB for Gtk 2), 
which is mostly due to the file names in identifiers, which are written 
out for every identifier. Ideally a new string table should be added to 
the global monad and the String in the file name attribute should be 
replaced by a UName, which is an index into a string table. The string 
table of file names can then be written separately and every file name 
only appears once. I don't understand enough about the structure and the 
monads in c2hs to estimate how difficult this change is, so I'd rather 
leave this up to Manuel.

- I applied my ForeignPtr patch we use for gtk2hs to correct argument 
passing of ForeignPtrs which have to be unwrapped and passed as Ptrs. 
This fix is still broken for newtypes of Ptrs and types without a newtype 
wrapper. The fix should not be too hard.

- I hope the improvement that Duncan found in the token conversion is not 
yet applied to Manuel's version 0.13.1 since the mentioned version is 
incredibly slow. This has nothing to do with the precompilation stuff; it 
was that slow before.

- To read files lazily I used memory-mapped read-only files (which are 
like a big constant). They work great but are not very portable. I put the 
non-portable bits in SysDepGHC6.hs but haven't bothered to write this code 
for other compilers and versions.
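
To make the string table idea from the first point a bit more concrete,
here is a minimal standalone sketch. It is not code from c2hs: the names
FileId, StringTable, intern and Ident are hypothetical, FileId merely
stands in for the UName-style index, and the table is threaded by hand
rather than living in the global monad. It only shows that each file name
is stored once and identifiers carry a small index instead of a String.

```haskell
module Main where

import Data.List (mapAccumL, sortOn)
import qualified Data.Map.Strict as Map

-- Hypothetical index into the file-name table (stand-in for a UName).
newtype FileId = FileId Int
  deriving (Eq, Ord, Show)

-- Hypothetical string table: maps each distinct file name to its index.
data StringTable = StringTable
  { tableIds  :: Map.Map FilePath FileId
  , tableSize :: Int
  }

emptyTable :: StringTable
emptyTable = StringTable Map.empty 0

-- Look up a file name, adding it with the next free index if it is new.
intern :: StringTable -> FilePath -> (StringTable, FileId)
intern tbl name =
  case Map.lookup name (tableIds tbl) of
    Just fid -> (tbl, fid)
    Nothing  ->
      let fid = FileId (tableSize tbl)
      in  ( StringTable (Map.insert name fid (tableIds tbl))
                        (tableSize tbl + 1)
          , fid )

-- File names in index order; this list would be written out once.
tableNames :: StringTable -> [FilePath]
tableNames tbl = map fst (sortOn snd (Map.toList (tableIds tbl)))

-- Simplified identifier: a name plus the index of its defining file.
data Ident = Ident
  { identName :: String
  , identFile :: FileId
  } deriving (Eq, Show)

-- Replace the file name of every (identifier, file) pair by an index.
internAll :: [(String, FilePath)] -> (StringTable, [Ident])
internAll = mapAccumL step emptyTable
  where
    step tbl (n, f) =
      let (tbl', fid) = intern tbl f
      in  (tbl', Ident n fid)

main :: IO ()
main = do
  let raw = [ ("gtk_widget_show", "gtk/gtkwidget.h")
            , ("gtk_button_new",  "gtk/gtkbutton.h")
            , ("gtk_widget_hide", "gtk/gtkwidget.h")
            ]
      (tbl, idents) = internAll raw
  mapM_ putStrLn (tableNames tbl)  -- each file name appears only once
  mapM_ print idents               -- identifiers refer to names by index
```

When reading a precompiled header back, the list from tableNames would be
loaded first, so that an index can be turned into a file name again when
an error message needs one.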

Let me know what you think,

Axel.


