[Haskell-cafe] hs-plugins and memory leaks

Wed Oct 20 17:13:48 EDT 2010

I was happy to see the recent announcement about hs-plugins being
updated to work with newer versions of GHC.  I have a project for which
I had always been planning to use it.

However, there are some questions I've had about it for a long time.
The 'yi' paper mentions both 'yi' and 'lambdabot' as users of
hs-plugins.  However, both those projects have long since abandoned
it.  I can't find any documentation on why, or indeed any documentation
at all on Yi's dynamic code execution system, but judging from the
source it uses hint for dynamic code execution
and dyre for configuration.  Dyre in turn uses serialization to pass
the old state to the reconfigured app.  So we have retreated from the
idea of hotswapping the application state.
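
To make the serialization approach concrete, here is a minimal sketch
of handing state across a restart by round-tripping it through a
string.  This is not Dyre's actual API: EditorState, saveState and
restoreState are hypothetical names, and the derived Show/Read
instances stand in for whatever encoding a real program would use.

```haskell
import Text.Read (readMaybe)

-- Hypothetical application state.  Every field has to be serializable,
-- so it cannot hold functions or live handles.
data EditorState = EditorState
  { openFile   :: FilePath
  , cursorLine :: Int
  , killRing   :: [String]
  } deriving (Show, Read, Eq)

-- Encode the state just before the old process exits.
saveState :: EditorState -> String
saveState = show

-- Decode it in the freshly built process.  Nothing means the input
-- did not parse, e.g. because the state type changed in between.
restoreState :: String -> Maybe EditorState
restoreState = readMaybe
```

The sketch mainly exposes the constraint: anything that cannot be
written out, such as closures or live GUI handles, is lost across the
restart.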

It seems to me that the advantages as put forth in the 'yi' paper
still hold.  Changing the configuration in yi is rather heavyweight.
Relinking the entire editor takes a long time, and yi is still a
relatively small program.  Editors can keep most of their state on
disk and can have very simple GUI state, so perhaps the serialization
and deserialization isn't such a problem, but this doesn't hold for
other programs.  It seems to me the loss is significant: there's a big
difference between being able to experiment with a command by editing
and rerunning it immediately, and having to wait 10 seconds or more for
the app to recompile, relink, shut down the UI, serialize all state, and
restart.  And if you add hint, you are linking in large parts of GHC,
which makes the link even slower.  So yi is no longer a dynamically
reconfigurable application, and is now merely a configurable
application.

The apparent loss of such a useful feature (you might even say a
defining feature) would presumably only happen if keeping it were
untenable.  And of course that makes me reluctant to commit to any
design that relies on it without first knowing why all existing users
jumped ship.

I can think of one possible reason, and that's a memory leak.  In
ghc/rts/Linker.c:unloadObj there's a commented-out line '//
stgFree(oc->image);'.  In a test program I wrote that behaves like
'plugs', every executed line increases the size of the program by
12-16k.  I have to remove the resolveObjs call from plugs for it to
work, but once I do, plugs displays the same leak.

So my questions are:

Why did lambdabot and yi abandon plugins?

Is unloadObj a guaranteed memory leak?  As far as I can tell, it's
never called within GHC itself.  If the choice is between a function
that leaks memory no matter how you use it and one that is dangerous
but correct when used properly, shouldn't we at least have the latter
available as an option?  E.g. a reallyUnloadObj function that also
frees the image.

If I uncomment that line, will it fix the problem?  Is it safe to do so
if I first force all thunks that might contain unloaded code?

Long shot, but are there any more principled ways to guarantee no
pointers to a chunk of code exist?  The only thing I can think of is
to have the state be totally strict and consist only of types from the
static core.  Would it be possible to hand responsibility for the
memory off to the garbage collector?
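
As a rough sketch of what I mean by "totally strict" state: assume the
static core defines a state type whose fields are all strict and built
only from core types.  Forcing the result of a plugin function to weak
head normal form then forces the whole value, so no thunk referring to
plugin code should survive.  CoreState, SList, applyPlugin and
examplePlugin are hypothetical names, and whether this is actually
sufficient to free an object image safely is exactly the open question.

```haskell
import Control.Exception (evaluate)

-- Types that would be defined in the static core.  All fields are
-- strict, so evaluating a value to weak head normal form evaluates
-- every field as well.
data SList a = Nil | Cons !a !(SList a)
  deriving (Show, Eq)

-- The element type is fixed to Int here: for a polymorphic element
-- the strict field would only force elements to WHNF.
data CoreState = CoreState
  { counter :: !Int
  , history :: !(SList Int)
  } deriving (Show, Eq)

-- Run a step function (imagine it came from a plugin) and force the
-- result before returning, so the new state holds no unevaluated
-- thunks created by that function.
applyPlugin :: (CoreState -> CoreState) -> CoreState -> IO CoreState
applyPlugin step st = evaluate (step st)

-- Stand-in for code that would live in a loaded object file.
examplePlugin :: CoreState -> CoreState
examplePlugin (CoreState n h) = CoreState (n + 1) (Cons n h)
```

This only deals with thunks.  As far as I understand, a fully evaluated
constant that the compiler allocated statically inside the plugin's
object file would still point into the image, so strictness alone may
not be enough.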

GHC now supports dynamic libraries.  Given that plugins may need to
link large portions of the static core "library", can it be loaded as
a dynamic library so both the core and the plugins can share the same
code?  I haven't been able to find many references to GHC's support
for dynamic linking.