GHC and C++ (was RE: Creating COM objects and passing out pointers to them via a COM interface)

Tue, 4 Feb 2003 10:34:20 -0000

I'm replying to two threads at the same time and cross posting my reply,
because they are very relevant to each other. I'm sorry if anyone here ends
up seeing this more than once as a consequence.

------
> Sarah
>
> Did you get this problem sorted out?

Not directly. I ended up building an 'outer' COM component in C++ that
mapped an MSXML-like interface onto a simpler, internal COM component that
wraps up the Haskell code. This approach seems to work well - I might have
been able to build the necessary wrapper directly in Haskell using HDirect,
but I was becoming aware that I was probably starting to take up an
unreasonable amount of Sigbjorn's time with increasingly detailed questions
and I didn't have time to figure out the necessary intricacies from scratch.

HDirect seems very good, but I don't think it's currently in quite as stable
a state as the main GHC release. The problems I had in building a version
that works with the current GHC are an example of this. I'm not
complaining - it worked well in the end, but I do think it might be a 'good
thing' to consider rolling HDirect into the main GHC release at some point.

The other thing that bit me was exception handling, or rather the default
behaviour when an uncaught exception 'falls out' and reaches the top level
of the runtime system. Currently, the behaviour appears to be to abort the
current process. Whilst this is probably desirable for a typical
application executable (be that console or GUI based), it is a problem if it
happens inside a COM component. My thinking is quite straightforward on
this. The preferred behaviour should really be for a call to a COM component
to fail and return an error code, rather than abort the calling process.
'Server-style' applications are usually expected to keep running when minor
faults occur, rather than terminate.

HDirect/GHC's current behaviour makes this difficult to achieve - I managed
to catch exceptions successfully, but the price was needing to use some code
based on DeepSeq to force deep strict evaluation for all state updates. This
(seemingly) was the only way I could stop exceptions from happening in the
context of the automatically generated COM wrapper - passing it a fully
evaluated data structure was apparently the only solution available.

However, this does lose the advantages of laziness as applied to the current
state of the object - it would be pretty neat, for example, to be able to
define a view of a database as the result of an expression involving
relational algebra combinators, with the query evaluated one record at a
time as the data is pulled back by the client application. The need for
DeepSeq forces strict evaluation, so this benefit is lost. This is actually
more of an issue than it might immediately seem - many queries don't
actually need to retrieve all of a record set before useful things can be
done. A particular example is updating a GUI, where a list box reflects the
results of a query. Lazy evaluation will allow the screen to start updating
more or less immediately even on a query that returns thousands of records -
strict evaluation might incur a delay of a few seconds (for a very complex
query or a large database) before updates can start happening.
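
To make the laziness point concrete, here is a minimal sketch of a view
built from list-based combinators. The names Record, select, project, table
and firstPage are all made up for illustration (they are not part of HDirect
or of my component), and the table is deliberately infinite so that only a
lazy consumer can ever finish:

```haskell
-- Hypothetical record type: a key plus a text field.
type Record = (Int, String)

-- Relational-style combinators, here just thin wrappers over list functions.
select :: (r -> Bool) -> [r] -> [r]
select = filter

project :: (r -> a) -> [r] -> [a]
project = map

-- An unbounded table, standing in for a large database.
table :: [Record]
table = [(n, "row " ++ show n) | n <- [1 ..]]

-- A view defined as an expression; nothing is evaluated at this point.
view :: [String]
view = project snd (select (even . fst) table)

-- The client pulls only as many records as it needs to fill the screen.
firstPage :: Int -> [String]
firstPage n = take n view
```

With these definitions, firstPage 3 gives ["row 2","row 4","row 6"] and
touches only the first six records. Deep-forcing view before handing it over
would try to evaluate the whole table, which is exactly the cost described
above.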

In an ideal world, what I'd dearly love to see would be an HDirect option
that allows a choice between existing functionality and having calls return
a failure code, or better still allowing a user-supplied handler function to
determine the correct course of action.
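
As a rough illustration of what I mean, the sketch below wraps a method body
so that its result is deep-forced inside the exception handler, and any
exception is passed to a user-supplied handler that picks the failure code.
This is not HDirect's API: HResult, Handler, guardCall and failWithEFail are
names I've invented for the example, and only the numeric value of eFail is
meant to match the usual E_FAIL constant.

```haskell
import Control.DeepSeq (NFData, force)
import Control.Exception (SomeException, evaluate, try)
import Data.Int (Int32)

-- Hypothetical stand-in for a COM HRESULT.
type HResult = Int32

-- Same numeric value as E_FAIL (0x80004005) as a signed 32-bit integer.
eFail :: HResult
eFail = -2147467259

-- A user-supplied policy: inspect the exception, choose a failure code.
type Handler = SomeException -> IO HResult

-- Run a method body, force its result completely while still inside 'try',
-- and hand any exception to the handler instead of letting it escape.
guardCall :: NFData a => Handler -> IO a -> IO (Either HResult a)
guardCall handler body = do
  outcome <- try (body >>= evaluate . force)
  case outcome of
    Left err    -> Left <$> handler err
    Right value -> return (Right value)

-- The simplest possible policy: every exception becomes E_FAIL.
failWithEFail :: Handler
failWithEFail _ = return eFail
```

For example, guardCall failWithEFail (return [1, error "boom", 3 :: Int])
yields Left eFail rather than taking the process down. The force call is the
DeepSeq-style workaround; without it, the error hidden inside the list would
only surface later, outside the handler.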

So, to sum up my wish list:

1. A facility to replace the default uncaught exception behaviour for COM
objects and similar shared library code

2. HDirect rolled into the main GHC tree

3. A facility within HDirect to create COM objects based on existing
(Haskell) data structures, returning an interface pointer as a return
parameter on a call to an existing COM object. (This was the question Simon
asked if I'd had an answer to - I've attached a copy below for reference).

Sarah

PS: I've not yet looked closely at the .NET integration in GHC, but I can
imagine myself having a very similar wish list in that case too.

--------
Hi again,

A slightly harder problem this time. I want to implement something that
looks something like the Microsoft XML parser. This is a very COM heavy
contraption that provides an interface that looks to client applications
like a tree of COM objects that reflects the internal structure of an XML
document.

The idea is, you start with the document object, make a method call
requesting the root node, and have a COM object that wraps that node
returned to you. You can then request the children of that node, and their
children and siblings, all of which involves being passed back interface
pointers to COM objects. Effectively, there is one COM object per syntactic
item in the XML document.

I already have my document object working nicely. This is where the actual
database resides. I want to be able to spawn views of this database as COM
objects. When I've done this in the past (using ATL mostly, euuuugh), I've
used an IDL definition something like

   HRESULT CreateView([in,string] BSTR query, [out,retval] IDbView **);

What's the correct way to implement the necessary Haskell code to create the
coclass and return a pointer to it? I fiddled around for ages, but either
couldn't get GHC to like the types I was using, or when I finally got it to
compile, I found that my test harness was being passed an invalid COM
object. This should be simple stuff, so I'm sure someone will have done it!

Secondly, it is a nasty, though fairly common, trick, if you know your
objects are going to be created in the same process space, to allow them
some 'behind the scenes' communication. In my case, I'd like to avoid
needing to serialise and deserialise the data in a view during creation of
these child objects (of which there may be a great many, and which may be
created on the fly rather often). Ideally, I'd like to be able to do
something like:

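import Data.IORef (newIORef, readIORef)

-- State is assumed to wrap an IORef holding the database contents.
newView :: String -> State -> IO State
newView query (State st) = do
    db <- readIORef st   -- the parent's data, shared rather than serialised
    r <- newIORef db     -- a fresh reference for the new view
    -- Insert some kind of COM magic here (the query is not applied yet)
    return (State r)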

so the database content gets copied within the Haskell environment (lazily,
I'd hope), without needing to be serialised and deserialised. Is this
feasible? One (messy) approach might be to create a bunch of extra
interfaces on the main database object, then create 'thin wrapper' child
objects that simply make calls back into the parent object. I'm not so keen
on that for various reasons, however.
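
Leaving the COM side out entirely, the pure-Haskell half of this can be
sketched as below. Database, runQuery and rows are hypothetical helpers I've
added so the example stands alone, and the 'query' is just a substring match
on the text field rather than anything resembling a real query language:

```haskell
import Data.IORef (IORef, newIORef, readIORef)
import Data.List (isInfixOf)

-- Hypothetical database representation: a list of keyed text records.
type Database = [(Int, String)]

newtype State = State (IORef Database)

-- Stand-in for a real query engine: keep records whose text contains the query.
runQuery :: String -> Database -> Database
runQuery query = filter (isInfixOf query . snd)

-- Create a child view over the parent's data. The filtered list is stored
-- as an unevaluated thunk, so nothing is copied or serialised up front.
newView :: String -> State -> IO State
newView query (State st) = do
  db <- readIORef st
  r <- newIORef (runQuery query db)
  return (State r)

-- Read back the records visible through a view.
rows :: State -> IO Database
rows (State st) = readIORef st
```

Given a parent holding [(1,"apple"),(2,"banana"),(3,"grape")], a view created
with newView "ap" sees records 1 and 3. What this sketch does not show is
the part I'm actually stuck on: getting a COM interface pointer that refers
to the new State.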

Sarah