[Haskell-cafe] A distributed and replicating native Haskell database

Joel Reymont joelr1 at gmail.com
Fri Feb 2 10:20:34 EST 2007


On Feb 2, 2007, at 3:06 PM, Paul Johnson wrote:

> As a rule, storing functions along with data is a can of worms.  
> Either you actually store the code as a BLOB or you store a pointer  
> to the function in memory. Either way you run into problems when  
> you upgrade your software and expect the stored functions to work  
> in the new context.

ACache does not store code in the database. You cannot read the  
database unless you have your original class code. ACache may store  
the "schema", i.e. the parent class names, slot names, etc.

> Erlang also has a very disciplined approach to code updates, which  
> presumably helps a lot when functions are stored.

No storing of code here either. What you store in Erlang is just
tuples, so there's no schema or class definition. No functions are
stored, since any Erlang code can fetch the tuples from Mnesia. You do
need to have the original record definition around, but only so you
can refer to tuple elements by field name rather than by position.
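
For comparison, a rough Haskell analogue of that Erlang record
definition, using the binary package. The Player type and its fields
are made up for illustration; the point is that the stored value is
just a flat sequence of fields, and the record definition exists only
so client code can name them:

{-# LANGUAGE DeriveGeneric #-}
module Player where

import Data.Binary (Binary, decode, encode)
import GHC.Generics (Generic)

-- Plays the role of Erlang's -record(player, {name, balance}).
data Player = Player
  { playerName    :: String
  , playerBalance :: Integer
  } deriving (Show, Generic)

instance Binary Player  -- generic derivation of the wire format

-- Any code holding this definition can decode the stored bytes and
-- refer to fields by accessor name rather than by tuple position.
roundTrip :: Player
roundTrip = decode (encode (Player "joel" 1000))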

> I very much admire Mnesia, even though I'm not an Erlang  
> programmer. It would indeed be really cool to have something like  
> that. But Mnesia is built on the Erlang OTP middleware. I would  
> suggest that Haskell needs a middleware with the same sort of  
> capabilities first. Then we can build a database on top of it.

Right. That would be a prerequisite.

> The real headache is type safety. Erlang is entirely dynamically  
> typed, so untyped schemas with column values looked up by name at  
> run-time fit right in, and it's up to the programmer to manage  
> schema and code evolution to prevent errors. Doing all this in a  
> statically type safe way is another layer of complexity and checking.

I believe Lambdabot does schema evolution.
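
For what it's worth, here is a minimal sketch of what version-tagged
schema evolution could look like with the binary package. The Account
type, its fields, and the version numbering are all invented for
illustration:

module Evolve where

import Data.Binary (Binary (..))
import Data.Binary.Get (getWord8)
import Data.Binary.Put (putWord8)

data Account = Account
  { accOwner :: String
  , accLimit :: Integer  -- field added in schema version 2
  } deriving Show

instance Binary Account where
  put (Account owner limit) = do
    putWord8 2                        -- always write the current version
    put owner
    put limit
  get = do
    v <- getWord8
    case v of
      1 -> do owner <- get            -- version 1 had no limit field,
              pure (Account owner 0)  -- so default it on the way in
      2 -> Account <$> get <*> get
      _ -> fail ("unknown Account version: " ++ show v)

The safecopy package automates this pattern of versioned migrations,
if you'd rather not hand-roll it.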

> Alternatively the protocol can be defined in a special purpose  
> protocol module P, and A and B then import P. This is the approach  
> taken by CORBA with IDL. However what happens if P is updated to  
> P'? Does this mean that both A and B need to be recompiled and  
> restarted simultaneously? Requiring this is a Bad Thing; imagine if  
> every bank in the world had to upgrade and restart its computers  
> simultaneously in order to upgrade a common protocol.

I would go for the middle ground and dump the issue entirely. Let's be
practical here: when a binary protocol is updated, all code using the
protocol needs to be updated. That would be good enough. It would suit
me just fine too, as I'm not yearning for CORBA; I just want to build
a trading infrastructure entirely in Haskell.

> There is still the possibility of a run-time failure at the  
> protocol negotiation stage of course, if it transpires that the two  
> processes have no common protocol.

So no protocol negotiation!
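
To illustrate the stance, a minimal sketch, with hypothetical
sendByte/recvByte callbacks standing in for a real socket: each side
sends a hard-coded version byte and aborts on mismatch, with no
fallback path at all.

module Handshake where

import Control.Monad (when)
import Data.Word (Word8)

protocolVersion :: Word8
protocolVersion = 3  -- bump whenever the wire format changes

-- No negotiation: unequal versions mean someone has to upgrade.
handshake :: (Word8 -> IO ()) -> IO Word8 -> IO ()
handshake sendByte recvByte = do
  sendByte protocolVersion
  theirs <- recvByte
  when (theirs /= protocolVersion) $
    fail ("protocol mismatch: peer speaks version " ++ show theirs)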

> However there is a wrinkle here: what about "pass through"  
> processes which don't interpret the data but just store and forward  
> it. Various forms of protocol adapter fit this scenario, as does  
> the database you originally asked about.

Any packet traveling over the wire would need to have a size followed
by a body. Any pass-through process can then take the binary blob and
re-send it unchanged.
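
A sketch of that framing, assuming a 32-bit big-endian size header
(the header width is my choice, nothing agreed on): a relay only
decodes the header and can forward the body as an opaque blob.

module Framing where

import Data.Binary.Get (getLazyByteString, getWord32be, runGet)
import Data.Binary.Put (putLazyByteString, putWord32be, runPut)
import qualified Data.ByteString.Lazy as BL

-- Prefix a body with its length.
frame :: BL.ByteString -> BL.ByteString
frame body = runPut $ do
  putWord32be (fromIntegral (BL.length body))
  putLazyByteString body

-- Read back one packet's body without interpreting it, so a
-- pass-through process can re-send the same bytes verbatim.
unframe :: BL.ByteString -> BL.ByteString
unframe = runGet $ do
  len <- getWord32be
  getLazyByteString (fromIntegral len)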

	Thanks, Joel

--
http://wagerlabs.com/