Static values language extension proposal

Mathieu Boespflug 0xbadcode at gmail.com
Fri Jan 24 20:19:04 UTC 2014


[Sorry for the multiple reposts - couldn't quite figure out which
email address doesn't get refused by the list..]


Hi Carter,

thank you for the good points you raise. I'll try and address each of
them as best I can below.

> 0) I think you could actually implement this proposal as a userland library,
> at least as you've described it. Have you tried doing so?

Indeed, this could be done without touching the compiler at all. We
thought long and hard about a path that would ultimately make an
extension either unnecessary, or at any rate very small. At this
point, the only thing that we are proposing to add to the compiler is
the syntactic form "static e". Contrary to the presentation in the
paper, the 'unstatic' function can be implemented entirely as library
code and does not need to be a primop. Moreover, we do not need to
piece together any kind of global remote table at compile time or link
time, because we are piggybacking on the table already constructed by
the system linker.

The `static e` form could just as well be a piece of Template Haskell,
but making it a proper extension means that the compiler can enforce
more invariants and be a bit more helpful to the user. In particular,
it can detect situations where symbolic references cannot be generated
because e.g. the imported packages were not compiled as dynamically
linked libraries. It can also seamlessly support calling `static f` on
an identifier `f` that is not exported by the module.

> 1) what does this accomplish that can not be accomplished by having various
> nodes agree on a DSL, and sending ASTs to each other?
>      1a) in fact, I'd argue (and some others agree, and i'll admit my
> opinions have been shaped by those more expert than me) that the sending a
> wee AST you can interpret on the other side is much SAFER than "sending a
> function symbol thats hard coded hopefully into both programs in a way that
> it means the same thing".

I very much subscribe to the idea of defining small DSLs for
exchanging code between nodes. And this proposal is compatible with
that idea.

One thing that might not have been so clear in the original email is
that we are proposing here to introduce just *one such DSL*. It's just
that it's a trivial one whose grammar only contains linker symbol
names.

As it happens, distributed-static today already supports two such
DSLs: a DSL of labels, which are arbitrary string names for
functions, and a small language for composing Static values together.
There is a patch lying around by Edsko proposing to add a third "DSL":
one that allows nodes to trade arbitrary Haskell strings that are then
eval'ed on the other end by the 'plugins' package.

As Facundo explains at the end of his email, the notion of a "static"
value ought to be a more general one than was first envisioned in the
paper: a static value is any closed denotation, denoted in any of a
choice of multiple small languages, some of which ship standard with
distributed-static. Users can also define their own DSL for shipping
code around.

This is why we propose to make Static into a class. Each DSL is
generated by one datatype. Each such datatype has a Static instance.
If you would like to ship an AST around the cluster, you can make the
datatype for that AST an instance of Static, with 'unstatic' being
defined as an interpreter for your AST.

Concretely:

data HsExpr = ...

instance Static HsExpr where
  unstatic e = Hs.interpret e
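
To make that a bit more tangible, here is a minimal self-contained
sketch with a toy AST. Everything in it is an assumption made for
illustration: the shape of the Static class (one method, indexed by
the type of the denoted value), the Expr datatype and the eval
interpreter are hypothetical and are not the definitions from the
proposal or from distributed-static.

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical class, one instance per DSL. The index 'a' is the type
-- of the value that a term of the DSL denotes.
class Static t where
  unstatic :: t a -> a

-- A toy, closed expression language (illustration only).
data Expr a where
  IntLit :: Int -> Expr Int
  Add    :: Expr Int -> Expr Int -> Expr Int
  IsZero :: Expr Int -> Expr Bool
  If     :: Expr Bool -> Expr a -> Expr a -> Expr a

-- The interpreter for the toy language.
eval :: Expr a -> a
eval (IntLit n) = n
eval (Add x y)  = eval x + eval y
eval (IsZero x) = eval x == 0
eval (If c t e) = if eval c then eval t else eval e

-- 'unstatic' for this DSL is simply its interpreter.
instance Static Expr where
  unstatic = eval

main :: IO ()
main = do
  print (unstatic (Add (IntLit 1) (IntLit 2)))
  print (unstatic (If (IsZero (IntLit 0)) (IntLit 10) (IntLit 20)))
```

A real DSL would also need a serialisation format for its terms; the
sketch leaves that out and only shows 'unstatic' acting as the
interpreter.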

> I've had many educational conversations with

... ?

> 2) how does it provide more type safety than the current TH based approach?
> (I've seen Tim and others hit very very gnarly bugs in cloud haskell based
> upon the "magic static values" approach).

The type safety of the current TH approach is reasonable, I think. One
potential problem comes from managing dynamically typed values in the
remote table: if you don't use TH, you must coerce them to the right
type and use the right decoders by hand. With the approach we propose,
there is no remote table, so I guess this should help eliminate a
source of bugs.
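
To illustrate the kind of mistake I mean, here is a small model of a
label-indexed remote table holding dynamically typed values. It is a
sketch only: the RemoteTable type and the resolve function below are
simplified stand-ins written for this email, not the
distributed-static API. Only Data.Dynamic from base is used.

```haskell
import Data.Dynamic (Dynamic, toDyn, fromDynamic)
import Data.Typeable (Typeable)

-- Hypothetical, simplified remote table: labels mapped to dynamically
-- typed values.
type RemoteTable = [(String, Dynamic)]

remoteTable :: RemoteTable
remoteTable = [("double", toDyn ((* 2) :: Int -> Int))]

-- Look a label up and coerce the result to the type the caller
-- expects. Both steps can fail, and only at run time.
resolve :: Typeable a => RemoteTable -> String -> Maybe a
resolve table label = lookup label table >>= fromDynamic

main :: IO ()
main = do
  -- Right label, right type.
  print (fmap ($ 21) (resolve remoteTable "double" :: Maybe (Int -> Int)))
  -- Right label, wrong type: the coercion fails.
  print (fmap ($ 21) (resolve remoteTable "double" :: Maybe (Integer -> Integer)))
  -- Label that was never registered: the lookup fails.
  print (fmap ($ 21) (resolve remoteTable "triple" :: Maybe (Int -> Int)))
```

The first call prints Just 42 and the other two print Nothing. Neither
failure is caught by the type checker, which is the class of bug that
goes away when there is no table to maintain.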

> 3) this proposal requires changes to linking etc that would really make it
> useful only on systems and deployments that only have Template Haskell AND
> Dynamic linking.  (and also rules out any context where it'd be nice to
> deploy a static app or say, use CH in ios! )

I don't know about iOS. And it's very likely that there are contexts
in which this extension doesn't work. But as I said above, you are
always free to define your own DSLs that cover the particular use
case that you have in mind. The nice thing with this particular DSL is
that it requires little to no TH to generate label names. Such labels
can always be a source of bugs, especially when you forget to include
them in the global remote table (which is something that TH doesn't
and can't help you with).

Furthermore, it was my understanding that GHC is heading towards a
world of "dynamic linkable by default", and dynamic linking is by now
supported by GHC on most platforms. See e.g.

https://ghc.haskell.org/trac/ghc/wiki/DynamicGhcPrograms

There are fairly good solutions for deploying self-contained
dynamically linked apps these days, e.g. Docker. And in any case, with
a few extra flags we can still do away with the dynamic linking
requirement on some (all?) platforms.

> to repeat: have you considered defining an AST type + interpreter for the
> computations you want to send around, and doing that? I think its a much
> simpler, safer, easier, flexible and PORTABLE approach, though one current
> CH doesn't do (though the folks working on CH seem to be receptive to
> switching to such a strategy if someone validates it)

We have, and it's an option with different tradeoffs. Both solutions
could gainfully live side by side and are in fact complementary. I
contend that the solution described by Facundo has the advantage of
eliminating much of the syntactic overhead associated with sending
references to (higher-order) values across the cluster. We have more
ideas, specific to distributed-process, for reducing the syntactic
overhead even further, to practically nothing. We can discuss those in
a separate thread.

Best,

Mathieu


More information about the Glasgow-haskell-users mailing list