Static values language extension proposal

Tue Jan 28 12:53:33 UTC 2014

Hi Carter, Tim,

On Sat, Jan 25, 2014 at 7:12 PM, Carter Schonwald
<carter.schonwald at gmail.com> wrote:
> anyways
>
> 1) you should (once 7.8 is out) evaluate how far you can push your ideas wrt
> dynamic loading as a user land library.
>  If you can't make it work as a library and can demonstrate why (or how even
> though it works its not quite satisfactory), thats signals something!

Signals what?

On Sun, Jan 26, 2014 at 7:43 PM, Tim Watson <watson.timothy at gmail.com> wrote:
> Is that something you'll consider looking at Matthieu?

We would prefer to do it that way, to be honest. As explained in my
previous email, we identified two problems with this approach:

1) User friendliness. It's important for us that Cloud Haskell be
pretty much as user friendly and easy to use as Erlang is.

    a) I don't know that it's possible from Template Haskell to detect
and warn the user when dependent modules have not been compiled into
dynamic object code or into static code with the right flags.

    b) It's very convenient in practice to be able to send not just
`f` if `f` is a global identifier, but in general `e` where `e` is any
closed expression mentioning only global names. That can easily be
done by having the compiler float the expression `e` to the top-level
and give it a global name. I don't see how to do that in TH in a user
friendly way.

2) A technical issue: you ought to be able to send unexported
functions across the wire, just as you can pass unexported functions
as arguments to higher-order functions. Yet GHC does not create linker
symbols for unexported identifiers, so our approach would break down.
Worse, I don't think that it's even possible to detect in TH whether
an identifier is exported or not, in order to warn the user. One could
imagine a compiler flag to force the creation of linker symbols for
all toplevel bindings, exported or unexported. But that seems
wasteful, and potentially not very user friendly.
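
To illustrate point 1b, here is a minimal sketch of what floating a
closed expression to the top level amounts to. The name `floated1` is
hypothetical and stands for whatever fresh global name a compiler would
generate; the commented-out `send` call is only the shape of what a
user would like to write, not a real API.

```haskell
module Main where

-- What the user would like to write (hypothetical call, shown as a comment):
--
--   send pid (\x -> x * 2 + 1)
--
-- The lambda is closed: it mentions only global names ((*), (+)) and
-- its own bound variable. It can therefore be lifted to the top level
-- unchanged and given a global name that both ends can refer to.

-- Hypothetical compiler-generated binding for the closed expression.
floated1 :: Int -> Int
floated1 = \x -> x * 2 + 1

main :: IO ()
main = print (map floated1 [1, 2, 3])
```

Doing this by hand for every expression is exactly the boilerplate that
a compiler-side transformation would remove.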

If the above can be solved, all the better!

If not: we don't always want to touch the compiler, but when we do,
ideally it should be in an unintrusive way. I contend our proposal
fits that criterion. And our cursory implementation efforts seem to
confirm that so far.

> But I really think insisting that the linker symbol names denote the "datum
> agreement" in a distributed system is punting on what should be handled at
> the application level. Simon Marlow put some improvements into GHC to help
> improve doing dynamic code (un)loading, stress test that!

We could use either the system linker or the RTS linker. I'm not sure
that it makes any difference at the application level.

> 2) I've a work in progress on specing out a proper (and sound :) ) static
> values type extension for ghc, that will be usable perhaps in your your case
> (though by dint of being sound, will preclude some of the things you think
> you want).

I look forward to hearing more about that. How is the existing
proposal not (type?) sound?

> BUT, any type system changes need to actually provide safety.

To be clear, this proposal doesn't touch the type checker in any way.

> As for *how* to send an AST fragment, edward kmett and other have some
> pretty nice typed AST models that are easy to adapt and extend for an
> application specific use case. Bound
> http://hackage.haskell.org/package/bound is one nice one.
>
> heres a really really good school of haskell exposition
> https://www.fpcomplete.com/user/edwardk/bound

These are nice encodings for ASTs. But they don't address how to
minimize the amount of code to ship around the cluster. If you have no
agreement about what functions are commonly available, then the AST
needs to include the code for the function you are sending, plus any
functions it depends on, plus any of their dependencies, and so on
transitively.
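
A rough sketch of why this grows quickly: with no shared code, the
payload for one function is the transitive closure of its dependency
graph. The graph and the names in it (`job`, `parse`, `render`, ...)
are made up for illustration.

```haskell
module Main where

import qualified Data.Map as Map
import qualified Data.Set as Set

type Name = String

-- Hypothetical dependency graph: each function is mapped to the
-- functions it calls directly.
deps :: Map.Map Name [Name]
deps = Map.fromList
  [ ("job",    ["parse", "render"])
  , ("parse",  ["lexer"])
  , ("render", ["lexer", "pretty"])
  , ("lexer",  [])
  , ("pretty", [])
  ]

-- Everything that has to be shipped along with `root` when the
-- receiver shares no code with the sender: root itself plus all of
-- its direct and indirect dependencies.
closure :: Map.Map Name [Name] -> Name -> Set.Set Name
closure g root = go Set.empty [root]
  where
    go seen [] = seen
    go seen (n : rest)
      | n `Set.member` seen = go seen rest
      | otherwise =
          go (Set.insert n seen) (Map.findWithDefault [] n g ++ rest)

main :: IO ()
main = print (Set.toList (closure deps "job"))
```

Sending `job` alone drags in every other entry of the graph. If both
ends already agree on the names, only the name `job` needs to travel.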

Tim, perhaps the following also answers some of your questions. This
is where the current proposal comes in: if you choose to ship around
ASTs, you can minimize their size by having them mention shared
linker symbol names. Mind, that's already possible today, by means of
the global RemoteTable. What is difficult is building that remote
table safely, conveniently, and in a modular way, with static checking
that no symbols from any of the modules that were linked at build time
were missed.

By avoiding a RemoteTable entirely, we avoid having to solve that
difficult problem. :)
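
For concreteness, here is a toy model of the problem with a
hand-maintained table. It is a sketch only: `Table`, `resolve`,
`double` and `increment` are illustrative names, and this is not the
actual RemoteTable API from distributed-process.

```haskell
module Main where

import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import qualified Data.Map as Map

-- Toy stand-in for a remote table: labels mapped to dynamically
-- typed values. Every entry has to be registered by hand.
type Table = Map.Map String Dynamic

double :: Int -> Int
double = (* 2)

increment :: Int -> Int
increment = (+ 1)

table :: Table
table = Map.fromList
  [ ("double", toDyn double)
    -- "increment" was forgotten here. Nothing reports the omission
    -- at compile time; it only shows up as a failed lookup at run time.
  ]

-- Look a label up and check that it has the expected type.
resolve :: String -> Table -> Maybe (Int -> Int)
resolve label t = Map.lookup label t >>= fromDynamic

main :: IO ()
main = do
  print (fmap ($ 21) (resolve "double" table))
  print (fmap ($ 21) (resolve "increment" table))
```

The second lookup fails at run time even though `increment` is linked
into the program, which is the kind of missed symbol that static
checking would catch.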

Best,

-- 
Mathieu Boespflug
Founder at http://tweag.io.