Static values language extension proposal

Sun Jan 26 18:43:41 UTC 2014

On 25 Jan 2014, at 18:12, Carter Schonwald wrote:

> 1) you should (once 7.8 is out) evaluate how far you can push your ideas wrt dynamic loading as a user land library.
> If you can't make it work as a library and can demonstrate why (or how, even though it works, it's not quite satisfactory), that signals something!
> 

Is that something you'll consider looking at, Mathieu?

> There are quite a few industrial Haskell shops that provide products / services where internally they do runtime dynamic loading of user-provided object files, so I'm sure that the core GHC support is there if you actually dig into the APIs! And they do this in a distributed systems context, sans CH.
> 

We have a pull request from Edsko that melds hs-plugins support with static, as per the original proposal's notes, so this seems like a corollary issue to me. 

> 2) I've a work in progress on speccing out a proper (and sound :) ) static values type extension for GHC, that will be usable perhaps in your case (though by dint of being sound, will preclude some of the things you think you want). BUT, any type system changes need to actually provide safety. My motivation for having a notion of static values comes from a desire to add compiler support for certain numerical computing operations that require compiler support to be usable in Haskell. BUT, much of the same work 
> 

Timescales? There are commercial users of Cloud Haskell clamouring for improvements to the way we handle this situation, and I'm keen to combine getting broader community agreement about "the right thing to do" with meeting our users' real needs. If there are other options pertaining to "static" support, I'd like to know more!

> @tim: what on earth does "sending arbitrary code" mean? I feel like the more precise thing everyone here wants is "for a given application / infrastructure deployment, I would like to be able to send my application-specific computations over the network, using Cloud Haskell, and be sure that both sides think it's the same code".
> 

With Cloud Haskell in its current guise, I can "Closure up" pretty much any thunk I like and spawn it on a remote node. If the nodes are both running the same executable, we're fine. If they're not, we're potentially in trouble.

In Erlang, I can rpc/send *any* term and evaluate it on another node. That includes functions of course. Whether or not we want to be quite that general is another matter, but that is the comparison I've been making.

> As for *how* to send an AST fragment, Edward Kmett and others have some pretty nice typed AST models that are easy to adapt and extend for an application-specific use case. Bound http://hackage.haskell.org/package/bound is one nice one.
> 
> Here's a really, really good School of Haskell exposition: https://www.fpcomplete.com/user/edwardk/bound
> 
> And there's a generalization that supports strong typing that I've copied from an hpaste https://gist.github.com/cartazio/5727196, where it's notable that the AST data type is called "Remote" :),
> I think that's a hint it's meant to be a Haskell-manipulable way of constructing a typed DSL you can serialize using a finally tagless style API approach (i.e. have a set of type class instances / operations that you use to run the computation and/or construct the AST you can send over the wire)
> 

These are all lovely, but aren't we talking about either (a) putting together an AST to represent whatever valid Haskell program someone wants to send, or (b) forcing every application developer to write an AST to cover all their remote computations? Both of those sound like a lot more work than the proposal below. They may be the right approach for some domains, but there is a fair bit of "developer overhead" involved from what I can see.
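
To make the overhead in (b) concrete, here is a minimal sketch of what an application-specific AST plus interpreter might look like. Everything in it is hypothetical and invented for illustration: the `Expr` type, `eval`, `encode` and `decode` are not taken from Cloud Haskell or from the bound package, and `show`/`reads` merely stand in for a real wire format.

```haskell
-- Hypothetical AST covering the only computations this application
-- is allowed to ship between nodes: integer arithmetic.
data Expr
  = Lit Int
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Show, Read, Eq)

-- The interpreter that every receiving node has to provide.
eval :: Expr -> Int
eval (Lit n)   = n
eval (Add a b) = eval a + eval b
eval (Mul a b) = eval a * eval b

-- Stand-in for serialisation: the derived Show/Read instances
-- play the role of the wire format in this sketch.
encode :: Expr -> String
encode = show

-- Decoding fails (Nothing) unless the whole input parses as an Expr.
decode :: String -> Maybe Expr
decode s = case reads s of
  [(e, "")] -> Just e
  _         -> Nothing
```

The sender builds an `Expr`, ships `encode expr`, and the receiver runs `eval <$> decode payload`. Every new kind of remote computation means another constructor and another `eval` case, which is the overhead I'm referring to.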

> On Fri, Jan 24, 2014 at 3:19 PM, Mathieu Boespflug <0xbadcode at gmail.com> wrote:
> The `static e` form could as well be a piece of Template Haskell, but
> making it a proper extension means that the compiler can enforce more
> invariants and be a bit more helpful to the user. In particular,
> detecting situations where symbolic references cannot be generated
> because e.g. the imported packages were not compiled as dynamic linked
> libraries. Or seamlessly supporting calling `static f` on an identifier
> `f` that is not exported by the module.
> 

All of which sound like a usability improvement to me.

> I very much subscribe to the idea of defining small DSL's for
> exchanging code between nodes. And this proposal is compatible with
> that idea.
> 
> One thing that might not have been so clear in the original email is
> that we are proposing here to introduce just *one such DSL*. It's just
> that it's a trivial one whose grammar only contains linker symbol
> names.
> 

That triviality is a rather important point as well, because...

> As it happens, distributed-static today already supports two such
> DSL's: a DSL of labels, which are arbitrary string names for
> functions, and a small language for composing Static values together.

And whilst those two DSLs are rather simple, it can still be tricky to get things right.

> As Facundo explains at the end of his email, the notion of a "static"
> value ought to be a more general one than was first envisioned in the
> paper: a static value is any closed denotation, denoted in any of a
> choice of multiple small languages, some of which ship standard with
> distributed-static. The user can define his own DSL for shipping code
> around.

Indeed - there's never been anything preventing users from doing this. Sending messages that are "interpreted" by a remote process in order to apply some specific processing is pretty much the MO of all Cloud Haskell code. The "plugins" based support will add to the options there.

> > 2) how does it provide more type safety than the current TH based approach?
> > (I've seen Tim and others hit very very gnarly bugs in cloud haskell based
> > upon the "magic static values" approach).
> 
> The type safety of the current TH approach is reasonable I think. One
> potential problem comes from managing dynamically typed values in the
> remote table, which must be coerced to the right type and use the
> right decoders if you don't use TH. With the approach we propose,
> there is no remote table, so I guess this should help eliminate a
> source of bugs.

And remove a slightly awkward programming model. 
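
For anyone who hasn't run into this, here is a rough sketch of the failure mode being described. It is a simplified model, not the distributed-static API: `LabelTable`, `labelTable` and `resolve` are hypothetical names, and the table is just an association list of labels to `Dynamic` values from base.

```haskell
import Data.Dynamic (Dynamic, Typeable, fromDynamic, toDyn)

-- Hypothetical stand-in for a remote table: string labels mapped
-- to dynamically typed values.
type LabelTable = [(String, Dynamic)]

labelTable :: LabelTable
labelTable =
  [ ("double", toDyn ((* 2) :: Int -> Int))
  , ("greet",  toDyn (("hello " ++) :: String -> String))
  ]

-- Resolve a label at the type the caller expects. Both ways of
-- getting it wrong (unknown label, wrong type) are only detected
-- at runtime.
resolve :: Typeable a => LabelTable -> String -> Either String a
resolve table label =
  case lookup label table of
    Nothing  -> Left ("unknown label: " ++ label)
    Just dyn ->
      case fromDynamic dyn of
        Nothing -> Left ("type mismatch for label: " ++ label)
        Just v  -> Right v
```

Asking for `"double"` at type `Int -> Int` succeeds, but asking for it at `String -> String`, or mistyping the label, compiles happily and only fails on the receiving node. With a compiler-checked `static` form there would be no hand-maintained table to get out of sync.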

> 
> > To repeat: have you considered defining an AST type + interpreter for the
> > computations you want to send around, and doing that? I think it's a much
> > simpler, safer, easier, flexible and PORTABLE approach, though one current
> > CH doesn't do (though the folks working on CH seem to be receptive to
> > switching to such a strategy if someone validates it)
> 
> We have, and it's an option with different tradeoffs. Both solutions
> could gainfully live side by side and are in fact complementary. I
> contend that the solution described by Facundo has the advantage of
> eliminating much of the syntactic overhead associated with sending
> references to (higher-order) values across the cluster. We have more
> ideas specific to distributed-process which we can discuss in a
> separate thread to reduce the syntactic overhead even further, to
> practically nothing.
> 

I agree that the proposal sounds beneficial. It's a good thing that both approaches can live side by side. 

I'd like to hear more about these other ideas too. I'd also like to hear more from the rest of the community - especially Cloud Haskell users. I know a few others besides Parallel Scientific are using Cloud Haskell in commercial applications - I'd very much like to hear from you all on this proposal too.

Cheers,
Tim
