Marshalling functions (was: Transmitting Haskell values)
Alastair Reid
alastair at reid-consulting-uk.ltd.uk
Wed Oct 29 12:15:31 EST 2003
> Is [marshaling functions] something absolutely impossible in
> Haskell and by what reason? Just because of strong typing (forgive my
> stupidity ;)? Or are there some deeper theoretical limitations?
The big theoretical issue is whether it would provide an Eq or Show instance
for -> by the backdoor. Careful API design could avoid the worst of this.
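
To make the backdoor concrete, here is a minimal sketch. It assumes a toy model in which a function Int -> Int is represented by a syntax tree; the names Expr, apply, marshal and sameFunction are made up for illustration and are not part of any library. Any pure marshalling function immediately gives an equality test that can tell apart two functions that behave identically:

```haskell
-- Toy model (an assumption for illustration): a "function" Int -> Int is
-- represented by a syntax tree so that it can be written out as text.
data Expr = Arg | Lit Int | Add Expr Expr
  deriving Show

-- Run the represented function on an argument.
apply :: Expr -> Int -> Int
apply Arg       x = x
apply (Lit n)   _ = n
apply (Add a b) x = apply a x + apply b x

-- Hypothetical marshalling function: here it just prints the tree.
marshal :: Expr -> String
marshal = show

-- The backdoor: comparing marshalled forms acts as an Eq on functions.
sameFunction :: Expr -> Expr -> Bool
sameFunction f g = marshal f == marshal g

-- Two representations of the identity function.
identityA, identityB :: Expr
identityA = Arg
identityB = Add Arg (Lit 0)
```

Here identityA and identityB give the same result on every argument, yet sameFunction distinguishes them. This is the kind of observation that a careful API (for example, one that only marshals inside IO) would have to rule out.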
A semi-theoretical issue is to do with sharing of values, libraries and types.
A simple function like '\ x -> x' is fairly easy to write out because it
doesn't refer to any other objects but what about:
    \ x -> head [ y | (x',y) <- table, x == x' ]
This refers to an object 'table' that will have to be
accessible from the receiver. Choices are to access the
object over the network, to copy the object or to move the
object and access it remotely from the sender.
Remote access requires a network connection and suddenly
adding two numbers can raise a TCP/IP exception.
Copying has consequences for mutable datatypes (a useful but
non-standard extension of Haskell), for foreign datatypes
(Ptr, etc.) and for interfaces to foreign functions.
Copying also has consequences for space and time usage: each
copy uses more space and laziness is lost.
If an object is copied:
1) Is it evaluated first?
2) Can it contain mutable objects?
3) Can it contain Foreign pointers?
4) Can it contain functions?
5) Is sharing preserved within the object?
   e.g., does 'let x=1; y=(x,x,x) in (y,y,y)' get sent as 3 objects or
   as 13 objects?
6) Are cyclic objects ok?
   e.g., can 'let x=1:x in x' be sent?
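
The counts in questions 5 and 6 can be checked with a small sketch. It assumes a toy heap model in which each object has an identifier and a list of the objects it points to; Heap, sharedSize and unsharedSize are hypothetical names, not how any Haskell implementation represents its heap.

```haskell
import Data.Maybe (fromMaybe)

-- Toy heap (an assumption for illustration): object id -> ids it points to.
type ObjId = Int
type Heap  = [(ObjId, [ObjId])]

children :: Heap -> ObjId -> [ObjId]
children heap o = fromMaybe [] (lookup o heap)

-- let x=1; y=(x,x,x) in (y,y,y): object 0 is the outer tuple, 1 is y, 2 is x.
exampleHeap :: Heap
exampleHeap = [(0, [1, 1, 1]), (1, [2, 2, 2]), (2, [])]

-- let x=1:x in x: cons cell 0 points to the literal 1 and back to itself.
cyclicHeap :: Heap
cyclicHeap = [(0, [1, 0]), (1, [])]

-- Objects sent if sharing is preserved: each reachable object once.
sharedSize :: Heap -> ObjId -> Int
sharedSize heap root = length (go [] [root])
  where
    go seen []       = seen
    go seen (o : os)
      | o `elem` seen = go seen os
      | otherwise     = go (o : seen) (children heap o ++ os)

-- Objects sent if the value is copied as a tree, ignoring sharing.
-- This does not terminate on a cyclic heap.
unsharedSize :: Heap -> ObjId -> Int
unsharedSize heap o = 1 + sum (map (unsharedSize heap) (children heap o))
```

With this model, sharedSize exampleHeap 0 is 3 and unsharedSize exampleHeap 0 is 13. sharedSize cyclicHeap 0 is 2, whereas unsharedSize on the cyclic heap would loop forever, which is why question 6 depends on the answer to question 5.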
If it weren't for mutable objects, foreign types and foreign
functions, copying objects would be a straightforward but
tedious task. Some of these issues can be avoided if objects
are fully evaluated first, because then the type can be used to
determine whether mutable objects or foreign types are reachable.
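
One way to let the type decide is a type class with instances only for types that are safe to copy. The following is a minimal sketch under that assumption; the class name Sendable and its textual encoding are invented for illustration.

```haskell
-- Hypothetical class of values that may be copied to another machine.
-- Encoding a value forces it completely, so nothing unevaluated is sent.
class Sendable a where
  encode :: a -> String

instance Sendable Int where
  encode = show

instance Sendable Bool where
  encode = show

instance Sendable a => Sendable [a] where
  encode xs = "[" ++ concatMap (\x -> encode x ++ ";") xs ++ "]"

instance (Sendable a, Sendable b) => Sendable (a, b) where
  encode (a, b) = "(" ++ encode a ++ "," ++ encode b ++ ")"

-- Deliberately no instances for IORef a, Ptr a or a -> b: a value whose
-- type mentions any of them is rejected by the type checker.
```

For example, encode [(1 :: Int, True)] type checks, while trying to encode a list of IORefs is a compile-time error. Note that an infinite or cyclic list would still make encode loop, so this addresses questions 2 to 4 but not question 6.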
    \ x -> lookup x table
As well as referring to 'table', this also refers to a standard
library function. We can either copy 'lookup' over or we can
use the function of the same name on the receiver.
Copying the function over can get expensive because lots of
Haskell libraries depend on other Haskell libraries, and it gets
worse if we end up copying the same function over multiple times.
Using the function with the same name runs the risk that the name
is the same but the function itself is different.
Foreign interfaces are a problem here as well. Is the 'fopen' function
on Linux the same as the 'fopen' function in FreeBSD or Windows?
Avoiding the overhead of transmitting functions more than once is
probably easy enough. Create a signature for each function by
hashing all the code in the function and in every function and
object it depends on. Don't transmit functions which the receiver
already has. If storing the function to a file, have the store
operation omit functions listed in a table of signatures. For example,
you might provide signatures for all the standard libraries in ghc 6.02
on Windows 98. (Since most GHC operations are the same on all
platforms and in all recent releases, this ought to match many
functions in ghc 5.04.)
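
The signature scheme can be sketched as follows. This is only an illustration under several assumptions: the program is modelled as a table from names to source text and dependencies, the dependency graph is acyclic (mutually recursive functions would need extra care), and toyHash stands in for a real cryptographic hash. All names here are hypothetical.

```haskell
import Data.Char (ord)
import Data.List (nub)

type Name      = String
type Signature = Integer

-- A definition: its code (as text) and the names it refers to.
data Def = Def { defCode :: String, defDeps :: [Name] }

type Program = [(Name, Def)]

-- Toy stand-in for a cryptographic hash; do not use for real.
toyHash :: String -> Signature
toyHash = foldl step 5381
  where step h c = (h * 33 + toInteger (ord c)) `mod` 4294967296

-- Hash the code together with the signatures of everything it depends on.
-- Assumes the dependency graph has no cycles.
signature :: Program -> Name -> Signature
signature prog name = case lookup name prog of
  Nothing              -> toyHash ("<missing>" ++ name)
  Just (Def code deps) ->
    toyHash (code ++ concatMap (\d -> '|' : show (signature prog d)) deps)

-- Every definition reachable from the given name, including itself.
needed :: Program -> Name -> [Name]
needed prog name = nub (name : concatMap (needed prog) deps)
  where deps = maybe [] defDeps (lookup name prog)

-- Only send what the receiver does not already have.
toTransmit :: Program -> [Signature] -> Name -> [Name]
toTransmit prog known root =
  [ n | n <- needed prog root, signature prog n `notElem` known ]

sampleProgram :: Program
sampleProgram =
  [ ("f",      Def "\\ x -> lookup x table" ["lookup", "table"])
  , ("lookup", Def "<code of lookup>" [])
  , ("table",  Def "[(1,2),(3,4)]" [])
  ]
```

If the receiver's table of signatures already contains the signature of 'lookup', then toTransmit sends only 'f' and 'table'. Because a signature covers all dependencies, a receiver with a different 'lookup' under the same name would produce a different signature and so would not be mistaken for a match.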
    foo :: T -> T
If the receiving program contains a type 'T' and the sending
program contains a type 'T', we might have to check that they
are the same type, so we have to choose an appropriate equality
test for types. Is structural equality enough, is it too strong,
or do we also want to insist that any data hiding property the
original program had is preserved?
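
To illustrate the choice, here is a minimal sketch that assumes a toy description of types (the data type Ty and both equality functions are invented for this example, and recursive types are left out). Structural equality looks through type names; nominal equality insists that the names match too.

```haskell
-- Toy type descriptions (an assumption for illustration).
-- TNamed attaches a declared name to an underlying representation.
data Ty = TInt | TBool | TPair Ty Ty | TList Ty | TNamed String Ty

-- Structural equality: names are ignored, only the shape matters.
structEq :: Ty -> Ty -> Bool
structEq (TNamed _ a) b           = structEq a b
structEq a (TNamed _ b)           = structEq a b
structEq TInt TInt                = True
structEq TBool TBool              = True
structEq (TPair a b) (TPair c d)  = structEq a c && structEq b d
structEq (TList a) (TList b)      = structEq a b
structEq _ _                      = False

-- Nominal equality: named types must agree on the name as well.
nominalEq :: Ty -> Ty -> Bool
nominalEq (TNamed n a) (TNamed m b) = n == m && nominalEq a b
nominalEq TInt TInt                 = True
nominalEq TBool TBool               = True
nominalEq (TPair a b) (TPair c d)   = nominalEq a c && nominalEq b d
nominalEq (TList a) (TList b)       = nominalEq a b
nominalEq _ _                       = False

-- Two abstract types that happen to share a representation.
age, metres :: Ty
age    = TNamed "Age" TInt
metres = TNamed "Metres" TInt
```

Here structEq age metres is True but nominalEq age metres is False. Under structural equality a receiver could treat an Age as a Metres, or as a plain Int, which is exactly the loss of data hiding mentioned above.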
A small corner of this question concerns portability of type
representations between architectures with different word sizes
and different endianness.
Good answers probably exist for the type sharing problem
(it's a well-studied area). The main problem will be picking
among the different choices.
--
Alastair Reid www.haskell-consulting.com