[Haskell-cafe] Distributed haskell using Hadoop

Tue Oct 16 12:38:08 EDT 2007

On 10/16/07, Brad Clow <brad at bjclow.org> wrote:
> I would prefer a more Haskell orientated solution and welcome any
> suggestions. If not maybe this will be of use to others.

Well, Hadoop is aiming towards a Google style of cluster processing
and the path towards that is pretty clear:

1) An XDR-like serialisation scheme with support for backwards
compatibility (which involves unique-for-all-time ids in the IDL and
"required", "optional", etc. tags on fields). Data.Binary would be a
great start for this, but it's sadly lazy in parsing and they never
applied my patch for optional strictness, so one would probably have
to start from scratch.

2) An RPC system which handles the most common use case: arguments and
replies are serialised using the above system, TCP transport, simple
timeouts, STM for concurrency.
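
To make the tagging idea in 1) concrete, here is a minimal sketch, not
anyone's actual wire format. Everything in it is an assumption made up
for illustration: the one-byte id plus one-byte length layout, the
names encodeField, decodeFields, requiredField and optionalField, and
the Person message with field ids 1 and 2. It only tries to show the
two properties that matter: unknown ids are skipped, so old readers
survive new writers, and the decoder reports failure up front rather
than lazily.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- A field id is assigned once in the (hypothetical) IDL and never reused.
type FieldId = Word8
type Fields = [(FieldId, [Word8])]

-- Wire layout assumed here: id byte, length byte, payload.
-- Payloads must be shorter than 256 bytes in this sketch.
encodeField :: FieldId -> [Word8] -> [Word8]
encodeField fid payload = fid : fromIntegral (length payload) : payload

-- Split the input into fields. The Either result is only produced once
-- the whole input has been walked, so truncation is reported eagerly.
decodeFields :: [Word8] -> Either String Fields
decodeFields [] = Right []
decodeFields [_] = Left "truncated field header"
decodeFields (fid : len : rest)
  | length payload < n = Left "truncated field payload"
  | otherwise = ((fid, payload) :) <$> decodeFields more
  where
    n = fromIntegral len
    (payload, more) = splitAt n rest

-- "required": a missing field is a decode error.
requiredField :: FieldId -> Fields -> Either String [Word8]
requiredField fid fs =
  maybe (Left ("missing required field " ++ show fid)) Right (lookup fid fs)

-- "optional": a missing field is simply Nothing.
optionalField :: FieldId -> Fields -> Maybe [Word8]
optionalField = lookup

-- ASCII-only string helpers, enough for the example.
toBytes :: String -> [Word8]
toBytes = map (fromIntegral . ord)

fromBytes :: [Word8] -> String
fromBytes = map (chr . fromIntegral)

-- Example message: name is required (id 1), email is optional (id 2).
data Person = Person { name :: String, email :: Maybe String }
  deriving (Eq, Show)

encodePerson :: Person -> [Word8]
encodePerson p =
  encodeField 1 (toBytes (name p))
    ++ maybe [] (encodeField 2 . toBytes) (email p)

-- Fields with ids this reader does not know about are ignored.
decodePerson :: [Word8] -> Either String Person
decodePerson bytes = do
  fs <- decodeFields bytes
  n <- requiredField 1 fs
  pure (Person (fromBytes n) (fromBytes <$> optionalField 2 fs))
```

Loaded into GHCi, decodePerson (encodePerson p) gives back Right p,
and appending a field with an id the reader has never heard of does
not change the result. A real scheme would want a variable-length
integer for the length and a proper byte buffer instead of lists.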

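The call-with-timeout part of 2) can be sketched with STM along these
lines. This is only an illustration under loud assumptions: an
in-process TQueue stands in for the TCP connection, and the names
Call, Channel, startServer, callWithTimeout and slowDouble are all
invented here. The STM and System.Timeout calls are the ordinary ones
from the stm and base packages.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.STM
  ( TMVar, TQueue, atomically, newEmptyTMVarIO, newTQueueIO
  , putTMVar, readTQueue, takeTMVar, writeTQueue )
import Control.Monad (forever)
import System.Timeout (timeout)

-- A request together with the slot its reply should be written to.
data Call req rep = Call req (TMVar rep)

-- Hypothetical stand-in for a TCP connection: a queue of pending calls.
type Channel req rep = TQueue (Call req rep)

-- Fork a server thread that answers calls one at a time.
startServer :: (req -> IO rep) -> IO (Channel req rep)
startServer handler = do
  chan <- newTQueueIO
  _ <- forkIO . forever $ do
    Call req slot <- atomically (readTQueue chan)
    rep <- handler req
    atomically (putTMVar slot rep)
  pure chan

-- Send a request and wait at most the given number of microseconds.
-- Nothing means the reply did not arrive in time.
callWithTimeout :: Int -> Channel req rep -> req -> IO (Maybe rep)
callWithTimeout micros chan req = do
  slot <- newEmptyTMVarIO
  atomically (writeTQueue chan (Call req slot))
  timeout micros (atomically (takeTMVar slot))

-- Example handler (hypothetical): doubles its argument, but is
-- deliberately slow for 0 so the timeout path can be exercised.
slowDouble :: Int -> IO Int
slowDouble n = do
  threadDelay (if n == 0 then 500000 else 0)
  pure (n * 2)
```

With chan <- startServer slowDouble, a call for 3 with a one second
timeout comes back as Just 6, while a call for 0 with a 50ms timeout
comes back as Nothing. Over a real socket the requests and replies
would be serialised with the scheme from 1), and each call would need
an id so that late replies can be matched up or dropped.
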
Then you can start doing cool stuff like using the GHC API for code
motion and building a simple MapReduce-like framework, etc.
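
For a feel of what the MapReduce-like part looks like before any
distribution is added, here is a single-machine sketch. It leaves out
the interesting bits (shipping code around with the GHC API, running
mappers on other nodes) and the names mapReduce and wordCount are
just made up for the example. It assumes the containers package that
ships with GHC.

```haskell
import qualified Data.Map.Strict as Map

-- Run the mapper over every input, group the emitted values by key
-- (keeping emission order within a key), then reduce each group.
mapReduce
  :: Ord k
  => (a -> [(k, v)])   -- mapper: one input to many key/value pairs
  -> (k -> [v] -> r)   -- reducer: one key and all of its values
  -> [a]
  -> Map.Map k r
mapReduce mapper reducer inputs = Map.mapWithKey reducer grouped
  where
    -- fromListWith passes (new, old); flip (++) appends new after old.
    grouped =
      Map.fromListWith (flip (++))
        [ (k, [v]) | x <- inputs, (k, v) <- mapper x ]

-- The usual example: count word occurrences across lines of text.
wordCount :: [String] -> Map.Map String Int
wordCount =
  mapReduce (\line -> [ (w, 1) | w <- words line ]) (\_ ones -> sum ones)
```

In a distributed version the mapper calls would go out over the RPC
layer and the grouping step would become the shuffle between nodes,
but the types would stay much the same.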

-- 
Adam Langley                                      agl at imperialviolet.org
http://www.imperialviolet.org                       650-283-9641