[Haskell-cafe] Distributing Haskell on a cluster
tifonzafel at gmail.com
Mon Mar 16 00:54:54 UTC 2015
I haven't considered that idea, but it seems the natural solution.
On 15 March 2015 at 20:31, Ozgun Ataman <ozgun.ataman at soostone.com> wrote:
> Anecdotal support for this idea: This is exactly how we distribute
> hadron-based Hadoop MapReduce programs to cluster nodes at work. The
> compiled executable essentially ships itself to the nodes and recognizes
> the different environment when executed in that context.
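A rough sketch of that "ships itself" pattern, not hadron's actual mechanism: a single executable inspects its arguments to decide whether it is the submitting driver or a worker on a node. The `--worker` flag, the `Mode` type and `parseMode` are hypothetical names chosen for illustration.

```haskell
module Main (main) where

import System.Environment (getArgs, getExecutablePath)

-- | Role this process plays (hypothetical type, for illustration only).
data Mode
  = Driver            -- started by the user; would submit jobs
  | Worker FilePath   -- started on a node; handles one input file
  deriving (Eq, Show)

-- | Decide the role from the command line. "--worker" is an assumed flag.
parseMode :: [String] -> Mode
parseMode ["--worker", path] = Worker path
parseMode _                  = Driver

main :: IO ()
main = do
  mode <- parseMode <$> getArgs
  case mode of
    Driver -> do
      -- The driver can locate its own binary, so it could copy that file
      -- to the cluster and ask the scheduler to run it with --worker.
      self <- getExecutablePath
      putStrLn ("driver: would ship " ++ self ++ " to the nodes")
    Worker path ->
      -- On the node, the same binary does the actual work.
      putStrLn ("worker: processing " ++ path)
```

The copy and job-submission steps are left as a message here because they depend entirely on the scheduler in use.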
> hadron is a Haskell Hadoop streaming framework that came out of our
> work. It's on GitHub and close to being released on Hackage once the
> current dev branch is finalized/merged. In case it's helpful:
> On Mar 15, 2015, at 8:06 PM, Andrew Cowie <andrew at operationaldynamics.com>
> wrote:
> Bit of a whinger from left-field, but rather than deploying a Main script
> and then using GHCi, have you considered compiling the program and shipping
> that instead?
> Before you veto the idea out of hand, statically compiled binaries are
> good for being almost self-contained, and (depending on what you changed)
> they rsync well. And if that doesn't appeal, then consider instead
> building the Haskell program dynamically; Hello World is only a couple kB;
> a serious program only a hundred or so.
> Anyway, I know you're just looking to send a code fragment closure, but if
> you're dealing with the input and output of the program through a stable
> interface, then the program is the closure.
> Just a thought.
> On Mon, Mar 16, 2015 at 9:53 AM felipe zapata <tifonzafel at gmail.com>
> wrote:
>> Hi all,
>> I have posted the following question on stackoverflow, but so far I have
>> not received an answer.
>> I have a piece of code that processes files:
>> processFiles :: [FilePath] -> (FilePath -> IO ()) -> IO ()
>> This function spawns an async process that executes an IO action. This IO
>> action must be submitted to a cluster through a job scheduling system (e.g
>> Because I must use the job scheduling system, it's not possible to use
>> Cloud Haskell to distribute the closure. Instead the program writes a new
>> *Main.hs* containing the desired computations, which is copied to the
>> cluster node together with all the modules that Main depends on and is then
>> executed remotely with "runhaskell Main.hs [opts]". The async process
>> should then periodically ask the job scheduling system (using
>> *threadDelay* between checks) whether the job is done.
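The polling step described above could look roughly like the following, using only `base`. `pollUntil` is a hypothetical helper, and the scheduler query is replaced by a stand-in counter, since the real check depends on the scheduler's command-line tools.

```haskell
module Main (main) where

import Control.Concurrent (threadDelay)
import Data.IORef (atomicModifyIORef', newIORef)

-- | Run the check repeatedly, sleeping the given number of microseconds
-- between attempts, until it returns True. The result is the number of
-- checks that were made.
pollUntil :: Int -> IO Bool -> IO Int
pollUntil delayMicros isDone = go 1
  where
    go n = do
      done <- isDone
      if done
        then pure n
        else threadDelay delayMicros >> go (n + 1)

main :: IO ()
main = do
  -- Stand-in for a real scheduler query: reports "done" on the third call.
  counter <- newIORef (0 :: Int)
  let fakeJobDone = atomicModifyIORef' counter (\k -> (k + 1, k + 1 >= 3))
  polls <- pollUntil 100000 fakeJobDone
  putStrLn ("job finished after " ++ show polls ++ " polls")
```

In the setting of the question, `pollUntil` would run inside the async process, with `fakeJobDone` swapped for an action that asks the scheduler about the job.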
>> Is there a way to avoid creating a new Main? Can I serialize the IO
>> action and execute it somehow in the node?
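An arbitrary `IO ()` value has no `Show` or `Read` instance, so it cannot be written out as text directly. One common workaround, sketched here under the assumption that the set of jobs is known in advance, is to serialize a description of the work and let a compiled program on the node interpret it. The `Task` type and its constructors are hypothetical.

```haskell
module Main (main) where

import System.Environment (getArgs)
import Text.Read (readMaybe)

-- | Hypothetical description of the work a node should do.
data Task
  = CountLines FilePath
  | CopyFile FilePath FilePath
  deriving (Eq, Show, Read)

-- | Turn a task into a string that fits in a command-line argument.
encodeTask :: Task -> String
encodeTask = show

-- | Parse a task back; Nothing if the string is not a valid task.
decodeTask :: String -> Maybe Task
decodeTask = readMaybe

-- | Interpret the description as a real IO action on the node.
runTask :: Task -> IO ()
runTask (CountLines path) = do
  contents <- readFile path
  print (length (lines contents))
runTask (CopyFile src dst) = readFile src >>= writeFile dst

main :: IO ()
main = do
  args <- getArgs
  case args of
    [encoded] | Just task <- decodeTask encoded -> runTask task
    _ -> putStrLn ("example argument: " ++ show (encodeTask (CountLines "input.txt")))
```

The submitting side would call `encodeTask` and pass the result to the job script, so no new Main.hs needs to be generated per job. The derived `Show`/`Read` format is only one option; a binary or JSON encoding would serve the same role.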
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org