[Haskell] select(2) or poll(2)-like function?
Simon Marlow
marlowsd at gmail.com
Mon Apr 18 14:10:29 CEST 2011
On 18/04/2011 12:55, Mike Meyer wrote:
> On Mon, 18 Apr 2011 12:56:39 +0200
> Ertugrul Soeylemez <es at ertes.de> wrote:
>> Mike Meyer <mwm at mired.org> wrote:
>>> The unix process model works quite well. Compared to a threaded model,
>>> this is more robust (if a process breaks, you can kill and restart it
>>> without affecting other processes, whereas if a thread breaks,
>>> restarting the process and all the threads in it is the only safe
>>> option) and scalable (you're already doing ipc, so moving processes
>>> onto more systems is easy, and trivial if you design for it). The
>>> events handled by a single process are simple enough that your
>>> callback/event spaghetti can line up in nice, straight strands.
>> When writing concurrent code you don't care about how the RTS maps it to
>> processes and threads. GHC chose threads, probably because they are
>> faster to create/kill and consume less memory. But this is an
>> implementation detail the Haskell developer should not have to worry
>> about.
>
> So - what happens when a thread fails for some reason? I'm used to
> dealing with systems that run 7x24 for weeks or even months on
> end. Hardware hiccups, network failures, bogus input, hung clients,
> etc. are all just facts of life. I need the system to keep running
> properly in the face of all those, and I need them to disrupt the
> world as little as possible.
>
> Given that the RTS has taken control over this stuff, I sort of expect
> it to take care of noticing a dead process and restarting it as
> well. All of which is fine by me.
The RTS can't manage things at that level, because it doesn't know what
robustness model you want. So failures in the I/O library result in
exceptions, and you get to decide what to do. If a thread dies due to
an exception, then you are responsible for what happens from then on -
typically you would have a top-level exception handler that notifies
some higher-level thread what happened. It's true that Haskell doesn't
give you as much help here as you would get in Erlang/OTP, but it's all
readily programmed up.
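
As a rough illustration of the "top-level exception handler that notifies
some higher-level thread" idea, here is a minimal sketch. The names
forkSupervised, describe and superviseDemo are made up for this example,
and it assumes a base library recent enough to provide forkFinally in
Control.Concurrent. It is one possible arrangement, not a recommended
robustness model:

```haskell
import Control.Concurrent (forkFinally)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (ErrorCall (..), SomeException, throwIO)

-- Hypothetical helper: fork a worker whose top-level handler reports the
-- outcome (a result or the exception that killed it) through an MVar.
forkSupervised :: IO a -> IO (MVar (Either SomeException a))
forkSupervised action = do
  done <- newEmptyMVar
  -- forkFinally runs the handler whether the action returns or throws.
  _ <- forkFinally action (putMVar done)
  return done

-- Hypothetical policy hook: a real supervisor would restart, log or
-- escalate here; this one only renders the outcome as text.
describe :: Either SomeException Int -> String
describe (Right n) = "finished: " ++ show n
describe (Left e)  = "died: " ++ show e

-- The "higher-level thread": start two workers and wait for their reports.
superviseDemo :: IO (String, String)
superviseDemo = do
  ok  <- forkSupervised (return (42 :: Int))
  bad <- forkSupervised (throwIO (ErrorCall "bogus input") :: IO Int)
  r1  <- takeMVar ok
  r2  <- takeMVar bad
  return (describe r1, describe r2)
```

Calling superviseDemo from main returns one report per worker. What to do
with a "died" report (restart, back off, give up) is exactly the part the
RTS leaves to the program.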
Haskell *does* give you some important guarantees though. Threads never
just die without receiving an exception first. If a thread blocks on an
unreachable resource then it gets an exception, so you get some help
dealing with deadlocks.
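
A small sketch of the second guarantee, assuming the unreachable resource
is an MVar that no other thread can ever fill. The name deadlockDemo is
made up for this example. Detection is driven by the garbage collector,
so the exception may arrive some time after the thread actually blocks:

```haskell
import Control.Concurrent.MVar (MVar, newEmptyMVar, takeMVar)
import Control.Exception (BlockedIndefinitelyOnMVar (..), try)

-- Block on an MVar that nothing else references. Once the RTS finds the
-- MVar unreachable, it raises BlockedIndefinitelyOnMVar in this thread
-- instead of leaving it stuck forever.
deadlockDemo :: IO String
deadlockDemo = do
  orphan <- newEmptyMVar :: IO (MVar ())
  r <- try (takeMVar orphan)
  return $ case r of
    Left BlockedIndefinitelyOnMVar -> "deadlock detected"
    Right ()                       -> "got a value"
```

This only covers deadlocks the RTS can prove. A thread waiting on an MVar
that is still reachable from some other live thread will keep waiting.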
>>>> We don't need to do this. We can keep a concurrent programming model
>>>> and get the execution efficiency of an event driven model. This is
>>>> what GHC's I/O manager achieves. On top of that we also get
>>>> parallelism for free. Another way to look at it is that GHC provides
>>>> the scheduler (using a thread for the event loop and a separate
>>>> worker pool) that you end up writing manually in event driven
>>>> frameworks.
>>>
>>> So my question is - can I still get the robustness/scalability
>>> features I get from the unix process model using haskell? In
>>> particular, it seems like ghc starts threads I don't ask it to, and
>>> using both threads & forks for parallelism causes even more headaches
>>> than concurrency (at least on unix & unix-like systems), so just
>>> replicating the process model won't work well. Do any of the haskell
>>> parallel processing tools work across multiple systems?
>>
>> Effectively no (unless you want to use the terribly outdated GPH
>> project), but that's a shortcoming of the current RTS, not of the design
>> patterns you use in Haskell. By design Haskell programs are well suited
>> for an auto-distributing RTS. It's just that no such RTS exists for
>> recent versions of the common compilers.
>
> So is anyone working on such a package for haskell? I know clojure's
> got some people working on making STM work in a distributed
> environment, but that's outside the goals of the core team.
Take a look at "Haskell for the Cloud", Jeff Epstein, Andrew Black and
Simon Peyton Jones:
http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/remote.pdf
>> In other words: Robustness and scalability should not be your business
>> in Haskell. You should concentrate on understanding and using the
>> concurrency concept well. And just to encourage you: I write
>> productive concurrent servers in Haskell, which scale very well and
>> probably better than an equivalent C implementation would. Reason: A
>> Haskell thread is not mapped to an operating system thread (unless you
>> used forkOS). When it is advantageous, the RTS can well decide to let
>> another OS thread continue a running Haskell thread. That way the
>> active OS threads are always utilized as efficiently as possible. It
>> would be a pain to get something like that with explicit threading and
>> even more, when using processes.
>
> Well, *someone* has to worry about robustness and scalability. Users
> notice when their two minute system builds start taking four minutes
> (and will be at my door wanting me to fix it) because something didn't
> scale fast enough, or have to be run more than once because a failing
> component build wasn't restarted properly. I'm willing to believe that
> haskell lets you write more scalable code than C, but C's tools for
> handling concurrency suck, so that should be true in any language
> where someone actually thought about dealing with concurrency beyond
> locks and protected methods. The problem is, the only language I've
> found where that's true that *also* has reasonable tools to deal with
> scaling beyond a single system is Eiffel (which apparently abstracts
> things even further than haskell - details like how concurrency is
> achieved or how many concurrent operations you can have are configured
> when you start an application, *not* when writing it). Unfortunately,
> Eiffel has other problems that make it undesirable.
I'm interested in understanding what problems you're referring to. What
kind of scaling are you interested in - number of clients, number of
cores, or something else? What is it about Haskell threads that you are
worried might not scale?
Cheers,
Simon