[Haskell] select(2) or poll(2)-like function?

Mon Apr 18 17:46:33 CEST 2011

Mike Meyer <mwm at mired.org> wrote:

> > 	To add a bit more. The most common use of select/epoll is to
> > simulate concurrency, because the natural ways of doing it
> > (fork, pthread_create, etc.) are too expensive. I don't know of any
> > other reason why select/epoll exists.
>
> You know, I've *never* written a select/kqueue loop because
> fork/pthread/etc. were too expensive (and I can remember when fork was
> cheap). I always use it because the languages I'm working in have
> sucky tools for dealing with concurrency, so it's easier just to avoid
> the problems by not writing concurrent code.

Have you ever written concurrent code in Haskell?  Because ...

> > If fork was trivial in terms of overhead, then one would rather
> > write a webserver as
> >
> > forever do
> > 	accept the next connection
> > 	handle the request in the new child thread/process
>
> Only if you also made the TCP/IP connection overhead trivial so you
> could stop with HTTP/1.0 and not deal with HTTP/1.1. Failing that, the
> most natural way to do this is:
>
> forever do
> 	accept the next connection
> 	handle requests from connection in new child
> 	       wait for next events
> 	       if one is a client request, start the response
> 	       if one is a finished response, return it to the client
> 	       if one is something else, something broke, deal with it.
>
> I.e., an event-driven loop for each incoming connection, running in its
> own process.

... it seems that you have a completely wrong impression of cheap
concurrency.  You still seem to equate Haskell threads with operating
system threads and their resource costs.  It's called "cheap
concurrency" for a good reason.  There is nothing wrong with creating
tens or even hundreds of threads per client, even when you have hundreds
of clients at the same time.
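
To give a feeling for how little a thread costs, here is a minimal
sketch that uses only base.  It starts one forkIO thread per simulated
client and collects the results through MVars.  The names handleClient
and runClients are made up for this illustration, and the "work" is a
placeholder, so treat it as a sketch rather than a benchmark.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- | Hypothetical stand-in for real per-client work.
handleClient :: Int -> IO Int
handleClient clientId = pure (clientId * 2)

-- | Start one lightweight thread per client and wait for all of them.
-- The results come back in client order.
runClients :: Int -> IO [Int]
runClients n = do
  -- One result slot per client; each thread fills only its own slot.
  slots <- forM [1 .. n] $ \clientId -> do
    slot <- newEmptyMVar
    _ <- forkIO (handleClient clientId >>= putMVar slot)
    pure slot
  -- takeMVar blocks the caller until the matching worker has finished.
  mapM takeMVar slots
```

Loaded into GHCi, something like runClients 10000 should be unremarkable
on an ordinary machine.  Note that the sketch does no error handling: if
a worker died with an exception, the caller would wait on its slot
forever.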

Please don't think of Haskell threads as some concrete memory/execution
object, because they really are not.  They are closer to a design
pattern:  the resulting code will be the ordinary epoll-driven code,
just like you would write it by hand without concurrency, and likely
even better, as I noted in the other reply.

> > This is because it is a natural ``lift'' of the client handling code
> > to many clients (While coding the handling code one need not worry
> > about the other threads).
>
> Still true - at least if you don't try and create a thread for each
> request on a connection. If you do that, then the threads on a
> connection have to worry about each other. Which is why the event loop
> for the second stage is more natural than creating more threads.

In Haskell there are many communication constructs that are very easy
to use.  You will like MVars and STM for this.  Done properly,
concurrency with communication constructs is easier to use than
event-driven client management.
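
As a rough sketch of what this can look like for the "one thread per
request on a connection" case:  every request gets its own thread, and
the only thing the threads share is a channel for finished responses.
The names respond and serveConnection are hypothetical, and requests are
reduced to plain Ints to keep the example within base.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Monad (forM_, replicateM)

-- | Hypothetical request handler: pairs the request id with a reply.
respond :: Int -> (Int, String)
respond reqId = (reqId, "reply to request " ++ show reqId)

-- | Handle every request of one connection in its own thread.  The
-- workers never touch each other; each one only posts its response to
-- the shared channel.
serveConnection :: [Int] -> IO [(Int, String)]
serveConnection requests = do
  responses <- newChan
  forM_ requests $ \reqId ->
    forkIO (writeChan responses (respond reqId))
  -- Collect exactly one response per request, in arrival order.
  replicateM (length requests) (readChan responses)
```

The connection thread reads the responses in whatever order they
arrive, so the request id is carried along with each reply.  A real
server would write them back to the socket instead of returning a list.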

> > GHC's runtime with forkIO makes this natural server code
> > efficient. It might use epoll/kqueue/black magic/sale of souls to
> > the devil I don't care.
>
> But doesn't remove the need for some kind of event handling tool in
> each thread.

If you want to call STM event handling, you can, but I think that this
interpretation doesn't really fit.  Event handling involves
polling/waiting.  STM does not.

For example, one perfectly fine application is this:  There is a
variable holding a list of all clients, and one thread scans this list
in an infinite loop, looking for clients that have a certain bit set.
For each client found, the thread reacts to it and resets the bit.
There is no explicit waiting in the code.  The thread really is written
as an infinite loop.  Now one might think that this wastes CPU cycles.
But it does not, because it's STM:  a transaction that finds nothing to
do can retry, and the runtime suspends the thread until one of the
variables it read has changed.
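
A minimal sketch of such a scanner, assuming the stm package that ships
with GHC.  The Client record, nextFlagged, scanner and demo are names
invented for this illustration, and the "bit" is modelled as a
TVar Bool per client.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, readTVar, retry, writeTVar)
import Control.Monad (forever)

-- | Hypothetical client record: an id plus the flag the scanner checks.
data Client = Client
  { clientId :: Int
  , flag     :: TVar Bool
  }

-- | Find the first client whose flag is set, reset the flag and return
-- the client's id.  If no flag is set, 'retry' abandons the transaction
-- and blocks the thread until a TVar read here is written elsewhere.
nextFlagged :: TVar [Client] -> STM Int
nextFlagged clientsVar = readTVar clientsVar >>= go
  where
    go [] = retry
    go (c : cs) = do
      isSet <- readTVar (flag c)
      if isSet
        then writeTVar (flag c) False >> pure (clientId c)
        else go cs

-- | The scanner itself: an infinite loop with no explicit waiting.
scanner :: TVar [Client] -> (Int -> IO ()) -> IO ()
scanner clientsVar react =
  forever (atomically (nextFlagged clientsVar) >>= react)

-- | Small usage example: three clients, flag the second one, and
-- return the id the scanner reacted to.
demo :: IO Int
demo = do
  clients <- mapM (\i -> Client i <$> newTVarIO False) [1 .. 3]
  clientsVar <- newTVarIO clients
  seen <- newEmptyMVar
  _ <- forkIO (scanner clientsVar (putMVar seen))
  -- Setting a flag wakes the scanner if it is blocked in 'retry'.
  atomically (writeTVar (flag (clients !! 1)) True)
  takeMVar seen
```

Running demo from GHCi should yield 2.  While no flag is set, the
scanner thread sits in retry and uses no CPU, even though the code reads
like a busy loop.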

Greets,
Ertugrul

-- 
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://ertes.de/