[Haskell] select(2) or poll(2)-like function?

Jeremy Gibbons jeremy.gibbons at comlab.ox.ac.uk
Mon Apr 18 14:04:51 CEST 2011


Please can this discussion be moved to haskell-cafe?

   http://www.haskell.org/haskellwiki/Mailing_Lists

Ta.
Jeremy

On 18 Apr 2011, at 12:55, Mike Meyer wrote:

> On Mon, 18 Apr 2011 12:56:39 +0200
> Ertugrul Soeylemez <es at ertes.de> wrote:
>> Mike Meyer <mwm at mired.org> wrote:
>>> On Mon, 18 Apr 2011 11:07:58 +0200
>>> Johan Tibell <johan.tibell at gmail.com> wrote:
>>>> On Mon, Apr 18, 2011 at 9:13 AM, Mike Meyer <mwm at mired.org> wrote:
>>>>> I always looked at it the other way 'round: threading is a hack to
>>>>> deal with system inadequacies like poor shared memory performance
>>>>> or an inability to get events from critical file types.
>>>>>
>>>>> Real processes and event-driven programming provide more robust,
>>>>> understandable and scalable solutions.
>>>>> <end rant>
>>>>
>>>> We need to keep two things separate: threads as a way to achieve
>>>> concurrency and as a way to achieve parallelism [1].
>>>
>>> Absolutely. Especially because you shouldn't have to deal with
>>> concurrency if all you want is parallelism. Your reference [1] covers
>>> why this is the case quite nicely (and is essentially the argument
>>> for "understandable" in my claim above).
>>
>> You also don't need Emacs/Vim, if all you want is to write a simple
>> plain text file.  There is nothing wrong with concurrency; you are
>> confusing the high level model with the low level implementation.
>> Concurrency is nothing but a design pattern, and GHC shows that a
>> high level design pattern can be mapped to efficient low level code.
>
> Possibly true. The question is - can it be mapped to a design that's
> as robust and scalable as the ones I'm used to working on?
>
>> In Haskell you should not use explicit, manual OS threading/forking
>> for the same reason you shouldn't write machine code manually.
>
> That's a good thing - providing it doesn't compromise robustness and
> scalability.
>
>>>> It's useful to use non-determinism (i.e. concurrency) to model a
>>>> server processing multiple requests. Since requests are independent
>>>> and shouldn't impact each other we'd like to model them as
>>>> such. This implies some level of concurrency (whether using threads
>>>> or processes).
>>>
>>> But because the requests are independent, you don't need concurrency
>>> in this case - parallelism is sufficient.
>> Perhaps Haskell is the wrong language for you.  How about programming
>> in C/C++?  I think you want more control over low level resources than
>> Haskell gives you.  But I suggest having a closer look at concurrency.
>
> Personally, I don't want to have to worry about low-level resources,
> or even concurrency. Having to do so feels too much like having to
> explicitly allocate and free memory, or worry about register
> allocations. But if I have to do those things to get robustness and
> scalability until the languages start being able to deal with it, then
> I need the RTS to get out of the way and let me do my job.
>
> If I'm using a value that needs protection from concurrent access
> without providing that protection, I want the system to give me an
> error. At run-time is acceptable, but compile time is better. I want
> the system to make sure the concurrent protection mechanisms work
> properly - no deadlocks, no stuck process, etc - without my having to
> do anything but indicate which values need such protection.
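[Editor's note: GHC's standard libraries already deliver part of this wish. A value wrapped in an MVar cannot be touched except through takeMVar/putMVar/modifyMVar, so *unprotected* access is a type error at compile time; what is not checked is liveness (deadlocks can still happen). A minimal sketch, with the counter and the thread/iteration counts invented for illustration:]

```haskell
import Control.Concurrent
import Control.Monad (forM_, replicateM, replicateM_)

main :: IO ()
main = do
  -- The shared counter lives inside an MVar: the type system makes it
  -- impossible to read or write the Int without going through the MVar
  -- API, which is the "error if unprotected" guarantee, at compile time.
  counter <- newMVar (0 :: Int)
  dones   <- replicateM 2 newEmptyMVar
  forM_ dones $ \done -> forkIO $ do
    replicateM_ 1000 $ modifyMVar_ counter (return . (+ 1))
    putMVar done ()           -- signal completion
  mapM_ takeMVar dones        -- wait for both workers
  readMVar counter >>= print  -- always 2000: no lost updates
</n>
```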
>
>>> The unix process model works quite well. Compared to a threaded
>>> model, this is more robust (if a process breaks, you can kill and
>>> restart it without affecting other processes, whereas if a thread
>>> breaks, restarting the process and all the threads in it is the only
>>> safe option) and scalable (you're already doing ipc, so moving
>>> processes onto more systems is easy, and trivial if you design for
>>> it). The events handled by a single process are simple enough that
>>> your callback/event spaghetti can line up in nice, straight strands.
>> When writing concurrent code you don't care about how the RTS maps it
>> to processes and threads.  GHC chose threads, probably because they
>> are faster to create/kill and consume less memory.  But this is an
>> implementation detail the Haskell developer should not have to worry
>> about.
>
> So - what happens when a thread fails for some reason? I'm used to
> dealing with systems that run 7x24 for weeks or even months on
> end. Hardware hiccups, network failures, bogus input, hung clients,
> etc. are all just facts of life. I need the system to keep running
> properly in the face of all those, and I need them to disrupt the
> world as little as possible.
>
> Given that the RTS has taken control over this stuff, I sort of expect
> it to take care of noticing a dead process and restarting it as
> well. All of which is fine by me.
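[Editor's note: the RTS does not restart dead threads for you, but the restart policy is a few lines of ordinary code. A sketch of a supervisor using only Control.Concurrent and Control.Exception; the `supervise` helper, its naive always-restart policy, and the crash-twice-then-succeed worker are invented for illustration (real supervisors, Erlang/OTP style, add back-off and failure limits):]

```haskell
import Control.Concurrent
import Control.Exception

-- A tiny supervisor: run a worker in its own thread and restart it
-- whenever it dies with an exception.
supervise :: IO () -> IO ThreadId
supervise worker = forkIO loop
  where
    loop = do
      r <- try worker :: IO (Either SomeException ())
      case r of
        Left _  -> loop       -- worker died: restart it
        Right _ -> return ()  -- worker finished normally

main :: IO ()
main = do
  attempts <- newMVar (0 :: Int)
  done     <- newEmptyMVar
  _ <- supervise $ do
         n <- modifyMVar attempts (\c -> return (c + 1, c + 1))
         if n < 3
           then throwIO (userError "transient failure")  -- first two runs crash
           else putMVar done ()
  takeMVar done
  readMVar attempts >>= print  -- 3: crashed twice, restarted, then succeeded
```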
>
>>>> We don't need to do this. We can keep a concurrent programming
>>>> model and get the execution efficiency of an event driven model.
>>>> This is what GHC's I/O manager achieves. On top of that we also get
>>>> parallelism for free. Another way to look at it is that GHC provides
>>>> the scheduler (using a thread for the event loop and a separate
>>>> worker pool) that you end up writing manually in event driven
>>>> frameworks.
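[Editor's note: from the programmer's side, "concurrent model, event-driven efficiency" is plain forkIO-per-request code. A toy sketch using a Chan as a stand-in for the network, with the doubling "handler" invented for illustration; the point is that the event loop never appears in the source:]

```haskell
import Control.Concurrent
import Control.Monad (forM_, forever, replicateM)

main :: IO ()
main = do
  requests  <- newChan
  responses <- newChan
  -- Dispatcher: block on the request channel and fork a lightweight
  -- thread per request.  Under the hood GHC's I/O manager multiplexes
  -- all this blocking onto one event loop; the code never mentions it.
  _ <- forkIO $ forever $ do
         n <- readChan requests
         _ <- forkIO $ writeChan responses (n * 2)  -- the "handler"
         return ()
  forM_ [1 .. 5 :: Int] (writeChan requests)
  replies <- replicateM 5 (readChan responses)
  print (sum replies)  -- 30, whatever order the replies arrive in
```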
>>>
>>> So my question is - can I still get the robustness/scalability
>>> features I get from the unix process model using haskell? In
>>> particular, it seems like ghc starts threads I don't ask it to, and
>>> using both threads & forks for parallelism causes even more headaches
>>> than concurrency (at least on unix & unix-like systems), so just
>>> replicating the process model won't work well. Do any of the haskell
>>> parallel processing tools work across multiple systems?
>>
>> Effectively no (unless you want to use the terribly outdated GPH
>> project), but that's a shortcoming of the current RTS, not of the
>> design patterns you use in Haskell.  By design Haskell programs are
>> well suited for an auto-distributing RTS.  It's just that no such RTS
>> exists for recent versions of the common compilers.
>
> So is anyone working on such a package for haskell? I know clojure's
> got some people working on making STM work in a distributed
> environment, but that's outside the goals of the core team.
>
>> In other words:  Robustness and scalability should not be your
>> business in Haskell.  You should concentrate on understanding and
>> using the concurrency concept well.  And just to encourage you:  I
>> write production concurrent servers in Haskell, which scale very
>> well, probably better than an equivalent C implementation would.
>> Reason:  A Haskell thread is not mapped to an operating system thread
>> (unless you used forkOS).  When it is advantageous, the RTS can well
>> decide to let another OS thread continue a running Haskell thread.
>> That way the active OS threads are always utilized as efficiently as
>> possible.  It would be a pain to get something like that with
>> explicit threading, and even more so when using processes.
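[Editor's note: the cheapness being described is easy to see directly. A forkIO thread is a small heap object scheduled by the RTS, not an OS thread (forkOS is only needed for bound threads, e.g. for FFI libraries with thread-local state), so spawning tens of thousands is routine. The count of 50,000 below is arbitrary:]

```haskell
import Control.Concurrent
import Control.Monad (forM_)

main :: IO ()
main = do
  let n = 50000 :: Int
  done     <- newEmptyMVar
  finished <- newMVar (0 :: Int)
  -- Each forkIO creates a green thread; the RTS multiplexes all of
  -- them over however many OS threads it is running.
  forM_ [1 .. n] $ \_ -> forkIO $ do
    m <- modifyMVar finished (\c -> let c' = c + 1 in return (c', c'))
    if m == n then putMVar done () else return ()
  takeMVar done  -- wait until every thread has run
  putStrLn ("ran " ++ show n ++ " lightweight threads")
```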
>
> Well, *someone* has to worry about robustness and scalability. Users
> notice when their two minute system builds start taking four minutes
> (and will be at my door wanting me to fix it) because something didn't
> scale fast enough, or have to be run more than once because a failing
> component build wasn't restarted properly. I'm willing to believe that
> haskell lets you write more scalable code than C, but C's tools for
> handling concurrency suck, so that should be true in any language
> where someone actually thought about dealing with concurrency beyond
> locks and protected methods. The problem is, the only language I've
> found where that's true that *also* has reasonable tools to deal with
> scaling beyond a single system is Eiffel (which apparently abstracts
> things even further than haskell - details like how concurrency is
> achieved or how many concurrent operations you can have are configured
> when you start an application, *not* when writing it). Unfortunately,
> Eiffel has other problems that make it undesirable.
>
>> That's why the RTS only lets you choose the number of OS threads,
>> instead of giving you low level control over them.  It spawns as many
>> threads as you ask it to spawn and manages them with its own strategy.
>> The only way to manipulate this strategy is by deciding whether a
>> particular Haskell thread is bound (forkOS) or not (forkIO).
>
> Does the programmer have to worry about such trivia as the number of
> threads to use?
>
>     <mike
> -- 
> Mike Meyer <mwm at mired.org>		http://www.mired.org/consulting.html
> Independent Software developer/SCM consultant, email for more information.
>
> O< ascii ribbon campaign - stop html mail - www.asciiribbon.org
>
> _______________________________________________
> Haskell mailing list
> Haskell at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell

Jeremy.Gibbons at comlab.ox.ac.uk
   Oxford University Computing Laboratory,    TEL: +44 1865 283508
   Wolfson Building, Parks Road,              FAX: +44 1865 283531
   Oxford OX1 3QD, UK.
   URL: http://www.comlab.ox.ac.uk/oucl/people/jeremy.gibbons.html



