[Haskell-cafe] Re: Emulating bash pipe/ process lib

Wed Feb 22 10:14:55 EST 2006

Hello Simon,

Tuesday, February 21, 2006, 4:05:57 PM, you wrote:

>> i'm not very interested to do something fascinating in this area. it
>> seems that it is enough to do
>> 
>> 1) non-blocking read of the entire buffer on input
>> 2) flush buffer at each '\n' at output
>> 
>> that should be enough to implement LineBuffering for everyone except
>> purists? and for the NoBuffering the same except for flushing after
>> each output operation?

SM> Yes, exactly.  This is almost what GHC's System.IO currently does, 
SM> except that for NoBuffering we have a fixed buffer size of 1 byte.  It
SM> would be safe to have a larger buffer size for NoBuffering read handles,
SM> but I didn't recognise that when I wrote it.
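
A minimal sketch of the output rule in (2) above, written as a pure function so it can be checked in isolation. It is not GHC's implementation; the name splitForFlush is made up for illustration. Given the bytes sitting in an output buffer, it returns the part to flush now (everything through the last '\n') and the part to keep.

```haskell
-- Hypothetical helper, not part of System.IO.
-- LineBuffering rule for output: flush through the last newline,
-- keep any trailing partial line in the buffer.
splitForFlush :: String -> (String, String)
splitForFlush buf =
  case break (== '\n') (reverse buf) of
    (_, [])             -> ("", buf)  -- no newline yet: flush nothing
    (revKeep, revFlush) -> (reverse revFlush, reverse revKeep)
```

For the NoBuffering variant described above, the same shape would simply flush the whole buffer after every output operation.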

btw, this makes NoBuffering mode unusable for some tasks. Donn Cave
wrote about this here:

...
> when i thought about how to implement LineBuffering, i decided that it is
> the only possible way - read a byte at a time and look for a '\n'. i don't
> know how System.IO is implemented but i think that it should do the same

Don't know - I see that Simon M followed up with an explanation that
I think confirms my impression that LineBuffering is indistinguishable
from BlockBuffering, for input.  I assume it's only there for the sake
of output, where it does make a difference.

Only NoBuffering is interoperable with select.

> DC> Since POSIX read(2) already supports exactly the functions you need for
> DC> unbuffered I/O, it's simpler, easier and more efficient to leave the whole
> DC> business right there at the file descriptor level.
> 
> can you please describe what read(2) does that is better than reading
> char-at-a-time?

It returns whatever data is available, as long as it's more than 0 bytes
and no more than the caller-supplied limit.  This is the only read operation
that works with select (including char-at-a-time, as a special case
where the caller-supplied limit is 1.)
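
To make that concrete, here is a toy model of those semantics in pure Haskell. It is only a model, not a binding to read(2): the name readSome is invented, the String argument stands in for bytes already pending on the descriptor, and EOF and error cases are left out.

```haskell
-- Hypothetical pure model of read(2) when called after select.
-- 'pending' is the data already available on the descriptor.
-- Nothing means no data is pending (a real blocking read would wait).
-- Otherwise we get (bytes returned to the caller, bytes still pending).
readSome :: Int -> String -> Maybe (String, String)
readSome limit pending
  | limit <= 0 || null pending = Nothing
  | otherwise                  = Just (splitAt limit pending)
```

With "hello" pending, readSome 1024 hands back all five bytes at once, while readSome 1 hands back just "h", which is the char-at-a-time special case.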

> DC> I'm sure you can make
> DC> a non-buffering buffer layer work on top of the file descriptor, but what
> DC> makes it worth the trouble?
> 
> if you don't have an I/O library that implements what you need, it is
> indeed simpler to use lower-level I/O directly. if you have an I/O library
> that does what you need, it is easier to write:
> 
> (hIn, hOut) <- createUnixPipe
> vPutStrLn hOut "hello"
> s <- vGetLine hIn
> 
> i'm writing such a lib now, so i'm interested to know what i need to
> do so that it will work ok.

It won't!  I mean, we can use it the same way as the ordinary Handle in
the original example, but we know in principle, if you call vGetLine,
it may block regardless of whether select reports input data, because
select can't tell you whether there's a full line of input.
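
The mismatch can be sketched with two small pure functions. Both names are invented for illustration, the String again stands in for data pending on the descriptor, and EOF is ignored. readable models the only question select answers; takeLine models what a getLine-style call needs before it can return.

```haskell
-- Hypothetical model: select reports "readable" once any byte is pending.
readable :: String -> Bool
readable = not . null

-- Hypothetical model: a line read can complete only if a '\n' is pending.
-- Nothing means the call would block waiting for the rest of the line.
takeLine :: String -> Maybe (String, String)
takeLine pending =
  case break (== '\n') pending of
    (line, _ : rest) -> Just (line, rest)
    _                -> Nothing
```

With "hel" pending, readable is True but takeLine gives Nothing, which is the case where select says go ahead and the line read blocks anyway.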

So you don't have anything to worry about here - this is not your problem.
I only wanted to point out that for select-based I/O event multiplexing,
we will continue to need file descriptors and system level POSIX I/O,
and that the need for this can occur in such ordinary, mundane applications
as reading stdout and stderr in parallel.

        Donn Cave, donn at drizzle.com

-- 
Best regards,
 Bulat                            mailto:Bulat.Ziganshin at gmail.com