unix-bytestring v0.2.0 (Was: Re: Proposal: add ByteString support to unix:System.Posix.IO API)
wren ng thornton
wren at freegeek.org
Mon Mar 7 23:52:44 CET 2011
On 3/6/11 12:22 PM, Bas van Dijk wrote:
> On 6 March 2011 13:16, wren ng thornton<wren at freegeek.org> wrote:
>> One thing I considered was to offer both versions, one for compatibility
>> with the old string versions (and for decluttering client code) and then
>> one that just gives the bytestring. Of course, then the issue is what to
>> name them...
>> So the big question is: how minimal should we be, vs how much in the way of
>> convenience functions should we offer?
> It would be great if you could analyse the direct reverse dependencies
> of unix to see how much fdRead is used and how much work it is to
> adapt packages to an fdRead which only returns the ByteString:
What an excellent idea!
It looks like unix has about 234 direct revdeps, of which some hasty
grepping detects only 26 files which may call fdRead or fdWrite. Namely:
Of these, many are false positives due to some local function called
fdReady (and ignored hereafter), many are false positives due to local
definitions of fdRead, myfdRead, etc. (discussed later), and only about
half a dozen appear to be direct calls to the unix package version.
Of these half dozen files, most of the use sites completely ignore the
ByteCount, two or three use it only to check (==0) which can be
efficiently detected by either String or ByteString's null predicate,
and only two of them appear to use it in any nontrivial way (e.g.,
returning it from the current function, or printing it).
Thus, I conclude, nobody wants the ByteCount. Dropping it means correcting
literally a couple of pattern matches per affected project, which is far
less work than the switch from String to ByteString itself. And fewer than
half a dozen use sites across all of Hackage might consider calling
BS.length to get at the information.
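To illustrate how little a call site loses, here is a minimal sketch of recovering both uses of the ByteCount from the ByteString alone. The names atEOF and withCount are made up for this example and are not part of unix or any other package.

```haskell
import qualified Data.ByteString as BS
import System.Posix.Types (ByteCount)

-- Hypothetical helper: the old (==0) check on the ByteCount is just a
-- null test on the chunk that was read.
atEOF :: BS.ByteString -> Bool
atEOF = BS.null

-- Hypothetical helper: rebuild the old (data, count) pair for the rare
-- call site that really wants the count. BS.length is O(1), so nothing
-- is rescanned.
withCount :: BS.ByteString -> (BS.ByteString, ByteCount)
withCount bs = (bs, fromIntegral (BS.length bs))
```

A call site that used to match on (str, n) can either drop the n or wrap the new result in withCount and keep its pattern match unchanged.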
Of the local definitions there were two classes I noticed.
* The HFuse project defines their own bindings to the pread(2) and
pwrite(2) functions, thus answering my question about whether anyone'd
want them. They use them for handling ByteString buffers too, no less!
* The liboleg, iteratee, and iteratee-mtl packages each carry their own
copy of fdRead (apparently a cut&paste job). In the
documentation they level numerous complaints against the unix package's
version of fdRead. The two notable ones are
(1) that fdRead allocates a new buffer every time it's called (this
seems to be addressed by fdReadBuf, but the packages haven't been updated
to use it); and
(2) that fdRead throws errors (at all, let alone on EOF). They'd prefer
-> Ptr CChar
-> IO (Either Errno ByteCount)
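For the pread(2) binding mentioned in the first bullet, a ByteString-returning wrapper could look roughly like the sketch below. This is not HFuse's code. The names c_pread, fdPreadBS and demoPread are invented here, and the sketch assumes a platform where the C off_t matches Haskell's COff (a real binding would want hsc2hs or similar to pin that down).

```haskell
import qualified Data.ByteString as BS
import Data.ByteString.Internal (createAndTrim)
import Data.Word (Word8)
import Foreign.C.Error (throwErrnoIfMinus1Retry)
import Foreign.C.Types (CInt(..), CSize(..))
import Foreign.Ptr (Ptr)
import System.IO (IOMode(ReadMode), openFile)
import System.Posix.IO (closeFd, handleToFd)
import System.Posix.Types (ByteCount, COff(..), CSsize(..), Fd(..), FileOffset)

-- Raw binding to pread(2). Assumption: off_t and COff agree on this platform.
foreign import ccall safe "unistd.h pread"
    c_pread :: CInt -> Ptr Word8 -> CSize -> COff -> IO CSsize

-- Hypothetical wrapper: read up to nbytes at the given offset without
-- moving the file position. Throws an IOError on failure, retries on
-- EINTR, and returns a ByteString trimmed to the bytes actually read.
fdPreadBS :: Fd -> ByteCount -> FileOffset -> IO BS.ByteString
fdPreadBS (Fd fd) nbytes offset =
    createAndTrim (fromIntegral nbytes) $ \buf ->
        fromIntegral <$>
            throwErrnoIfMinus1Retry "fdPreadBS" (c_pread fd buf nbytes offset)

-- Usage sketch: write "hello world" to a scratch file, then read the
-- 5 bytes starting at offset 6.
demoPread :: FilePath -> IO BS.ByteString
demoPread path = do
    writeFile path "hello world"
    h  <- openFile path ReadMode
    fd <- handleToFd h   -- closes the Handle and hands over the raw Fd
    bs <- fdPreadBS fd 5 6
    closeFd fd
    return bs
```

Because createAndTrim writes straight into the ByteString's own buffer, there is no intermediate String or extra copy when the read fills the request.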
Considering that fdReadBuf already addresses the first complaint, it
seems that it might be worthwhile to provide a safe version of fdReadBuf
which captures the error in an Either rather than throwing it (should be
cheaper than throwing it and having a wrapper catch it and convert it
into Errno?). Similarly, it may be worthwhile to have a version of
fdRead which doesn't throw an exception on EOF, but just returns the
empty ByteString instead.
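A minimal sketch of both ideas, assuming we bind read(2) directly instead of wrapping the throwing version: the error comes back as Left Errno, and EOF shows up as an empty ByteString. The names c_read, tryReadBuf, tryReadBS and demoPipe are placeholders for illustration, not an existing API, and unlike the unix package's wrappers this sketch does not retry on EINTR.

```haskell
import qualified Data.ByteString as BS
import Data.Word (Word8)
import Foreign.C.Error (Errno, getErrno)
import Foreign.C.Types (CInt(..), CSize(..))
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr, castPtr)
import System.Posix.IO (closeFd, createPipe, fdWrite)
import System.Posix.Types (ByteCount, CSsize(..), Fd(..))

-- Raw binding to read(2).
foreign import ccall safe "unistd.h read"
    c_read :: CInt -> Ptr Word8 -> CSize -> IO CSsize

-- Hypothetical non-throwing read into a caller-supplied buffer.
-- Any failure (including EINTR, which is not retried here) is
-- returned as Left; a count of 0 means EOF.
tryReadBuf :: Fd -> Ptr Word8 -> ByteCount -> IO (Either Errno ByteCount)
tryReadBuf (Fd fd) buf nbytes = do
    rc <- c_read fd buf nbytes
    if rc == -1
        then Left <$> getErrno
        else return (Right (fromIntegral rc))

-- Hypothetical ByteString variant: EOF is simply the empty ByteString.
-- This version copies out of a temporary buffer for simplicity.
tryReadBS :: Fd -> ByteCount -> IO (Either Errno BS.ByteString)
tryReadBS fd nbytes =
    allocaBytes (fromIntegral nbytes) $ \buf -> do
        r <- tryReadBuf fd buf nbytes
        case r of
            Left errno -> return (Left errno)
            Right n    -> Right <$> BS.packCStringLen (castPtr buf, fromIntegral n)

-- Usage sketch: read from a pipe, then read again after the write end
-- has been closed.
demoPipe :: IO (Either Errno BS.ByteString, Either Errno BS.ByteString)
demoPipe = do
    (r, w) <- createPipe
    _ <- fdWrite w "abc"
    closeFd w
    first  <- tryReadBS r 16   -- the three bytes written above
    second <- tryReadBS r 16   -- write end closed, so this is EOF
    closeFd r
    return (first, second)
```

Checking the return value for -1 and calling getErrno avoids building and catching an IOError on every failure, which is where the hoped-for saving over a catch-and-convert wrapper would come from.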