[Haskell-cafe] Conduit experiment: Is this correct?
es at ertes.de
Fri Feb 3 18:52:33 CET 2012
Michael Snoyman <michael at snoyman.com> wrote:
> In this particular case, it will work due to the implementation of
> snk. In general, however, you're correct: you should not use the same
> sink twice.
> I haven't thought about it much yet, but my initial recommendation
> would be to create a new Conduit using SequencedSink, which takes the
> three lines and then switches over to a passthrough conduit. The
> result looks like this:
I think I'm getting the conduit stuff, at least on a high level. As a
little exercise I have ported a simplified variant of the 'netlines'
enumerator to the conduit library. This is the code:
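import Data.Conduit
import qualified Data.ByteString as B

-- Read one line of at most n0 bytes; excess bytes before the newline are dropped.
netLine :: (Resource m) => Int -> Sink B.ByteString m B.ByteString
netLine n0 = sinkState (n0, B.empty) push (return . snd)
    where
    push (n, str') dstr' =
        -- sinkState's push is assumed to be monadic here, like the close handler.
        return $
        case B.elemIndex 10 dstr' of
          -- No newline in this chunk: keep at most n bytes and continue.
          Nothing ->
              let dstr = B.take n dstr'
                  str  = B.append str' dstr
              in str `seq` StateProcessing (n - B.length dstr, str)
          -- Newline at index i: finish the line, return the rest as leftover.
          Just i ->
              let (pfx, sfx) = B.splitAt i dstr'
                  str        = B.append str' (B.take n pfx)
              in str `seq` StateDone (Just . B.copy $ B.tail sfx) str

-- Apply netLine repeatedly, emitting each completed line downstream.
netLines :: (Resource m) => Int -> Conduit B.ByteString m B.ByteString
netLines n = sequenceSink () (\s -> fmap (\ln -> Emit s [ln]) (netLine n))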
It reads a 256 MiB file of random data in 1.3 seconds and runs in
constant memory even for infinitely long lines. This is reassuring.
But anyway, is this the proper/idiomatic way to do it, or would you go
in a different direction?
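
To make the truncation and leftover handling easier to check in isolation,
here is a minimal pure sketch of the same push logic that depends only on
bytestring, not on conduit. The names Step, step and readLine are
hypothetical and exist only for this illustration; the chunk list stands in
for whatever the source would deliver.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- Hypothetical result type, mirroring StateProcessing / StateDone.
data Step
    = Processing (Int, B.ByteString)          -- remaining budget, line so far
    | Done (Maybe B.ByteString) B.ByteString  -- leftover input, finished line
    deriving (Eq, Show)

-- Pure version of the push step: n is how many more bytes the line may take.
step :: (Int, B.ByteString) -> B.ByteString -> Step
step (n, acc) chunk =
    case B.elemIndex 10 chunk of
        -- No newline yet: keep at most n bytes, shrink the budget.
        Nothing ->
            let kept = B.take n chunk
            in  Processing (n - B.length kept, B.append acc kept)
        -- Newline found: close the line, pass on what follows the newline.
        Just i ->
            let (pfx, sfx) = B.splitAt i chunk
            in  Done (Just (B.tail sfx)) (B.append acc (B.take n pfx))

-- Feed chunks until one line is complete; returns the line and unused chunks.
readLine :: Int -> [B.ByteString] -> (B.ByteString, [B.ByteString])
readLine n0 = go (n0, B.empty)
  where
    go (_, acc) []       = (acc, [])  -- end of input: return the partial line
    go st       (c : cs) =
        case step st c of
            Processing st'     -> go st' cs
            Done leftover line -> (line, maybe cs (: cs) leftover)

main :: IO ()
main = do
    let chunks = map BC.pack ["hel", "lo wor", "ld\nnext"]
    print (readLine 8 chunks)
    -- prints ("hello wo",["next"])
```

With a limit of 8 the line "hello world" is cut to "hello wo", the rest of
the line up to the newline is discarded, and "next" is handed back as
leftover for the following line.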
Key-ID: E5DD8D11 "Ertugrul Soeylemez <es at ertes.de>"
FPrint: BD28 3E3F BE63 BADD 4157 9134 D56A 37FA E5DD 8D11