[Haskell-cafe] Conduit experiment: Is this correct?

Ertugrul Söylemez es at ertes.de
Fri Feb 3 18:52:33 CET 2012


Michael Snoyman <michael at snoyman.com> wrote:

> In this particular case, it will work due to the implementation of
> snk. In general, however, you're correct: you should not use the same
> sink twice.
>
> I haven't thought about it much yet, but my initial recommendation
> would be to create a new Conduit using SequencedSink, which takes the
> three lines and then switches over to a passthrough conduit. The
> result looks like this:

I think I'm getting the conduit stuff, at least on a high level.  As a
little exercise I have ported a simplified variant of the 'netlines'
enumerator to the conduit library.  This is the code:

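    import Data.Conduit
    import qualified Data.ByteString as B

    -- Read one line of at most n0 bytes; bytes beyond the limit are dropped.
    netLine :: (Resource m) => Int -> Sink B.ByteString m B.ByteString
    netLine n0 = sinkState (n0, B.empty) push (return . snd)
        where
        -- State: (remaining byte budget, line accumulated so far).
        push (n, str') dstr' =
            return $
            case B.elemIndex 10 dstr' of  -- 10 is the byte value of '\n'
              -- No newline in this chunk: keep what fits and ask for more.
              Nothing ->
                  let dstr = B.take n dstr'
                      str  = B.append str' dstr
                  in str `seq` StateProcessing (n - B.length dstr, str)
              -- Newline found: finish the line and return the bytes after
              -- the newline as leftover input.
              Just i ->
                  let (pfx, sfx) = B.splitAt i dstr'
                      str        = B.append str' (B.take n pfx)
                  in str `seq` StateDone (Just . B.copy $ B.tail sfx) str

    -- Run netLine repeatedly, emitting each completed line downstream.
    netLines :: (Resource m) => Int -> Conduit B.ByteString m B.ByteString
    netLines n = sequenceSink () (\s -> fmap (\ln -> Emit s [ln]) (netLine n))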

It reads a 256 MiB file with random data in 1.3 seconds and runs in
constant memory even for infinitely long lines, because each line is
capped at n bytes.  This is reassuring.

But anyway, is this the proper/idiomatic way to do it, or would you go
in a different direction?
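
To make the per-chunk logic easier to check in isolation, here is a
minimal pure sketch of the same case analysis as `push`, with no conduit
dependency.  The names `Step`, `step` and `feed` are made up for this
illustration and are not part of the conduit library; `Step` only stands
in for the two result constructors used above.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- Hypothetical stand-in for the sink's two possible results.
data Step
    = More Int B.ByteString           -- remaining budget, line so far
    | Done B.ByteString B.ByteString  -- leftover input, finished line
    deriving (Eq, Show)

-- One step of the line reader: budget, line so far, next chunk.
step :: Int -> B.ByteString -> B.ByteString -> Step
step n acc chunk =
    case B.elemIndex 10 chunk of  -- 10 is the byte value of '\n'
      Nothing ->
          let kept = B.take n chunk
          in More (n - B.length kept) (B.append acc kept)
      Just i ->
          let (pfx, sfx) = B.splitAt i chunk
          -- sfx starts with the newline, so B.tail is safe here.
          in Done (B.tail sfx) (B.append acc (B.take n pfx))

-- Feed chunks until one line is complete.  Returns the line and the
-- unconsumed input, or Nothing if the input ends before a newline
-- (the sink above instead returns whatever it has accumulated).
feed :: Int -> [B.ByteString] -> Maybe (B.ByteString, [B.ByteString])
feed n0 = go n0 B.empty
  where
    go _ _   []       = Nothing
    go n acc (c : cs) =
        case step n acc c of
          More n' acc'   -> go n' acc' cs
          Done rest line -> Just (line, rest : cs)
```

With a limit of 5, `feed 5 (map BC.pack ["hel", "lo wor", "ld\nnext"])`
yields the line "hello" and hands back "next" as remaining input: the
bytes past the limit are dropped, and the bytes after the newline are
kept for the next line.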


Greets,
Ertugrul

-- 
Key-ID: E5DD8D11 "Ertugrul Soeylemez <es at ertes.de>"
FPrint: BD28 3E3F BE63 BADD 4157  9134 D56A 37FA E5DD 8D11
Keysrv: hkp://subkeys.pgp.net/