[Haskell-cafe] A round of golf

Creighton Hogg wchogg at gmail.com
Thu Sep 18 15:31:00 EDT 2008


On Thu, Sep 18, 2008 at 1:55 PM, Don Stewart <dons at galois.com> wrote:
> wchogg:
>> On Thu, Sep 18, 2008 at 1:29 PM, Don Stewart <dons at galois.com> wrote:
<snip>
>> > This makes me cry.
>> >
>> >    import System.Environment
>> >    import qualified Data.ByteString.Lazy.Char8 as B
>> >
>> >    main = do
>> >        [f] <- getArgs
>> >        s   <- B.readFile f
>> >        print (B.count '\n' s)
>> >
>> > Compile it.
>> >
>> >    $ ghc -O2 --make A.hs
>> >
>> >    $ time ./A /usr/share/dict/words
>> >    52848
>> >    ./A /usr/share/dict/words 0.00s user 0.00s system 93% cpu 0.007 total
>> >
>> > Against standard tools:
>> >
>> >    $ time wc -l /usr/share/dict/words
>> >    52848 /usr/share/dict/words
>> >    wc -l /usr/share/dict/words 0.01s user 0.00s system 88% cpu 0.008 total
>>
>> So both you & Bryan do essentially the same thing, and of course both
>> versions are far better than mine.  So the purpose of using the lazy
>> version of ByteString is that the file is only loaded incrementally by
>> readFile as count processes it?
>
> Yep, that's right
>
> The streaming nature is implicit in the lazy bytestring. It's kind of
> the dual of explicit chunkwise control -- chunk processing reified into
> the data structure.
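
For concreteness, a hand-rolled chunkwise version with strict ByteStrings
might look something like the sketch below (the 64K chunk size is an
arbitrary choice for illustration); this is essentially the loop that the
lazy version builds into the data structure:

    import System.Environment (getArgs)
    import System.IO (withFile, IOMode(ReadMode))
    import qualified Data.ByteString as S
    import qualified Data.ByteString.Char8 as SC

    -- Count newlines by reading fixed-size strict chunks in an explicit
    -- loop, accumulating the count strictly as we go.
    main :: IO ()
    main = do
        [f] <- getArgs
        withFile f ReadMode $ \h ->
            let go n = do
                    chunk <- S.hGet h 65536          -- read up to 64K
                    if S.null chunk
                        then return n
                        else go $! n + SC.count '\n' chunk
            in go (0 :: Int) >>= print

The lazy readFile-plus-count version does roughly this internally, chunk by
chunk, without ever holding the whole file in memory.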

To ask an overly general question: if lazy ByteString is such a nice fit
for incremental processing, are there reasons _not_ to reach for it as my
default when processing large files?
