[Haskell-cafe] A round of golf

Don Stewart dons at galois.com
Thu Sep 18 14:55:10 EDT 2008


wchogg:
> On Thu, Sep 18, 2008 at 1:29 PM, Don Stewart <dons at galois.com> wrote:
> > wchogg:
> >> Hey Haskell,
> >> So for a fairly inane reason, I ended up taking a couple of minutes
> >> and writing a program that would spit out, to the console, the number
> >> of lines in a file.  Off the top of my head, I came up with this which
> >> worked fine with files that had 100k lines:
> >>
> >> main = do
> >>  path <- liftM head $ getArgs
> >>  h <- openFile path ReadMode
> >>  n <- execStateT (countLines h) 0
> >>  print n
> >>
> >> untilM :: Monad m => (a -> m Bool) -> (a -> m ()) -> a -> m ()
> >> untilM cond action val = do
> >>  truthy <- cond val
> >>  if truthy then return () else action val >> (untilM cond action val)
> >>
> >> countLines :: Handle -> StateT Int IO ()
> >> countLines = untilM (\h -> lift $ hIsEOF h) (\h -> do
> >>                                                 lift $ hGetLine h
> >>                                                 modify (+1))
> >>
> >> If this makes anyone cringe or cry "you're doing it wrong", I'd
> >> actually like to hear it.  I never really share my projects, so I
> >> don't know how idiosyncratic my style is.
> >
> > This makes me cry.
> >
> >    import System.Environment
> >    import qualified Data.ByteString.Lazy.Char8 as B
> >
> >    main = do
> >        [f] <- getArgs
> >        s   <- B.readFile f
> >        print (B.count '\n' s)
> >
> > Compile it.
> >
> >    $ ghc -O2 --make A.hs
> >
> >    $ time ./A /usr/share/dict/words
> >    52848
> >    ./A /usr/share/dict/words 0.00s user 0.00s system 93% cpu 0.007 total
> >
> > Against standard tools:
> >
> >    $ time wc -l /usr/share/dict/words
> >    52848 /usr/share/dict/words
> >    wc -l /usr/share/dict/words 0.01s user 0.00s system 88% cpu 0.008 total
> 
> So both you & Bryan do essentially the same thing and of course both
> versions are far better than mine.  So the purpose of using the Lazy
> version of ByteString was so that the file is only incrementally
> loaded by readFile as count is processing?

Yep, that's right

The streaming nature is implicit in the lazy bytestring. It's kind of
the dual of explicit chunkwise control -- chunk processing reified into
the data structure.

-- Don 


More information about the Haskell-Cafe mailing list