[Haskell-cafe] newbie questions (read, etc., with Data.ByteString.Lazy.Char8)

Mon Oct 6 22:35:15 EDT 2008

On Tuesday, 7 October 2008 04:21, Jason Dagit wrote:
> On Mon, Oct 6, 2008 at 7:06 PM, Mike Coleman <tutufan at gmail.com> wrote:
> > Hi,
> >
> > I could use a little help.  I was looking through the Real World
> > Haskell book and came across a trivial program for summing numbers in
> > a file.  They mentioned that that implementation was very slow, as
> > it's based on String's, so I thought I'd try my hand at converting it
> > to use lazy ByteString's.  I've made some progress, but now I'm a
> > little stuck because that module doesn't seem to have a 'read' method.
> >
> > There's a readInt method, which I guess I could use, but it returns a
> > Maybe, and I don't see how I can easily strip that off.
> >
> > So:
> >
> > 1.  Is there an easy way to strip off the Maybe that would allow an
> > equivalently concise definition for sumFile?  I can probably figure
> > out how to do it with pattern matching and a separate function--I'm
> > just wondering if there's a more concise way.
>
> I'm not a ByteString expert, but there should be an easy way to solve this
> issue of Maybe.
>
> If you go to hoogle (http://www.haskell.org/hoogle/) and type this:
> [Maybe a] -> [a]
> it says:
> Data.Maybe.catMaybes :: [Maybe a] -> [a]
> <http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Maybe.html#v:catMaybes>
>
> As the top search result.
>
> This means that you can convert any list of maybes into a list of what you
> want.  It just tosses out the Nothings.
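
To make the catMaybes suggestion concrete, here is a minimal sketch, assuming the input has one number per line. The names lineInts and sumLines are made up for illustration and are not from the book's program.

```haskell
import qualified Data.ByteString.Lazy.Char8 as L
import Data.Maybe (catMaybes)

-- Try readInt on every line; catMaybes drops the lines where it failed.
-- readInt also returns the unparsed remainder, which 'map fst' discards.
lineInts :: L.ByteString -> [Int]
lineInts = map fst . catMaybes . map L.readInt . L.lines

-- Hypothetical stand-in for sumFile's core.
sumLines :: L.ByteString -> Int
sumLines = sum . lineInts

main :: IO ()
main = L.getContents >>= print . sumLines
```

Note that a line which does not start with an integer is silently skipped, which may or may not be what you want.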

Since readInt returns a Maybe (Int,ByteString), Data.List.unfoldr would be a 
better fit for his needs.
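
A minimal sketch of the unfoldr approach, assuming the numbers are separated by whitespace. The helper names nextInt, parseInts and sumInts are invented here for illustration.

```haskell
import qualified Data.ByteString.Lazy.Char8 as L
import Data.Char (isSpace)
import Data.List (foldl', unfoldr)

-- Step function for unfoldr: skip leading whitespace, then read one Int.
-- readInt yields Nothing at end of input or on a non-numeric token,
-- and that Nothing is what terminates the unfold.
nextInt :: L.ByteString -> Maybe (Int, L.ByteString)
nextInt = L.readInt . L.dropWhile isSpace

-- All integers up to the first token that is not an integer.
parseInts :: L.ByteString -> [Int]
parseInts = unfoldr nextInt

-- Strict left fold so the sum does not build up thunks on large inputs.
sumInts :: L.ByteString -> Int
sumInts = foldl' (+) 0 . parseInts

main :: IO ()
main = L.getContents >>= print . sumInts
```

Unlike a catMaybes-based version, this stops at the first token that is not an integer instead of skipping it.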

The bytestring-lexing package 
(http://hackage.haskell.org/packages/archive/bytestring-lexing/0.1.2/doc/html/Data-ByteString-Lex-Double.html) 
provides readDouble, which is also pretty fast, I think.

>
> > 2.  Why doesn't ByteString implement 'read'?  Is it just that this
> > function (like 'input' in Python) isn't really very useful for real
> > programs?
>
> I think probably for things more complex than parsing ints it's best to
> make your own parser?  I seem to recall that someone was working on a
> library of parsing functions based on bytestring?  Maybe someone else can
> comment?

At least parsec 3.0.0 has ByteString parsing modules (I've never used it, so I 
don't know how well they work).
IIRC, there's a plan to expand the bytestring-lexing package, too.

>
> > 3.  Why doesn't ByteString implement 'readDouble', etc.?  That is, why
> > are Int and Integer treated specially?  Do I not need readDouble?
>
> I think readInt was mostly implemented because integer reading was needed a
> lot for benchmarks and programming challenge sites and people noticed it
> was way slower than needed so someone put in the effort to optimize it.
> Once it was optimized, that must have satisfied the need for fast number
> reading.

More is underway; readDouble has already been delivered (in the
bytestring-lexing package mentioned above).
>
> I would agree that at least for Prelude types it would be nice to have
> efficient bytestring based parsers.  Do we have Read/Show classes
> specifically for working in bytestrings?  Maybe that's what the world needs
> in the bytestring api.  Then again, I'm not really qualified to comment :)
> For all I know it already exists.

Partially.
>
> Jason