[Haskell-cafe] Attoparsec.ByteString.Char8 or Attoparsec.ByteString for diff output?
Viktor Dukhovni
ietf-dane at dukhovni.org
Fri Feb 17 18:08:31 UTC 2023
On Fri, Feb 17, 2023 at 01:32:48PM -0400, Pedro B. wrote:
> I am developing a program to parse diff output taken from stdin (as in
> diff file1 file2 | myApp) or from a file. I am reading the input as
> ByteString in either case and I am parsing it with Attoparsec. My
> question is, should I use Data.Attoparsec.ByteString.Char8 or
> Data.Attoparsec.ByteString?
>
> So far, I've been using Data.Attoparsec.ByteString.Char8 and it works
> for my sample files, which are in utf8, latin1, or the default
> Windows encoding.
>
> What do you suggest?
Because the underlying ByteString data type is the same:
Data.ByteString ~ Data.ByteString.Char8
you can use either or both sets of combinators as you see fit. The
Char8 combinators match the parsed ByteStrings against Char predicates,
while the base ByteString combinators match against Word8 predicates.
The below is valid:
    import Data.Attoparsec.ByteString as A8
    import Data.Attoparsec.ByteString.Char8 as AC
    ...
    myParser :: ...
    myParser ... = do
        ...
        -- parse a Word8 byte followed by an 8-bit Char
        w <- A8.anyWord8
        c <- AC.anyChar
        ...
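
For diff output specifically, a minimal sketch of mixing the two sets
might look like the following. The DiffLine type and the diffLine parser
are hypothetical names made up for illustration, and the sketch assumes
unified-diff style lines that start with '+', '-' or a space. The marker
is matched with a Char predicate, while the rest of the line is taken as
raw bytes with a Word8 predicate, so the payload is never decoded and
utf8 or latin1 content passes through unchanged. The Char8 combinators
treat each byte as a Char in the range 0-255 and do no decoding either.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.Attoparsec.ByteString       as AW  -- Word8 predicates
import qualified Data.Attoparsec.ByteString.Char8 as AC  -- Char predicates
import           Data.ByteString (ByteString)
import qualified Data.ByteString as B

-- Hypothetical type for one body line of a unified diff.
data DiffLine
  = Added   ByteString
  | Removed ByteString
  | Context ByteString
  deriving (Eq, Show)

-- Parse the one-character marker with a Char predicate, then take the
-- rest of the line as undecoded bytes with a Word8 predicate.
diffLine :: AC.Parser DiffLine
diffLine = do
  marker  <- AC.satisfy (\c -> c == '+' || c == '-' || c == ' ')
  payload <- AW.takeWhile (/= 10)  -- stop at the newline byte (10)
  pure $ case marker of
    '+' -> Added payload
    '-' -> Removed payload
    _   -> Context payload

main :: IO ()
main = do
  -- ASCII payload
  print (AC.parseOnly diffLine "+hello\n")
  -- payload holding the two utf8 bytes of U+00E9, kept as raw bytes
  print (AC.parseOnly diffLine (B.pack [45, 0xC3, 0xA9]))
```

Both modules share the same Parser type, so AC.parseOnly runs a parser
built from either set of combinators.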
--
Viktor.