[Haskell] ANN: polyparse-1.00

Malcolm Wallace Malcolm.Wallace at cs.york.ac.uk
Tue Jan 23 12:24:43 EST 2007


		polyparse-1.00
		--------------
Announcing:
	http://www.cs.york.ac.uk/fp/polyparse

PolyParse is a collection of parser combinator libraries in Haskell.
They were all previously distributed as part of HaXml, but are now split
out to make them more widely available.

You are likely to use only one of the included modules at any one time -
they are generally alternatives to each other, as well as an alternative
to other widely-used parser libraries available elsewhere.

  * Text.Parse.  The Read class from Haskell'98 is widely recognised
  to have many problems.  It is inefficient.  If a read fails, there is no
  indication of why.  Worst of all, a read failure crashes your whole
  program!  Text.Parse is a proposed replacement for the Read class.  It
  defines a new class, Parse, with methods that return an explicit
  notification of errors, through the Either type.  It also defines a
  number of useful helper functions to enable the construction of
  parsers for textual representations of Haskell data structures, e.g.
  named fields.  Unsurprisingly, Text.Parse is really just a
  specialisation of the Poly combinators for String input, and the
  entire Poly API is also re-exported.  The DrIFT tool can derive
  instances of the Parse class for you automatically.  (Use the syntax
  {-! derive : Parse !-})

  * Text.ParserCombinators.HuttonMeijer.  The most venerable of all
  monadic parser combinator libraries, this version dates from 1996.
  Originally distributed with Gofer, then Hugs, as ParseLib.  It uses the
  idea of "replacing failure by a list of successes" to give multiple
  possible parses through backtracking.  (But in practice, almost nobody
  wants any parse except the first complete one.)

  * Text.ParserCombinators.HuttonMeijerWallace.  The Hutton/Meijer
  combinators, extended to take an arbitrary token type as input (not
  just characters), plus a running state (e.g. to collect a symbol
  table, or macros), plus some facilities for simple error-reporting.

  * Text.ParserCombinators.Poly.  The name Poly comes from the arbitrary
  (polymorphic) token type.  Thus, you can write your own lexer if you
  wish, rather than needing to encode lexical analysis within the parser
  itself.  This is a fresh set of combinators, improving on the
  HuttonMeijer variety by keeping only a single success, not a list of
  them.  This is more space-efficient, whilst still permitting
  backtracking.  Error-handling
  is also much improved: there are essentially two kinds of failure,
  soft and hard.  Soft failure just means that the current parse did not
  work out, but another parse might be OK.  Hard failure means that no
  parse will succeed, because we have already passed a point of
  commitment.  Thus you can give far more accurate error messages to the
  user, including multi-layered locations.

  * Text.ParserCombinators.PolyState is just like Poly, except it adds
  an arbitrary running state parameter.

  * Text.ParserCombinators.PolyLazy is just like Poly, except it does
  not return explicit failures.  Instead, an exception is raised.  Thus,
  it can return a partial parse result, before a full parse is complete.
  The word partial indicates that, having committed to return some outer
  data constructor, we might later discover some parse error further
  inside, so the value will be partial, as in incomplete: containing
  bottom.  However, if you are confident that the input is error-free,
  then you will gain hugely in space-efficiency - essentially you can
  stream-process your parsed data-structure within very small constant
  space.  This is especially useful for large structures such as XML
  trees.

  * Text.ParserCombinators.PolyStateLazy combines PolyState and PolyLazy.
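
To illustrate the trade-off between explicit failures and lazy partial
results, here is a minimal sketch that does not use polyparse at all.
The names strictDigits and lazyDigits are invented for this example and
are not part of any library.  The strict version reports errors through
Either but must inspect the whole input before returning anything; the
lazy version hands back results incrementally and only raises an
exception if and when the bad part of the input is demanded.

```haskell
import Data.Char (digitToInt, isDigit)

-- Hypothetical example, not polyparse code.
-- Strict style: failure is explicit (Left), but no result is available
-- until every character has been checked.
strictDigits :: String -> Either String [Int]
strictDigits = traverse digit
  where
    digit c
      | isDigit c = Right (digitToInt c)
      | otherwise = Left ("unexpected character " ++ show c)

-- Lazy style: each element is produced as soon as it is parsed.  A parse
-- error becomes an exception buried inside the result (a partial value),
-- raised only if the consumer reaches that point.
lazyDigits :: String -> [Int]
lazyDigits []     = []
lazyDigits (c:cs)
  | isDigit c = digitToInt c : lazyDigits cs
  | otherwise = error ("unexpected character " ++ show c)
```

With these definitions, strictDigits "123x" yields
Left "unexpected character 'x'", whereas take 3 (lazyDigits "123x")
yields [1,2,3] because the faulty tail is never demanded.  The lazy
version also works on unbounded input, e.g. take 3 (lazyDigits (cycle "42")).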

All the Poly* variations share the same basic API, so it is easy to
switch from one set to another, when you discover you need an extra
facility, just by changing a single import.
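
For readers unfamiliar with the soft/hard failure distinction made by
the Poly combinators, the following is a minimal self-contained sketch
of the idea.  It is not the polyparse implementation or its API: Result,
Parser, satisfy, orElse, commit and the Token type are all names assumed
purely for illustration.  The parser keeps a single success rather than
a list, works over an arbitrary token type, and refuses to backtrack
once a point of commitment has been passed.

```haskell
-- Hypothetical sketch of single-result parsing with soft and hard
-- failure.  None of these names are taken from polyparse.
data Result t a
  = Success a [t]      -- parsed value plus the remaining tokens
  | SoftFail String    -- this alternative failed; another may be tried
  | HardFail String    -- failed after commitment; no alternative is tried
  deriving (Eq, Show)

newtype Parser t a = P { runP :: [t] -> Result t a }

instance Functor (Parser t) where
  fmap f (P p) = P $ \ts -> case p ts of
    Success a rest -> Success (f a) rest
    SoftFail e     -> SoftFail e
    HardFail e     -> HardFail e

instance Applicative (Parser t) where
  pure a    = P (Success a)
  pf <*> pa = pf >>= \f -> fmap f pa

instance Monad (Parser t) where
  P p >>= f = P $ \ts -> case p ts of
    Success a rest -> runP (f a) rest
    SoftFail e     -> SoftFail e
    HardFail e     -> HardFail e

-- Consume one token that satisfies the predicate.
satisfy :: (t -> Bool) -> Parser t t
satisfy ok = P $ \ts -> case ts of
  (t:rest) | ok t -> Success t rest
  _               -> SoftFail "unexpected token"

-- Try the left parser; fall back to the right one only on soft failure.
orElse :: Parser t a -> Parser t a -> Parser t a
orElse (P p) (P q) = P $ \ts -> case p ts of
  SoftFail _ -> q ts
  other      -> other

-- Mark a point of commitment: soft failures inside become hard.
commit :: Parser t a -> Parser t a
commit (P p) = P $ \ts -> case p ts of
  SoftFail e -> HardFail e
  other      -> other

-- An example token type, as might be produced by a separate lexer.
data Token = TOpen | TClose | TNum Int deriving (Eq, Show)

number :: Parser Token Int
number = P $ \ts -> case ts of
  (TNum n : rest) -> Success n rest
  _               -> SoftFail "expected a number"

-- Once the opening bracket is seen, we are committed to this form.
bracketed :: Parser Token Int
bracketed = do
  _ <- satisfy (== TOpen)
  commit $ do
    n <- number
    _ <- satisfy (== TClose)
    pure n

item :: Parser Token Int
item = bracketed `orElse` number
```

Here runP item [TNum 3] falls through to the second alternative and
succeeds, while runP item [TOpen, TClose] gives
HardFail "expected a number": the error is reported at the point of
commitment instead of being masked by a retry of the other alternative.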

Regards,
    Malcolm

