[Haskell-cafe] Standards-conform CSS tokenizer

Tomas Carnecky tomas.carnecky at gmail.com
Thu Aug 13 19:21:17 UTC 2015

Recently I wrote a CSS tokenizer based on the css-syntax spec (the same
standard that Blink now uses to parse CSS). The detailed description of my
motivation is in the readme, but if I had to describe it in a sentence: the
existing Haskell parsers were too high-level for my needs; they provided
too much abstraction.

The package does not implement the full tokenizer: at least one token type
is missing. But it should be able to tokenize most CSS files. I copied the
tests from the Blink source. All but one pass (apart from, of course, the
tests for the token types that are not implemented).

While the spec describes the algorithm in a way that pretty much prescribes
the implementation, I chose to use attoparsec instead. I am not sure about
the speed implications, but speed is not that important to me.
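
To illustrate the combinator style as opposed to the spec's step-by-step
"consume a token" algorithm, here is a minimal sketch. It is not the code
from the package. To stay dependency-free it uses ReadP from base as a
stand-in for attoparsec, and the Token type, the simplified identifier rule
and all function names are assumptions made up for this example.

```haskell
import Data.Char (isAlpha, isAlphaNum, isSpace)
import Text.ParserCombinators.ReadP
  (ReadP, eof, get, munch, munch1, readP_to_S, satisfy, (<++))

-- Hypothetical token type; far smaller than the token set in the spec.
data Token
  = Whitespace
  | Ident String
  | Delim Char
  deriving (Eq, Show)

-- A run of one or more whitespace characters becomes a single token.
whitespace :: ReadP Token
whitespace = Whitespace <$ munch1 isSpace

-- Simplified identifier: a letter or '_' followed by name characters.
-- The real grammar also covers escapes and a leading '-'.
ident :: ReadP Token
ident = do
  c  <- satisfy (\x -> isAlpha x || x == '_')
  cs <- munch (\x -> isAlphaNum x || x == '_' || x == '-')
  pure (Ident (c : cs))

-- Anything else is a single-character delimiter.
delim :: ReadP Token
delim = Delim <$> get

-- Left-biased choice: the first alternative that succeeds wins.
cssToken :: ReadP Token
cssToken = whitespace <++ ident <++ delim

-- Stop at end of input, otherwise read one token and continue.
tokens :: ReadP [Token]
tokens = (eof *> pure []) <++ ((:) <$> cssToken <*> tokens)

-- Because delim accepts any character, this never fails on a String;
-- the fallback case only keeps the function total.
tokenize :: String -> [Token]
tokenize input =
  case readP_to_S tokens input of
    [(ts, "")] -> ts
    _          -> []
```

In GHCi, tokenize "a  b{" evaluates to
[Ident "a",Whitespace,Ident "b",Delim '{']. Each token kind is a small
parser of its own and the tokenizer is just their ordered choice, which is
the main difference from the single large state machine in the spec.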

The spec also describes a parser. I did not have time to implement it, and
am not currently planning to, but it would be nice if it were part of the
package as well.

I had hoped to be able to write a minimizer as well, but I'm not sure
whether it's possible to write one on the token stream alone; maybe the full
parser is needed for that. However, consecutive whitespace is collapsed, so
tokenizing and then serializing a file will in most cases produce a smaller
file.
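
The whitespace effect can be shown with a toy round trip. This is only a
sketch under strong assumptions, not the package's API: the two-constructor
Token type and the tokenize and serialize functions are hypothetical, and
strings and comments are ignored here, whereas a real tokenizer keeps them
as separate tokens.

```haskell
import Data.Char (isSpace)

-- Hypothetical token type, just enough to show the effect.
data Token
  = Whitespace    -- a run of one or more whitespace characters
  | Other String  -- a run of non-whitespace characters, kept verbatim
  deriving (Eq, Show)

-- Split the input into alternating whitespace and non-whitespace runs.
tokenize :: String -> [Token]
tokenize [] = []
tokenize s@(c : _)
  | isSpace c = Whitespace : tokenize (dropWhile isSpace s)
  | otherwise =
      let (run, rest) = break isSpace s
      in Other run : tokenize rest

-- Every whitespace token is written back as exactly one space.
serialize :: [Token] -> String
serialize = concatMap render
  where
    render Whitespace = " "
    render (Other t)  = t
```

For example, serialize (tokenize "a  {\n  color:   red;\n}") gives
"a { color: red; }". The whitespace is collapsed but never dropped, because
deciding where it can be removed entirely (it is significant between two
selectors, for instance) needs more context than this token stream carries.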

Hackage: http://hackage.haskell.org/package/css-syntax
GitHub repo: https://github.com/wereHamster/haskell-css-syntax
Spec: https://drafts.csswg.org/css-syntax