[Haskell-cafe] Regular Expression with PCRE

Brandon Allbery allbery.b at gmail.com
Sat Mar 17 11:51:39 CET 2012


On Fri, Mar 16, 2012 at 20:17, Carter Tazio Schonwald
<carter.schonwald at gmail.com> wrote:

> Basically this applies in your case because recognizing if a sequence of
> characters is in a comment block or not for HTML is likely not expressible
> using regexes.
>
> There may be a way for a very controlled restricted subset of HTML, but it
> might require some complex regexes.
>

Comments in particular are one of the places where SGML said one thing, the
HTML spec (which was loosely derived from SGML) said a different thing, and
most browsers did something that was not quite either and was occasionally
incompatible from one browser to the next. As a result, comments can be
*very* difficult to get right in the general case.
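
To make the mismatch concrete, here is a minimal sketch using only base (no
regex library). The names stripNaive, skipToClose, stripSgml, declaration and
comment are made up for illustration. stripNaive is roughly what a regex like
<!--.*?--> gives you; stripSgml follows the SGML-style rule that a comment
declaration is "<!", then zero or more "--text--" comments each optionally
followed by whitespace, then ">". Neither attempts to model what any
particular browser does, so treat this as an illustration of the
disagreement, not as a usable HTML comment stripper.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- Naive rule: a comment runs from "<!--" to the first "-->".
-- (Illustrative only: an unterminated comment swallows the rest.)
stripNaive :: String -> String
stripNaive [] = []
stripNaive s@(c:cs)
  | "<!--" `isPrefixOf` s = stripNaive (skipToClose (drop 4 s))
  | otherwise             = c : stripNaive cs

-- Drop everything up to and including the next "-->".
skipToClose :: String -> String
skipToClose t | "-->" `isPrefixOf` t = drop 3 t
skipToClose (_:t) = skipToClose t
skipToClose []    = []

-- SGML-style rule: "<!" (--text-- ws*)* ">".
-- Anything that does not parse as a declaration is left untouched.
stripSgml :: String -> String
stripSgml [] = []
stripSgml s@(c:cs)
  | "<!--" `isPrefixOf` s
  , Just rest <- declaration (drop 2 s) = stripSgml rest
  | otherwise                           = c : stripSgml cs

-- Parse what follows "<!"; Nothing if it is not a well-formed declaration.
declaration :: String -> Maybe String
declaration ('>':rest)  = Just rest
declaration ('-':'-':t) = comment t >>= declaration . dropWhile isSpace
declaration _           = Nothing

-- Skip comment text up to and including the closing "--".
comment :: String -> Maybe String
comment ('-':'-':rest) = Just rest
comment (_:t)          = comment t
comment []             = Nothing

main :: IO ()
main = mapM_ demo ["a<!-- x -- >b-->c", "<!-- a -- b -->x"]
  where
    demo s = do
      putStrLn ("input: " ++ s)
      putStrLn ("naive: " ++ stripNaive s)
      putStrLn ("sgml:  " ++ stripSgml s)
```

On the first input the two rules disagree about where the comment ends: the
naive rule yields "ac", while the SGML-style rule ends the declaration at
"-- >" and yields "ab-->c". On the second input, the "--" inside the comment
text makes the declaration malformed under the SGML-style rule, so it is
left alone, while the naive rule strips it and yields "x".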

HTML is not at all easy to deal with.

-- 
brandon s allbery                                      allbery.b at gmail.com
wandering unix systems administrator (available)     (412) 475-9364 vm/sms