[Haskell-cafe] regex-pcre is not working with UTF-8

José Romildo Malaquias j.romildo at gmail.com
Tue Aug 21 23:00:06 CEST 2012


On Tue, Aug 21, 2012 at 10:25:53PM +0300, Konstantin Litvinenko wrote:
> On 08/18/2012 06:16 PM, José Romildo Malaquias wrote:
> > Hello.
> >
> > It seems that regex-pcre has a bug when dealing with UTF-8:
> >
> > I hope this bug can be fixed soon.
> >
> > Is there a bug tracker to report the bug? If so, what is it?
> >
> You need something like this:
> 
> let pat = makeRegexOpts (compUTF8 .|. defaultCompOpt) defaultExecOpt 
> ("@'(.+?)'@" :: B.ByteString)
> 
> and then pat will match correctly.

The bug is related to String (not ByteString) in a UTF-8 locale.

Until it is fixed, I am using the workaround of converting the regular
expression and the text to ByteString, doing the matching, and then
converting the results back to String.
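
A minimal sketch of that workaround, assuming the text package is used for
the UTF-8 conversion (utf8-string would do as well). The names toUtf8,
fromUtf8 and matchUtf8 are made up for illustration and are not part of
regex-pcre:

```haskell
import Data.Bits ((.|.))
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Text.Regex.PCRE
  (Regex, compUTF8, defaultCompOpt, defaultExecOpt, makeRegexOpts, match)

-- Encode a String as a UTF-8 ByteString (hypothetical helper).
toUtf8 :: String -> B.ByteString
toUtf8 = TE.encodeUtf8 . T.pack

-- Decode a UTF-8 ByteString back to a String (hypothetical helper).
-- decodeUtf8 throws on invalid UTF-8; the input here comes from toUtf8
-- and is sliced by PCRE in UTF-8 mode, so that is assumed not to happen.
fromUtf8 :: B.ByteString -> String
fromUtf8 = T.unpack . TE.decodeUtf8

-- Compile the pattern with compUTF8, match on ByteString, and convert
-- every match (whole match followed by its groups) back to String.
matchUtf8 :: String -> String -> [[String]]
matchUtf8 pat txt = map (map fromUtf8) (match regex (toUtf8 txt))
  where
    regex :: Regex
    regex = makeRegexOpts (compUTF8 .|. defaultCompOpt) defaultExecOpt (toUtf8 pat)
```

Each inner list holds the whole match followed by its capture groups, so
matchUtf8 "@'(.+?)'@" applied to a text returns one such list per match.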

Romildo


