[Haskell-cafe] regex-pcre is not working with UTF-8
José Romildo Malaquias
j.romildo at gmail.com
Tue Aug 21 23:00:06 CEST 2012
On Tue, Aug 21, 2012 at 10:25:53PM +0300, Konstantin Litvinenko wrote:
> On 08/18/2012 06:16 PM, José Romildo Malaquias wrote:
> > Hello.
> > It seems that the regex-pcre has a bug dealing with utf-8:
> > I hope this bug can be fixed soon.
> > Is there a bug tracker to report the bug? If so, what is it?
> You need something like this:
> let pat = makeRegexOpts (compUTF8 .|. defaultCompOpt) defaultExecOpt
> ("@'(.+?)'@" :: B.ByteString)
> and then pat will match correctly.
The bug is related to String (not ByteString) in a UTF-8 locale.
Until it is fixed, I am using the workaround of converting the regular
expression and the text to ByteString, doing the matching, and then
converting the results back to String.
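
For anyone who wants to try the same workaround, here is a minimal sketch of
what I mean. It assumes the text package for the UTF-8 encoding and decoding
(utf8-string would do as well), and the helper names toUtf8, fromUtf8 and
matchAllUtf8 are made up for this example; they are not part of regex-pcre.
It also assumes the compUTF8 option suggested above is wanted so that PCRE
treats the bytes as UTF-8 characters.

```haskell
import Data.Bits ((.|.))
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Text.Regex.PCRE
  ( Regex, compUTF8, defaultCompOpt, defaultExecOpt, makeRegexOpts, match )

-- Hypothetical helper: encode a String as UTF-8 bytes.
toUtf8 :: String -> B.ByteString
toUtf8 = TE.encodeUtf8 . T.pack

-- Hypothetical helper: decode UTF-8 bytes back to a String.
-- decodeUtf8 throws on invalid input; this sketch assumes the matched
-- slices are valid UTF-8, which should hold when compUTF8 is set.
fromUtf8 :: B.ByteString -> String
fromUtf8 = T.unpack . TE.decodeUtf8

-- Hypothetical helper: match a String pattern against a String text by
-- going through ByteString. Each inner list holds the whole match
-- followed by its capture groups.
matchAllUtf8 :: String -> String -> [[String]]
matchAllUtf8 pat text = map (map fromUtf8) (match regex (toUtf8 text))
  where
    regex :: Regex
    regex = makeRegexOpts (compUTF8 .|. defaultCompOpt) defaultExecOpt (toUtf8 pat)
```

With this, matchAllUtf8 "@'(.+?)'@" someText gives back ordinary Strings, so
the rest of the program does not need to know about the ByteString detour.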