[Haskell-cafe] bug in Prelude.words?
Christopher Done
chrisdone at googlemail.com
Mon Mar 28 18:05:47 CEST 2011
On 28 March 2011 17:55, malcolm.wallace <malcolm.wallace at me.com> wrote:
> Does anyone else think it odd that Prelude.words will break a string at a
> non-breaking space?
>
> Prelude> words "abc def\xA0ghi"
> ["abc","def","ghi"]
>
I think it's predictable: isSpace (which words is based on) treats it as
white space, consistent with generalCategory, which returns the proper
Unicode category:
λ> generalCategory '\xa0'
Space
So:
-- | Selects white-space characters in the Latin-1 range.
-- (In Unicode terms, this includes spaces and some control characters.)
isSpace :: Char -> Bool
-- isSpace includes non-breaking space
-- Done with explicit equalities both for efficiency, and to avoid a tiresome
-- recursion with GHC.List elem
isSpace c = c == ' '  ||
            c == '\t' ||
            c == '\n' ||
            c == '\r' ||
            c == '\f' ||
            c == '\v' ||
            c == '\xa0' ||
            iswspace (fromIntegral (ord c)) /= 0
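
If you do want a non-breaking space to stay inside a word, one option is to
split on a narrower predicate. The following is only a sketch: the names
isBreakingSpace and wordsKeepingNbsp are made up for illustration and are not
part of base, and it special-cases only U+00A0 (other no-break characters
such as U+202F would still split).

```haskell
import Data.Char (isSpace)

-- Hypothetical helper (not in base): like isSpace, but does not treat
-- U+00A0 NO-BREAK SPACE as a separator.
isBreakingSpace :: Char -> Bool
isBreakingSpace c = isSpace c && c /= '\xA0'

-- Hypothetical variant of Prelude.words with the same recursive shape,
-- splitting on isBreakingSpace instead of isSpace.
wordsKeepingNbsp :: String -> [String]
wordsKeepingNbsp s =
  case dropWhile isBreakingSpace s of
    ""  -> []
    s'  -> let (w, rest) = break isBreakingSpace s'
           in  w : wordsKeepingNbsp rest
```

With this, wordsKeepingNbsp "abc def\xA0ghi" gives ["abc","def\160ghi"],
whereas Prelude.words gives ["abc","def","ghi"] as shown above.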