[Haskell-cafe] HXT and xhtml page encoded in cp1251
Albert Y. C. Lai
trebla at vex.net
Wed Apr 20 00:24:47 CEST 2011
On 11-04-18 05:06 PM, Dmitry V'yal wrote:
> The readDocument arrow fails with the following message:
>
> fatal error: encoding scheme not supported: "WINDOWS-1251"
>
> Can someone suggest a workaround for my use case?
If you have a Handle (from file or Network for example),

    import System.IO (hGetContents, hSetEncoding, mkTextEncoding)
    import Text.XML.HXT.Core

    do e <- mkTextEncoding "WINDOWS-1251"
       -- or "CP1251" depending on OS
       hSetEncoding your'handle e
       s <- hGetContents your'handle
       t <- runX (readString [...] s >>> ...)
       ...

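For reference, the Handle approach above might be filled out roughly as follows. This is only a sketch: the file name "page.html", the parser options, and the choice to extract <title> text are illustrative assumptions, not part of the original question. As far as I know, readString does no decoding of its own and treats its argument as already-decoded Unicode, which is why the decoding is done on the Handle.

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, hSetEncoding,
                  mkTextEncoding, withFile)
import Text.XML.HXT.Core

main :: IO ()
main = do
  -- "WINDOWS-1251" or "CP1251", depending on the OS
  enc <- mkTextEncoding "WINDOWS-1251"
  -- "page.html" is a hypothetical input file
  html <- withFile "page.html" ReadMode $ \h -> do
    hSetEncoding h enc
    s <- hGetContents h
    -- hGetContents is lazy: force the whole string before withFile closes h
    _ <- evaluate (length s)
    return s
  -- The String is decoded already, so HXT only has to parse it.
  -- The options and the <title> extraction are assumptions for illustration.
  titles <- runX (readString [withParseHTML yes, withWarnings no] html
                  >>> deep (hasName "title") >>> getChildren >>> getText)
  mapM_ putStrLn titles
```

Whether the Cyrillic text prints correctly then depends on the encoding of stdout, which is a separate matter from reading the input.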
If you don't have a Handle but a ByteString (from Network.HTTP for
example), dump it into a file first, then use the above.
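The ByteString route could look something like the sketch below. The helper name decodeBytes, the use of a temporary file from System.Directory, and the sample bytes are my own assumptions for illustration; only the idea of writing the bytes to a file and reading them back through a Handle with the right encoding comes from the text above.

```haskell
import Control.Exception (evaluate, finally)
import qualified Data.ByteString as B
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- Hypothetical helper: decode raw bytes by dumping them into a temporary
-- file and reading the file back through a Handle with the given encoding.
decodeBytes :: String -> B.ByteString -> IO String
decodeBytes encName bytes = do
  enc <- mkTextEncoding encName
  tmpDir <- getTemporaryDirectory
  (path, hOut) <- openBinaryTempFile tmpDir "page.html"
  B.hPut hOut bytes
  hClose hOut
  readBack path enc `finally` removeFile path
  where
    readBack path enc =
      withFile path ReadMode $ \hIn -> do
        hSetEncoding hIn enc
        s <- hGetContents hIn
        -- force the lazy string before the handle is closed
        _ <- evaluate (length s)
        return s

main :: IO ()
main = do
  -- the bytes of "Привет" in cp1251, standing in for a downloaded page
  let raw = B.pack [0xCF, 0xF0, 0xE8, 0xE2, 0xE5, 0xF2]
  s <- decodeBytes "CP1251" raw  -- or "WINDOWS-1251" depending on OS
  print (length s)
```

The resulting String can then be passed to readString exactly as in the Handle case.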