[Haskell-cafe] HXT and xhtml page encoded in cp1251

Albert Y. C. Lai trebla at vex.net
Wed Apr 20 00:24:47 CEST 2011


On 11-04-18 05:06 PM, Dmitry V'yal wrote:
> The readDocument arrow fails with the following message:
>
> fatal error: encoding scheme not supported: "WINDOWS-1251"
>
> Can someone suggest a workaround for my use case?

If you have a Handle (from a file or the network, for example), set its
encoding before reading from it:

import System.IO(hGetContents, hSetEncoding, mkTextEncoding)
import Text.XML.HXT.Core

do e <- mkTextEncoding "WINDOWS-1251"
   -- or "CP1251" depending on OS
   hSetEncoding your'handle e
   s <- hGetContents your'handle
   t <- runX (readString [...] s >>> ...)
   ...

If you don't have a Handle but a ByteString (from Network.HTTP for
example), write it to a file first, then open that file as a Handle and
use the above.
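
A rough sketch of that file round trip, in case it helps. The helper name
decodeVia and the temp-file template "page.xhtml" are made up for
illustration, and it assumes the bytestring and directory packages that
ship with GHC. Whether "WINDOWS-1251" or "CP1251" is accepted depends on
the OS, as noted above.

```haskell
import qualified Data.ByteString as B
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- Hypothetical helper: decode raw bytes using a named text encoding
-- by writing them to a temporary file and reading it back as text.
decodeVia :: String -> B.ByteString -> IO String
decodeVia encName bytes = do
  enc <- mkTextEncoding encName        -- e.g. "WINDOWS-1251" or "CP1251"
  tmpDir <- getTemporaryDirectory
  (path, hOut) <- openBinaryTempFile tmpDir "page.xhtml"
  B.hPut hOut bytes                    -- dump the bytes unchanged
  hClose hOut
  txt <- withFile path ReadMode $ \hIn -> do
    hSetEncoding hIn enc               -- decode with the chosen encoding
    s <- hGetContents hIn
    -- hGetContents is lazy; force the whole string before the handle closes
    length s `seq` return s
  removeFile path
  return txt
```

The resulting String can then be passed to readString as in the Handle
case, e.g. s <- decodeVia "WINDOWS-1251" body.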


