[Haskell-beginners] hPutChar: invalid argument (Invalid or incomplete multibyte or wide character)
Daniel Fischer
daniel.is.fischer at web.de
Sun Jun 13 07:07:03 EDT 2010
On Sunday 13 June 2010 08:00:15, Erik de Castro Lopo wrote:
> Hi all,
>
> I've managed to use the Curl bindings to pull down a web page, and I'm
> using TagSoup to parse it, but when I try to print the text in a TagText
> I get
>
> hPutChar: invalid argument (Invalid or incomplete multibyte or wide
> character)
>
> The code looks like:
>
> parsePage :: String -> IO ()
> parsePage page = do
>     let tags = map deTag $ filter isTagText $ parseTags page
>     mapM_ putStrLn tags
>   where
>     deTag (TagText s) = s
>     deTag x = error $ "Bad Tag '" ++ show x ++ "' in deTag."
>
>
> This is with ghc-6.12.1 on Debian Linux.
>
> Any clues appreciated.
>
> Cheers,
> Erik
Probably the page you've tried it on wasn't encoded in your locale
encoding. If the page was in latin1 and your locale is UTF-8, there will
likely be invalid (for UTF-8) byte sequences in it.
For a locally stored page, the code above worked fine with tagsoup-0.6 and
tagsoup-0.10 when the page was UTF-8-encoded. When the page was
latin1-encoded (and contained non-ASCII characters), it raised an
invalid argument (Invalid or incomplete multibyte or wide character)
error. In my test the error came from hGetContents rather than hPutChar,
though; I suppose that's because I used readFile and not the Curl bindings.
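
If the page really is latin1, one possible workaround for the readFile case is
to set the encoding on the handles explicitly rather than relying on the
locale. The following is only a minimal sketch under that assumption: it uses
hSetEncoding, latin1 and utf8 from System.IO (available since ghc-6.12), the
helper name readFileWith and the file name page.html are made up for
illustration, and it does not cover pages fetched through the Curl bindings.

```haskell
import System.IO

-- Hypothetical helper: read a whole file, decoding its bytes with an
-- explicitly chosen encoding instead of the locale default.
readFileWith :: TextEncoding -> FilePath -> IO String
readFileWith enc path = do
    h <- openFile path ReadMode
    hSetEncoding h enc
    s <- hGetContents h
    -- hGetContents is lazy, so force the whole string before closing.
    length s `seq` hClose h
    return s

main :: IO ()
main = do
    -- Encode output as UTF-8 regardless of the locale.
    hSetEncoding stdout utf8
    -- "page.html" is a placeholder; latin1 is an assumption about the page.
    page <- readFileWith latin1 "page.html"
    putStr page
```

Since latin1 maps every byte to a character, decoding with it cannot fail, but
it will silently produce the wrong characters if the page is actually in some
other encoding, so the encoding should be taken from the page's headers where
possible.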