[Haskell-cafe] UTF-8 problems when decoding JSON data coming from Network.HTTP
Ionut G. Stan
ionut.g.stan at gmail.com
Sun Oct 17 08:26:37 EDT 2010
On 17/Oct/10 8:02 AM, Michael Snoyman wrote:
> In the gist you sent, the problem is that you are reading the HTTP
> response as a String. The HTTP library doesn't deal well with
> non-Latin characters when doing String requests; you should be using
> ByteString and then converting. It's a little tedious using the HTTP
> library with ByteStrings, which is one of the reasons I wrote
> http-enumerator. Here's some working code. The main point is to
> decode the UTF-8 octets into a String.
>
> You could also consider using one of the JSON libraries that support
> bytestrings directly instead of strings, which will likely result in
> much better performance. Contenders include JSONb[1] and
> yajl-enumerator[2].
>
> import Network.HTTP.Enumerator
> import qualified Text.JSON as JSON
> import qualified Data.ByteString.Lazy.UTF8 as BSLU
>
> data GithubUser = GithubUser {
>       name :: String,
>       location :: String
>     } deriving (Eq, Show)
>
>
> instance JSON.JSON GithubUser where
>     readJSON (JSON.JSObject object) =
>         let (Just a) = lookupM "user" $ JSON.fromJSObject object
>             (JSON.JSObject b) = a
>             user = JSON.fromJSObject b
>         in do name <- lookupM "name" user >>= JSON.readJSON
>               location <- lookupM "location" user >>= JSON.readJSON
>               return $ GithubUser {
>                   name = name,
>                   location = location
>                 }
>
>     showJSON user = JSON.makeObj [
>         ("name", JSON.showJSON $ name user),
>         ("location", JSON.showJSON $ location user)
>         ]
>
>
> lookupM :: (Monad m) => String -> [(String, a)] -> m a
> lookupM x xs = maybe (fail $ "No such element: " ++ x) return (lookup x xs)
>
> main = do jsonLbs <- simpleHttp "http://github.com/api/v2/json/user/show/igstan"
>           let jsonText = BSLU.toString jsonLbs
>           let result = JSON.decode jsonText :: JSON.Result GithubUser
>           showResult result
>     where showResult (JSON.Ok json) = putStrLn $ name json
>           showResult (JSON.Error e) = putStrLn e
>
> Michael
>
> [1] http://hackage.haskell.org/package/JSONb-1.0.2
> [2] http://hackage.haskell.org/package/yajl-enumerator
Thanks Michael, it works now. But I still don't understand: is there
an inherent problem with Haskell's built-in String? Should one choose
ByteString when dealing with Unicode text? Or is there a resource
that describes, in one place, all the problems Haskell has with Unicode?
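
To make the question concrete, the difference between "octets read one
per Char" and "octets decoded as UTF-8" can be sketched in a few lines.
This is only an illustration, not code from the gist: it assumes the
bytestring and text packages, and the names octets, asLatin1 and
asUnicode are made up for the example.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- "Ionuț" encoded as UTF-8 (hypothetical sample data):
-- 'ț' (U+021B) takes two octets, 0xC8 0x9B.
octets :: BS.ByteString
octets = BS.pack [0x49, 0x6F, 0x6E, 0x75, 0xC8, 0x9B]

-- Each octet becomes one Char, so the two-octet character is split in two.
asLatin1 :: String
asLatin1 = BC.unpack octets

-- The octets are decoded as UTF-8 into Unicode code points first.
asUnicode :: String
asUnicode = T.unpack (TE.decodeUtf8 octets)

main :: IO ()
main = do
  print (length asLatin1)             -- 6: one Char per octet
  print (length asUnicode)            -- 5: one Char per code point
  print (asUnicode == "Ionu\x021B")   -- True
```

If this sketch is right, String itself can hold any Unicode code point;
what matters is whether the bytes coming off the wire were decoded
before they were turned into Chars.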
--
Ionuț G. Stan | http://igstan.ro