[Haskell-cafe] UTF-8 problems when decoding JSON data coming from Network.HTTP

Ionut G. Stan ionut.g.stan at gmail.com
Sun Oct 17 08:26:37 EDT 2010


On 17/Oct/10 8:02 AM, Michael Snoyman wrote:
> In the gist you sent, the problem is that you are reading the HTTP
> response as a String. The HTTP library doesn't deal well with
> non-Latin characters when doing String requests; you should be using
> ByteString and then converting. It's a little tedious using the HTTP
> library with ByteStrings, which is one of the reasons I wrote
> http-enumerator. Here's some working code. The main point is to
> convert the UTF8 octets to a String.
>
> You could also consider using one of the JSON libraries that support
> bytestrings directly instead of strings, which will likely result in
> much better performance. Contenders include JSONb[1] and
> yajl-enumerator[2].
>
> import Network.HTTP.Enumerator
> import qualified Text.JSON as JSON
> import qualified Data.ByteString.Lazy.UTF8 as BSLU
>
> data GithubUser = GithubUser {
>          name     :: String,
>          location :: String
>      } deriving (Eq, Show)
>
>
> instance JSON.JSON GithubUser where
>      readJSON (JSON.JSObject object) =
>          let (Just a)          = lookupM "user" $ JSON.fromJSObject object
>              (JSON.JSObject b) = a
>              user              = JSON.fromJSObject b
>          in do name     <- lookupM "name"     user >>= JSON.readJSON
>                location <- lookupM "location" user >>= JSON.readJSON
>                return $ GithubUser {
>                    name     = name,
>                    location = location
>                }
>
>      showJSON user = JSON.makeObj [
>                          ("name",     JSON.showJSON $ name user),
>                          ("location", JSON.showJSON $ location user)
>                      ]
>
>
> lookupM :: (Monad m) => String -> [(String, a)] -> m a
> lookupM x xs = maybe (fail $ "No such element: " ++ x) return (lookup x xs)
>
> main = do jsonLbs <- simpleHttp "http://github.com/api/v2/json/user/show/igstan"
>           let jsonText = BSLU.toString jsonLbs
>           let result = JSON.decode jsonText :: JSON.Result GithubUser
>           showResult result
>        where showResult (JSON.Ok json) = putStrLn $ name json
>              showResult (JSON.Error e) = putStrLn e
>
> Michael
>
> [1] http://hackage.haskell.org/package/JSONb-1.0.2
> [2] http://hackage.haskell.org/package/yajl-enumerator
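
To make the "convert the UTF8 octets to a String" step concrete, here is a
minimal sketch of what seems to go wrong when each octet is treated as one
character, and what an explicit decode does instead. It assumes the
bytestring and text packages rather than the utf8-string module used in the
quoted code (BSLU.toString performs the analogous decode on a lazy
ByteString). The names octets, perOctet and decoded are made up for
illustration.

```haskell
import qualified Data.ByteString       as BS
import qualified Data.ByteString.Char8 as BSC
import qualified Data.Text             as T
import qualified Data.Text.Encoding    as TE

-- The UTF-8 encoding of "Ionuț": the letter 'ț' (U+021B) takes two
-- octets, 0xC8 0x9B; the other four letters take one octet each.
octets :: BS.ByteString
octets = BS.pack [0x49, 0x6F, 0x6E, 0x75, 0xC8, 0x9B]

-- One Char per octet: every byte becomes the code point with the same
-- number, so the two octets of 'ț' turn into two unrelated characters.
perOctet :: String
perOctet = BSC.unpack octets

-- Explicit UTF-8 decode: the two octets are combined into one Char.
-- Note that decodeUtf8 throws an exception on malformed input.
decoded :: String
decoded = T.unpack (TE.decodeUtf8 octets)

main :: IO ()
main = do
  print (length perOctet)           -- 6
  print (length decoded)            -- 5
  print (decoded == "Ionu\x021B")   -- True
```

A String is a list of Unicode code points, so it can hold the decoded text
without trouble; the damage happens only if the bytes are turned into
characters without decoding them first.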

Thanks Michael, it works now. But I don't understand: is there an
inherent problem with Haskell's built-in String? Should one choose
ByteString when dealing with Unicode? Or is there a resource that
describes, in one place, all the problems Haskell has with Unicode?

-- 
Ionuț G. Stan  |  http://igstan.ro

