[Haskell-cafe] Gitit - Encoding

Conal Elliott conal at conal.net
Wed Dec 31 12:52:51 EST 2008


Aside:

> lookPairs :: RqData [(String,String)]
> lookPairs = asks fst >>= return . map (\(n,vbs)->(n,L.unpack $ inputValue vbs))

Looks like an opportunity for semantic editor combinators [1].  Something
like

> lookPairs = (fmap.fmap.fmap) (L.unpack . inputValue) (asks fst)

Or specialize the edit path to (fmap.map.second).
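
A minimal sketch of how the two edit paths behave, using only base. The function functor ((->) Request) stands in for RqData here, and Request, Input, and this inputValue are simplified, hypothetical stand-ins rather than the HAppS definitions:

```haskell
import Control.Arrow (second)

-- Hypothetical stand-ins for the HAppS types; only the shapes matter here.
newtype Input = Input { inputValue :: String }
type Request = ([(String, Input)], ())

-- Plays the role of `asks fst`: a reader that yields the raw pairs.
rawPairs :: Request -> [(String, Input)]
rawPairs = fst

-- Generic path: into the reader result, into each list element,
-- then into the second component of each pair.
lookPairsGeneric :: Request -> [(String, String)]
lookPairsGeneric = (fmap . fmap . fmap) inputValue rawPairs

-- The same path with the list and pair steps named explicitly.
lookPairsSpecific :: Request -> [(String, String)]
lookPairsSpecific = (fmap . map . second) inputValue rawPairs
```

Both definitions compute the same result; the second only makes it easier to read off which layer each step enters.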

   - Conal

[1] http://conal.net/blog/semantic-editor-combinators

On Tue, Dec 30, 2008 at 6:14 AM, Jeremy Shaw <jeremy at n-heptane.com> wrote:

> Hello,
>
> I have not looked at the gitit source code, but I have had this
> problem in other HAppS applications. The problem is that by default
> HAppS does nothing about string encodings. The easy fix is to use
> utf-8 and unicode everywhere. ('easy' compared to supporting multiple
> encodings).
>
> The goal is to make sure that in gitit, a String is always a list of
> unicode code points, and not a list of utf-8 encoded octets. This
> means that whenever data comes in or goes out of gitit it needs to be
> decoded or encoded.
>
> To transition you need to do at least the following:
>
> 1. Set the charset of the outgoing pages so that the browser knows
> that the page is supposed to be utf-8:
>
>  For html, this can be done by adding this meta to the <head> of each page:
>
>  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
>
>  However, for text/plain, etc, you must set it in the HTTP header
>  (which I will cover later). For html, it is still useful to set the
>  meta tag though, so that if the page is saved to disk, the encoding
>  is not lost.
>
> 2. Use the utf8-string library, and make sure that all the
> inputs/outputs are decoded/encoded properly.
>
> This probably means patching your copy of HAppS-Server (or copying the
> modified functions into gitit).
>
> For example, lookPairs currently looks like this:
>
> > lookPairs :: RqData [(String,String)]
> > lookPairs = asks fst >>= return . map (\(n,vbs)->(n,L.unpack $ inputValue vbs))
>
> As you can see, it just takes the incoming bytes and converts them to
> a String, but without doing any decoding. You probably want something
> more like:
>
> > lookPairs :: RqData [(String,String)]
> > lookPairs = asks fst >>= return . map
> >     (\(n,vbs)->(n,Data.ByteString.Lazy.UTF8.toString $ inputValue vbs))
>
> Some of the other look* functions need patching as well.
>
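To make the difference concrete, here is a small sketch. It assumes L in the snippets above is Data.ByteString.Lazy.Char8, and it uses decodeUtf8 from the text package as a stand-in for Data.ByteString.Lazy.UTF8.toString so that it runs with only the packages bundled with GHC. The names wireBytes, undecoded, and decoded are made up for illustration:

```haskell
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as L8
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- U+00E9 (e with acute accent) as it arrives on the wire in UTF-8:
-- the two octets 0xC3 0xA9.
wireBytes :: L.ByteString
wireBytes = L.pack [0xC3, 0xA9]

-- What the unpatched lookPairs does: one Char per octet, no decoding.
undecoded :: String
undecoded = L8.unpack wireBytes

-- What the patched version should yield: one Char per code point.
-- decodeUtf8 (from text) stands in for the utf8-string toString.
decoded :: String
decoded = TL.unpack (TLE.decodeUtf8 wireBytes)
```

Here undecoded is the two-character string "\195\169", which is the mangled form a browser ends up showing, while decoded is the single character "\233".
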
> Similarly, the ToMessage instances need to encode the outgoing data.
> Consider:
>
> > instance ToMessage Html where
> >    toContentType _ = B.pack "text/html"
> >    toMessage = L.pack . renderHtml
>
> We really want to make two changes:
>
> > instance ToMessage Html where
> >    toContentType _ = B.pack "text/html; charset=UTF-8"            -- add the encoding
> >    toMessage = Data.ByteString.Lazy.UTF8.fromString . renderHtml  -- encode the data
>
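The outgoing direction can be sketched the same way. This again assumes L is Data.ByteString.Lazy.Char8 and substitutes encodeUtf8 from the text package for Data.ByteString.Lazy.UTF8.fromString; packTruncating and packUtf8 are hypothetical names, not part of HAppS:

```haskell
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as L8
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- The unpatched toMessage path: Char8.pack keeps only the low 8 bits
-- of each Char, so anything above U+00FF is silently corrupted.
packTruncating :: String -> L.ByteString
packTruncating = L8.pack

-- The patched path: every code point is encoded as UTF-8 octets.
-- encodeUtf8 (from text) stands in for the utf8-string fromString.
packUtf8 :: String -> L.ByteString
packUtf8 = TLE.encodeUtf8 . TL.pack
```

For the euro sign "\8364" (U+20AC), L.unpack (packTruncating "\8364") is [0xAC], while L.unpack (packUtf8 "\8364") is [0xE2, 0x82, 0xAC], which is what a client told charset=UTF-8 expects.
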
> 3. Make sure that any I/O (readFile, writeFile, etc.) uses the utf-8
> functions from utf8-string.
>
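As an alternative to the utf8-string functions, newer versions of base let you set the encoding on a handle directly. The following is a sketch of that swapped-in technique, not what the thread proposes; writeFileUtf8 and readFileUtf8 are hypothetical helper names:

```haskell
import Control.Exception (evaluate)
import System.IO

-- Write a String to a file as UTF-8, regardless of the process locale.
writeFileUtf8 :: FilePath -> String -> IO ()
writeFileUtf8 path contents =
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8
    hPutStr h contents

-- Read a UTF-8 file back into a String of code points.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h utf8
    contents <- hGetContents h
    -- hGetContents is lazy; force the text before withFile closes the handle.
    _ <- evaluate (length contents)
    return contents
```

Either way, the point is the same: the encoding is fixed explicitly at the file boundary instead of being left to defaults.
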
> If you don't want to patch HAppS-Server, then you could work around it by
> doing silliness like:
>
>  do pairs' <- lookPairs
>    let pairs = map (first toString . second toString) pairs'
>
> but that seems error-prone and not a long-term solution. The obvious
> long-term solution is for HAppS to fix its encoding issues. The simple
> fix is to hardwire it for utf-8, but a system that supports
> arbitrary encodings might be nice?
>
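Since lookPairs already returns Strings, the workaround needs a String -> String decoder rather than one that takes a ByteString (utf8-string has Codec.Binary.UTF8.String.decodeString with that shape). A sketch of the idea built from bytestring and text instead, with redecode and fixPairs as hypothetical names:

```haskell
import Control.Arrow (first, second)
import qualified Data.ByteString.Lazy.Char8 as L8
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- Re-decode a String whose Chars are really UTF-8 octets (one Char per byte).
-- Assumes every Char is below 256, which holds for the output of Char8 unpack.
redecode :: String -> String
redecode = TL.unpack . TLE.decodeUtf8 . L8.pack

-- Apply the fix to both the name and the value of every pair.
fixPairs :: [(String, String)] -> [(String, String)]
fixPairs = map (first redecode . second redecode)
```

Applying redecode to text that has already been decoded either raises a decoding error or mangles it, which is one concrete way this approach goes wrong once some call sites are fixed and others are not.
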
> As far as I know, no one has even tried to submit a patch hardwiring
> HAppS to use utf-8 -- which seems like a good short-term solution. You
> might try posting on the HAppS mailing list and see if such a patch
> would be welcome:
>
> http://groups.google.com/group/HAppS
>
> hope this helps.
> - jeremy
>
>
> At Tue, 30 Dec 2008 13:58:15 +0100,
> Arnaud Bailly wrote:
> >
> > Hello,
> > I have started using Gitit and I am very happy with it and eager to
> > start hacking. I am running into a practical problem: character
> > encoding. When I edit pages using accented characters (I am French),
> > the accents get mangled when the page comes back from the server.
> >
> > The raw files are incorrectly encoded. Where should I look to fix
> > this issue?
> >
> > Thanks
> >
> > ps: the wiki is live at http://www.notre-ecole.org
> >
> > --
> > Arnaud Bailly, PhD
> > OQube - Software Engineering
> >
> > web> http://www.oqube.com
> > _______________________________________________
> > Haskell-Cafe mailing list
> > Haskell-Cafe at haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe