[Haskell-cafe] Reader monad, refactoring and missing the point all at once

Eugene Dzhurinsky jdevelop at gmail.com
Wed May 2 08:54:12 CEST 2012


Hi all!

Last day I was trying to fix idiii library, because it uses utf8 for parsing
non-unicode content. I found the functions

> -- | Parses one value and returns it as a 'String'
> parseString :: CharEncoding -> TagParser String
> parseString enc = do
>     v <- case enc of
>       0x01 -> parseUntilWord16Null -- UTF-16
>       0x02 -> parseUntilWord16Null -- UTF-16 BOM
>       _    -> parseUntilWord8Null  -- ISO-8859-1 or UTF-8
>     return $ encPack enc v
> 
> encPack :: CharEncoding -> [Token] -> String
> encPack 0x00            s  = Text.unpack $ decodeASCII   $ BS.pack s
> encPack 0x01 (0xFF:0xFE:s) = Text.unpack $ decodeUtf16LE $ BS.pack s
> encPack 0x01 (0xFE:0xFF:s) = Text.unpack $ decodeUtf16BE $ BS.pack s
> encPack 0x02            s  = Text.unpack $ decodeUtf16BE $ BS.pack s
> encPack _               s  = Text.unpack $ decodeUtf8    $ BS.pack s

updated the dependency from 
> import Data.Text.Encoding (decodeASCII, decodeUtf16LE, decodeUtf16BE, decodeUtf8)
to
> import Data.Text.ICU.Convert

and added implementation for decoding functions:

> decodeAny :: String -> BS.ByteString -> Text.Text
> decodeAny charset src = unsafePerformIO $ ((flip toUnicode) src) `fmap` open charset (Just True)
>
> decodeASCII :: BS.ByteString -> Text.Text
> decodeASCII = decodeAny "latin1"
> 
> decodeUtf16LE = decodeAny "utf-16le"
> 
> decodeUtf16BE = decodeAny "utf-16be"
> 
> decodeUtf8 = decodeAny "utf-8"

Now I want to add possibility to specify encoding to yse with decodeASCII. I was 
thinking of adding Reader monad and providing some sort of charset
configuration there - but it will lead up to complicating the code, which uses
this parseString function. And this code is used inside Parser of Text.ParserCombinators.Poly.State - 
so I will need to update all usages of this parser.

Another approach might be to use IORef with encoding stored there, but I
don't really like this solution.

What would be the best way of refactoring of such kind?

Thanks!

-- 
Eugene N Dzhurinsky
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20120502/7faa553c/attachment.pgp>


More information about the Haskell-Cafe mailing list