[xmonad] spawn functions are not unicode safe
Khudyakov Alexey
alexey.skladnoy at gmail.com
Thu Jan 22 18:46:39 EST 2009
> Well put. But it would be best to keep Xmonad encoding agnostic.
> Filenames are byte sequences, so spawn should take [Word8] instead of
> [Char]. Otherwise the best it can do is convert [Char] to [Word8] by
> truncating each Char. Nothing guarantees that every executable has
> a UTF-8 name, after all, even if the default system locale is UTF-8.
>
Keeping xmonad encoding agnostic is a good thing; choosing an encoding is
surely out of scope for a window manager. But making spawn accept [Word8]
instead of String does not solve the problem. It just reformulates it and makes
it more obvious: converting [Char] to [Word8] is string encoding ;-).
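To make that concrete, here is a minimal sketch comparing the two conversions. It is only an illustration, not code from xmonad: truncateChars and encodeUtf8 are hypothetical names, the encoder is written by hand so that nothing beyond base is needed, and it assumes the input contains no surrogate code points.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical helper: the "truncating" conversion. Only the low 8 bits
-- of each code point survive, so anything above U+00FF is mangled.
truncateChars :: String -> [Word8]
truncateChars = map (fromIntegral . ord)

-- Hypothetical helper: a hand-written UTF-8 encoder. Assumes every Char
-- is a valid Unicode scalar value (surrogates are not checked).
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap (map fromIntegral . go . ord)
  where
    -- Continuation byte carrying 6 bits of the code point.
    cont :: Int -> Int
    cont x = 0x80 .|. (x .&. 0x3F)

    go :: Int -> [Int]
    go c
      | c < 0x80    = [c]
      | c < 0x800   = [0xC0 .|. (c `shiftR` 6), cont c]
      | c < 0x10000 = [0xE0 .|. (c `shiftR` 12), cont (c `shiftR` 6), cont c]
      | otherwise   = [ 0xF0 .|. (c `shiftR` 18), cont (c `shiftR` 12)
                      , cont (c `shiftR` 6), cont c ]
```

Both functions have the same type, String -> [Word8], so switching spawn to [Word8] only moves the choice between them to the caller. For U+044F the first yields the single byte 0x4F and the second the two bytes 0xD1 0x8F.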
> It's better to investigate where the argument of spawn comes from, and
> handle the problem there. Xlib knows about character encodings, maybe
> we could use its facilities and thus avoid adding further dependencies.
Arguments of spawn can come from anywhere: user input, hardcoded strings,
etc. These sources are far too numerous for each to handle encoding by itself.
As for Xlib, AFAIK everything is OK with it. The real problem is the standard
library: putStrLn, executeFile, etc.
The only solution which keeps xmonad encoding agnostic and still allows
strings to be encoded is to push the choice of encoding to the user. At least
I could not imagine anything else.
I've come up with two possible solutions.
1. Add an encoding field to XConfig.
> data XConfig l = XConfig {
>     -- skipped
>     encoding :: String -> String  -- User-supplied encoding
>     }
Downside: it requires passing the config to every function like spawn => changes
the API => breaks an awful lot of code.
2. Something along these lines:
> import Data.IORef (IORef, newIORef, readIORef)
> import System.IO.Unsafe (unsafePerformIO)
>
> -- Global IORef which stores the encoding function. Its value should be
> -- set once from the user's config and never modified afterwards.
> -- NOINLINE ensures that only one IORef is ever created.
> {-# NOINLINE coding #-}
> coding :: IORef (String -> String)
> coding = unsafePerformIO (newIORef id)
>
> -- Function to get the encoder. Should be the only way to obtain it.
> getEncoder :: IO (String -> String)
> getEncoder = readIORef coding
>
> -- putStrLn which works with Unicode strings (sometimes)
> encodedPutStrLn :: String -> IO ()
> encodedPutStrLn str = do encode <- getEncoder
>                          putStrLn . encode $ str
Upside: breaks nothing. Only a few functions change, and a way to set the
encoding is provided.
Downside: kludge.
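To show where the API breakage in option 1 comes from, here is a minimal sketch. Config, shellArgs and terminalArgs are hypothetical stand-ins, not xmonad's real XConfig or spawn; shellArgs only builds the argument list a shell would receive (assuming the command is run via "sh -c") instead of starting a process, so it stays free of side effects.

```haskell
-- Hypothetical stand-in for XConfig, reduced to the one new field.
newtype Config = Config { encoding :: String -> String }

-- Hypothetical stand-in for spawn. It needs the config to find the
-- encoder, so its type gains an extra argument.
shellArgs :: Config -> String -> [String]
shellArgs conf cmd = ["-c", encoding conf cmd]

-- Every helper built on top of it must now thread the config through
-- as well, which is the change that breaks existing callers.
terminalArgs :: Config -> String -> [String]
terminalArgs conf title = shellArgs conf ("xterm -T " ++ title)
```

With option 2 the signatures would stay as they are, because the encoder is read from the global reference instead of being passed in.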
Comments, suggestions, and ideas are welcome.
P.S. I found this in the documentation for XMonad.Util.XSelection, so spawn is
not the only function affected:
* Unicode handling is busted. But it's still better than calling
'chr' to translate to ASCII, at least.
As near as I can tell, the mangling happens when the String is
outputted somewhere, such as via promptSelection's passing through
the shell, or GHCi printing to the terminal. utf-string has IO functions
which can fix this, though I do not know how to use them here. It's
a complex issue; see
http://www.haskell.org/pipermail/xmonad/2007-September/001967.html
http://www.haskell.org/pipermail/xmonad/2007-September/001966.html.