[xmonad] Migrate from String to Data.Text [proposal]
Brent Yorgey
byorgey at gmail.com
Fri Dec 18 17:39:34 UTC 2015
I would also note that it's not as if xmonad developers are unaware of the
existence of Text. The fact is that the first version of xmonad was
released two years prior to the first release of text (2007 vs 2009). So
at the time there was simply no alternative to String. By the time text
became stable and widely accepted, xmonad-contrib was already quite large.
I think converting all of xmonad-contrib from String to Text would be a
much larger and more tedious undertaking than you seem to think. Even if
someone put in the effort to do that, it would indeed break pretty much
every user config ever, for little benefit. By and large, xmonad users
(some of whom do not even know very much Haskell) have come to expect
extreme stability from xmonad. (I have been running essentially the same
config unchanged for many, many years.) Forcing a bunch of people with
little Haskell experience to upgrade their configs from String to Data.Text
would probably result in many of them abandoning xmonad.
Personally, I definitely prefer doing the Right Thing over preserving
backwards compatibility (witness how many breaking changes we routinely
introduce with each new release of the diagrams library). But I just don't
think this makes sense for xmonad.
-Brent
On Fri, Dec 18, 2015 at 11:12 AM Matt Walker <matt.g.d.walker at gmail.com>
wrote:
> Hi everyone,
>
> I noticed that xmonad and xmonad-contrib both prefer to use of String =
> [Char] for their stringy data-types. This is probably a terrible idea. I
> cite some sources here, then outline their arguments below.
>
> http://www.alexeyshmalko.com/2015/haskell-string-types/
> https://mail.haskell.org/pipermail/haskell-cafe/2014-June/114745.html
>
> __String is Bad__
>
> 1) Char is a horribly inefficient representation of a character, being an
> entire machine word in length (at least 32 bits). Actually, it's worse:
> each Char takes up _two_ machine words in GHC, since it needs one to store
> GC information in. See the slide in the first link for more details.
> Data.Text stores the characters in compact arrays.
>
> 2) Lists are lazy, which makes their evaluation slower. You have to thunk
> on each character, which is pretty silly most of the time. Normally you
> want to read in at least _chunks_ of string all at once. Data.Text is
> strict, but Data.Text.Lazy exists and is (as you would assume) lazy when
> you need it.
>
> The long and the short of it is that [Char] is a suboptimal choice to use
> for anything except possibly short identifiers; Haskell (via GHC) is a
> compiled language, and yet performs orders of magnitudes worse than even
> Perl and Python on text processing when using Data.String. There is simply
> no good reason to use String when Text exists.
>
> __Alternatives__
>
> The other alternative is ByteString. Although ByteString is a great type
> for binary data, and specifically for data exchange protocols, it seems
> that it would inappropriate in this situation, due to the replacement of
> most (if not all) instances being actual textual data, which obviously Text
> is optimized for.
>
> __Migration Issues__
>
> Assuming we can agree that Text > String then, the main problem to
> switching would be the pain of migration, and whether this would be worth
> it. I argue it wouldn't be so bad, and is worth doing on principle alone.
>
> The LANGUAGE pragma of OverloadedStrings allows you to use String literals
> as Text literals, so that wouldn't be the main problem. The main issue is
> changing all the interfaces so they accept Text instead of String, and how
> this would impact existing user configs, and the xmonad-contrib archive.
> Every time you use ++ you would have to replace it with <>, the Monoid
> infix mappend operator. I doubt many people use : to build Strings, but in
> those instances those would have to be changed too. Finally, pattern
> matching on Strings like (x:xs) would break as well. All other functions
> would require changing from their String/List counterpart to the Text one.
> Since the names clash, one would have to import qualified as, for instance,
> T and call T.intersperse or whatever. It would be a non-trivial
> undertaking, but certainly doable.
>
> __Other Breaking Changes__
>
> Are there other niggling issues that exist in the codebase that would
> cause breaking changes? Perhaps it would be a good idea to get a list of
> them all and see if it's worth breaking backwards compatibility to fix them
> all at once? I'm a purist when it comes to code, but I would like to hear
> what other people think, and just how angry they would be with this
> change. I have no idea as to what xmonad and xmonad-contrib's breaking
> changes policy is.
>
> Obviously I'm not proposing this change be undertaken for 0.13 -- I was
> aiming for more 0.14 or later.
>
> Let me know what you think.
>
> Sincerely,
> Matt
> _______________________________________________
> xmonad mailing list
> xmonad at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/xmonad
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/xmonad/attachments/20151218/e0cf6752/attachment.html>
More information about the xmonad
mailing list