[Haskell-i18n] Proposal for a Unicode-safe layout rule

Keith Wansbrough Keith.Wansbrough@cl.cam.ac.uk
Mon, 04 Aug 2003 10:16:03 +0100


> If TAB is treated as layout-unsafe (as it should be) then this rule change
> will break some existing code, but only code that deserves to be broken.
> If TAB is treated specially as it currently is, this change should not
> break any existing code.

Alternative: make the column value a partial order: identical
whitespace is an equal indent, additional whitespace increases the
indent, less decreases it, and differing whitespace is incomparable.  In
other words, you can use whatever mix of spaces and tabs you like, as
long as you use it consistently.  Eight spaces, four spaces, and one
tab are all different, but if you always use one tab you're fine.
Like this:

-- suggested whitespace partial order for layout

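import Data.Char (isSpace)      -- hierarchical name for Haskell 98's Char
import Data.List (isPrefixOf)   -- hierarchical name for Haskell 98's List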

-- |Opaque indent value type
data Indent = Indent String deriving Show

-- |Given a line, compute the indent of that line
indentOf :: String -> Indent
indentOf s = Indent (takeWhile isSpace s)

-- |Compare two indents (NB: partial order)
cmpIndent :: Indent -> Indent -> Maybe Ordering
cmpIndent (Indent s1) (Indent s2) | s1 == s2         = Just EQ
                                  | isPrefixOf s1 s2 = Just LT
                                  | isPrefixOf s2 s1 = Just GT
                                  | otherwise        = Nothing
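
To make the four possible outcomes concrete, here is a minimal sketch
that repeats the definitions above so it loads on its own, assuming the
hierarchical module names Data.Char and Data.List.  The names cmpLines
and examples are hypothetical and exist only for illustration.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- Opaque indent value: the literal leading whitespace of a line.
data Indent = Indent String deriving Show

indentOf :: String -> Indent
indentOf s = Indent (takeWhile isSpace s)

-- Partial order: Nothing means the two indents are incomparable.
cmpIndent :: Indent -> Indent -> Maybe Ordering
cmpIndent (Indent s1) (Indent s2)
  | s1 == s2           = Just EQ
  | s1 `isPrefixOf` s2 = Just LT
  | s2 `isPrefixOf` s1 = Just GT
  | otherwise          = Nothing

-- Hypothetical helper: compare the indents of two source lines.
cmpLines :: String -> String -> Maybe Ordering
cmpLines a b = cmpIndent (indentOf a) (indentOf b)

examples :: [Maybe Ordering]
examples =
  [ cmpLines "    x" "    y"      -- identical: four spaces each
  , cmpLines "    x" "        y"  -- four spaces is a prefix of eight
  , cmpLines "\t  x" "\ty"        -- tab plus two spaces vs. a lone tab
  , cmpLines "\tx"   "    y"      -- one tab vs. four spaces
  ]
-- examples == [Just EQ, Just LT, Just GT, Nothing]
```

Note that four spaces and eight spaces are comparable, because one is a
prefix of the other; only a genuine mix, such as a tab against spaces,
comes out as Nothing.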

This way, it doesn't matter how wide any whitespace character is, or
even what tab width you use.  The system doesn't even need to know -
and any Unicode whitespace character can be used, not just the Latin-1
"isSpace" ones.

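One way a checker might use this is sketched below.  It never computes a
column number, so no tab width appears anywhere.  The assumptions here
are mine and not part of the proposal: each line is compared only with
the previous non-blank line, blank lines are skipped, and the names
leading, comparable, inconsistentLines and sample are hypothetical.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- Leading whitespace of a line, kept as the literal characters.
leading :: String -> String
leading = takeWhile isSpace

-- Two indents are comparable when one is a prefix of the other
-- (this includes the case where they are equal).
comparable :: String -> String -> Bool
comparable a b = a `isPrefixOf` b || b `isPrefixOf` a

-- 1-based numbers of the lines whose indent is incomparable with the
-- indent of the previous non-blank line.  Blank lines are ignored.
inconsistentLines :: [String] -> [Int]
inconsistentLines ls =
  [ n
  | ((_, prev), (n, cur)) <- zip numbered (drop 1 numbered)
  , not (comparable (leading prev) (leading cur))
  ]
  where
    numbered = [ (n, l) | (n, l) <- zip [1 ..] ls, not (all isSpace l) ]

-- Line 4 switches from tab-based to space-based indentation.
sample :: [String]
sample =
  [ "f x ="
  , "\tcase x of"
  , "\t  A -> 1"
  , "        B -> 2"
  ]
-- inconsistentLines sample == [4]
```

With an eight-column tab, lines 3 and 4 of sample would look almost
aligned in an editor, yet the checker flags line 4 without knowing that.
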
Comments?

--KW 8-)