[Haskell-cafe] Automatic fixity allocation for symbolic operators

Sat Oct 14 11:23:56 EDT 2006

Hi -
I'm wondering if it is possible to construct a methodical procedure to 
assign a fixity to symbolic operators so that we could get rid of the need 
for user defined fixites. User defined fixities are an enormous problem for 
an interactive editor, because it is not possible to construct a parse tree 
of some code if you don't know what the fixities are, and an editor must be 
able to do something useful with a code fragment without having to look 
inside other modules (because the other modules may not yet have even been 
written!). Also, the programmer or reader of code needs to be able to 
understand each code fragment independently as well.

>From the Haskell98 report, the Prelude defines:

    infixr 9  ., !!
    infixr 8  ^, ^^, **
    infixl 7  *, /, `quot`, `rem`, `div`, `mod`
    infixl 6  +, -
    infixr 5  :, ++
    infix  4  ==, /=, <, <=, >=, >
    infixr 3  &&
    infixr 2  ||
    infixl 1  >>, >>=
    infixr 1  =<<
    infixr 0  $, $!, `seq`

Suppose we ignore the varid operators and just consider the symbolic ops. 
What I'm trying to find is a systematic way to assign fixities to all other 
possible sequences of symbol characters that is consistent with what we've 
already got in the Prelude.

As a first step, we could say that "mirrored" operators must share the same 
precedence ie:

    =<<    >>=
    <         >

For associativity, we could assign each character an associativity weight:

    -1      left associative
    0        neutral
    1        right associative

and say that the associativity is simply the sign of the sum of the 
associativity weights of the characters thus:

    >   -1
    =    0
    <    1

    =<<    0 + 1 + 1    ie infixr

Note that I don't care about non-associative ops since the non-associativity 
of something should be a question for the type checker not the parser imho - 
ideally we should be able to create a parse tree for any possible operator 
expression.

To form the precedence, we could assign each character a precedence value 
eg:

    9     .    !
    8    ^
    7    *    /
    6    +    -
    5    :
    4    =
    3    &
    2    |
    1    <    >
    0    $

A first attempt might be to say that the precedence is simply the decimal 
expansion of the precedence values eg >>= has precedence 1.14 and $! has 
precedence 0.9. However, as noted above, mirrored ops must share the same 
precedence so an alternative is to create some ordering of characters such 
that when the set of characters in an operator is sorted according to this 
ordering, the decimal expansion of the precedence digits will give the same 
relative ordering for operator precedences as the Prelude.

For example, using $ | & + - * / ^ . ! : < > = as the ordering, we'd get:

    infixr 9  .    !!                               9    9.9
    infixr 8  ^, ^^, **                        8   8.8    7.7
    infixl 7  *, /,                                7    7
    infixl 6  +, -                                6     6
    infixr 5  : , ++                             5    6.6
    infix  4  ==, /=, <, <=, >=, >      4.4   7.4   1    1.4    1.4    1
    infixr 3  &&                                3.3
    infixr 2  ||                                    2.2
    infixl 1  >>, >>=                        1.1    1.14
    infixr 1  =<<                                1.14
    infixr 0  $, $!                                0    0.9

Although most existing ops get a similar precedence (eg ^ ^^ and ** are all 
still in the same group relative to the other ops despite the fact that 
within the group there is now an ordering) the main problem seems to be that 
< <= >= > get precedences that bind too loosely and /= is totally wrong.

This problem aside, the above algorithm would give sensible precedences for 
such things as:

    :>            <:               5.1
    <+>        <*>            6.11    6.11

where the use of <> neutralizes the associativity contribution of < or > 
respectively (since (-1) + 1 == 0), giving us the intuitive associativity 
we'd expect from the "interesting" character in the middle.

(The problem of /= getting 7.4 could be solved by putting / after = in the 
order, to get 4.7, but unfortunately this would mean that since <> must be 
before =, <> would be before / so </> would get the wrong precedence 
compared to <*>)

Another issue is that there is no assignment of associativity weights such 
that "*" is infixl but "**" is infixr (ditto + and ++) so perhaps we'd need 
to associate each character with an associativity function. Similar to 
precedences, we then define an associativity ordering and let the resulting 
associativity be the sign of the composition of the sorted functions applied 
to 1 eg:

    ^      const (-1)
    *      \x -> x * (-1)
    =      id
    &     const (-1)
    <      (+1)
    >      (+ (-1))

Then
    *            (\x -> x * (-1)) 1    ===    -1 ie left
    **          (\x -> x * (-1)) . (\x -> x * (-1)) $ 1 === +1 ie right

        >>=
        >            >                =
        (+ (-1)) . (+ (-1))  .   id        $ 1 === -1

        <*>            -- remember ordering
        *                            <              >
        (\x -> x * (-1))  . (+1)      . (+ (-1)) $ 1

        === ((1 - 1) + 1) * (-1)
        === -1 ie left as required!!! :-)

Anyway this is as far as I've got so far trying to rationally reconstruct 
the original Prelude precedences to achieve the golden aim of eliminating 
the infinite problem of fixity declarations from source code... :-)

(Regarding `div` `seq` etc - I'd just assign them all the same precedence 
because use of multiple varid ops in the same expression with different 
precedences, or trying to combine them with symbolic ops, is just a highway 
to confusion city imho. Note also that (seq x $ exp) is not only clearer but 
is also one character shorter than (x `seq` exp))

The main open problem is finding an algorithm which assigns a good 
precedence to >>= as well as to >= and /= and </> and <*>.... ;-)

Any ideas?

Thanks, Brian.
--
www.metamilk.com