[Haskell-cafe] Automatic fixity allocation for symbolic operators
Brian Hulley
brianh at metamilk.com
Sat Oct 14 11:23:56 EDT 2006
Hi -
I'm wondering if it is possible to construct a methodical procedure to
assign a fixity to symbolic operators so that we could get rid of the need
for user defined fixites. User defined fixities are an enormous problem for
an interactive editor, because it is not possible to construct a parse tree
of some code if you don't know what the fixities are, and an editor must be
able to do something useful with a code fragment without having to look
inside other modules (because the other modules may not yet have even been
written!). Also, the programmer or reader of code needs to be able to
understand each code fragment independently as well.
>From the Haskell98 report, the Prelude defines:
infixr 9 ., !!
infixr 8 ^, ^^, **
infixl 7 *, /, `quot`, `rem`, `div`, `mod`
infixl 6 +, -
infixr 5 :, ++
infix 4 ==, /=, <, <=, >=, >
infixr 3 &&
infixr 2 ||
infixl 1 >>, >>=
infixr 1 =<<
infixr 0 $, $!, `seq`
Suppose we ignore the varid operators and just consider the symbolic ops.
What I'm trying to find is a systematic way to assign fixities to all other
possible sequences of symbol characters that is consistent with what we've
already got in the Prelude.
As a first step, we could say that "mirrored" operators must share the same
precedence ie:
=<< >>=
< >
For associativity, we could assign each character an associativity weight:
-1 left associative
0 neutral
1 right associative
and say that the associativity is simply the sign of the sum of the
associativity weights of the characters thus:
> -1
= 0
< 1
=<< 0 + 1 + 1 ie infixr
Note that I don't care about non-associative ops since the non-associativity
of something should be a question for the type checker not the parser imho -
ideally we should be able to create a parse tree for any possible operator
expression.
To form the precedence, we could assign each character a precedence value
eg:
9 . !
8 ^
7 * /
6 + -
5 :
4 =
3 &
2 |
1 < >
0 $
A first attempt might be to say that the precedence is simply the decimal
expansion of the precedence values eg >>= has precedence 1.14 and $! has
precedence 0.9. However, as noted above, mirrored ops must share the same
precedence so an alternative is to create some ordering of characters such
that when the set of characters in an operator is sorted according to this
ordering, the decimal expansion of the precedence digits will give the same
relative ordering for operator precedences as the Prelude.
For example, using $ | & + - * / ^ . ! : < > = as the ordering, we'd get:
infixr 9 . !! 9 9.9
infixr 8 ^, ^^, ** 8 8.8 7.7
infixl 7 *, /, 7 7
infixl 6 +, - 6 6
infixr 5 : , ++ 5 6.6
infix 4 ==, /=, <, <=, >=, > 4.4 7.4 1 1.4 1.4 1
infixr 3 && 3.3
infixr 2 || 2.2
infixl 1 >>, >>= 1.1 1.14
infixr 1 =<< 1.14
infixr 0 $, $! 0 0.9
Although most existing ops get a similar precedence (eg ^ ^^ and ** are all
still in the same group relative to the other ops despite the fact that
within the group there is now an ordering) the main problem seems to be that
< <= >= > get precedences that bind too loosely and /= is totally wrong.
This problem aside, the above algorithm would give sensible precedences for
such things as:
:> <: 5.1
<+> <*> 6.11 6.11
where the use of <> neutralizes the associativity contribution of < or >
respectively (since (-1) + 1 == 0), giving us the intuitive associativity
we'd expect from the "interesting" character in the middle.
(The problem of /= getting 7.4 could be solved by putting / after = in the
order, to get 4.7, but unfortunately this would mean that since <> must be
before =, <> would be before / so </> would get the wrong precedence
compared to <*>)
Another issue is that there is no assignment of associativity weights such
that "*" is infixl but "**" is infixr (ditto + and ++) so perhaps we'd need
to associate each character with an associativity function. Similar to
precedences, we then define an associativity ordering and let the resulting
associativity be the sign of the composition of the sorted functions applied
to 1 eg:
^ const (-1)
* \x -> x * (-1)
= id
& const (-1)
< (+1)
> (+ (-1))
Then
* (\x -> x * (-1)) 1 === -1 ie left
** (\x -> x * (-1)) . (\x -> x * (-1)) $ 1 === +1 ie right
>>=
> > =
(+ (-1)) . (+ (-1)) . id $ 1 === -1
<*> -- remember ordering
* < >
(\x -> x * (-1)) . (+1) . (+ (-1)) $ 1
=== ((1 - 1) + 1) * (-1)
=== -1 ie left as required!!! :-)
Anyway this is as far as I've got so far trying to rationally reconstruct
the original Prelude precedences to achieve the golden aim of eliminating
the infinite problem of fixity declarations from source code... :-)
(Regarding `div` `seq` etc - I'd just assign them all the same precedence
because use of multiple varid ops in the same expression with different
precedences, or trying to combine them with symbolic ops, is just a highway
to confusion city imho. Note also that (seq x $ exp) is not only clearer but
is also one character shorter than (x `seq` exp))
The main open problem is finding an algorithm which assigns a good
precedence to >>= as well as to >= and /= and </> and <*>.... ;-)
Any ideas?
Thanks, Brian.
--
www.metamilk.com
More information about the Haskell-Cafe
mailing list