Combinators (Was Re: Application letters at the Haskell workshop: suggestion)

Frank Atanassow franka@cs.uu.nl
Mon, 17 Sep 2001 15:42:15 +0200


Eray Ozkural wrote (on 16-09-01 17:44 +0300):
> On Sunday 16 September 2001 04:30 pm, Frank Atanassow wrote:
> > A bit off-topic, but after some experience using combinator parsers in
> > Haskell (not just Parsec) where the lexical and syntactical bits were done
> > in the same grammar, I concluded that the traditional separation between
> > the two, a la Lex and Yacc, does indeed have its merits for just this
> > reason: by stratifying the grammar you introduce an abstraction boundary
> > which, I think, agrees better with the way programmers, at least, have
> > learned to reason about syntax. 
> 
> [snip]
> 
> I don't think so. Trying to separate those into distinct functions is messy. 

It may well be messy for natural language parsing, about which I know little,
but it works well for formal languages, which are after all usually designed
with a 2-level stratification in mind. If language designers start ignoring
that convention, then stratifying a parser might well become less attractive,
but error-reporting would become more difficult too.

> Using combinators is elegant, and if you've read any papers about them you 
> should have seen that you don't need to emit any "abstract syntax trees" or 
> such. The more complex the language is, the more benefits there are in a 
> unified formalism. In NLP, people have shown excellent uses of such 
> formalisms. There are works that span all the way from phonology to semantics.

I think combinators are elegant too. I'm not saying that we should go back to
using Lex and Yacc; I'm just saying that there is an advantage in stratifying
your parsers, whether you use combinators or conventional tools. The other
two major merits I mentioned, namely a unified formalism and modularity, are
things I don't want to lose either. But you don't have to lose them: it's as
easy to write a 2-level parser with combinators as it is with Lex/Yacc. You
just write a lexer and a syntax parser, and compose them as functions, as the
sketch below shows.
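For concreteness, here is a minimal sketch of what I mean. The toy grammar
and the hand-rolled token-level combinators are my own invention for this
message (not Parsec); names like lexer, expr and parse are hypothetical.

  import Data.Char (isDigit, isSpace)

  -- Tokens produced by the lexical level.
  data Token = TNum Int | TPlus deriving (Eq, Show)

  -- Level 1: the lexer, an ordinary function from Chars to Tokens.
  lexer :: String -> [Token]
  lexer []      = []
  lexer (c:cs)
    | isSpace c = lexer cs
    | c == '+'  = TPlus : lexer cs
    | isDigit c = let (ds, rest) = span isDigit (c:cs)
                  in  TNum (read ds) : lexer rest
    | otherwise = error ("lexical error at " ++ show c)

  -- Level 2: token-level combinators for the toy grammar
  --   expr ::= num ('+' num)*
  type P a = [Token] -> Maybe (a, [Token])

  num :: P Int
  num (TNum n : ts) = Just (n, ts)
  num _             = Nothing

  plusNum :: P Int
  plusNum (TPlus : ts) = num ts
  plusNum _            = Nothing

  expr :: P Int
  expr ts = do (n, ts') <- num ts
               go n ts'
    where
      go acc ts1 = case plusNum ts1 of
                     Just (m, ts2) -> go (acc + m) ts2
                     Nothing       -> Just (acc, ts1)

  -- The two levels compose as plain functions.
  parse :: String -> Maybe (Int, [Token])
  parse = expr . lexer

  main :: IO ()
  main = print (parse "1 + 2 + 3")   -- Just (6,[])

The lexer never consults the grammar, and the token parser never sees
whitespace or character classes; the abstraction boundary between the two
levels is nothing more than function composition.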

> About context-sensitive grammars, I don't have code that would show how it
> is done for the kind of "context-sensitivity" found in programming
> languages, but perhaps somebody could make a small demonstration?

Sorry, I don't understand this sentence.

By "context-sensitive", I mean analyses such as type inference and strictness
analysis which depend on information like the set of free variables in a term,
or the principal type of a term, which are not context-free properties.
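For example, the free variables of a term depend on every binder enclosing
it, which is exactly the kind of information a context-free production cannot
see. A small illustration (the Term datatype here is hypothetical, chosen
only for this example):

  import Data.List (nub, (\\))

  data Term = Var String
            | Lam String Term
            | App Term Term

  -- Free variables: a Lam removes its binder from the free set of its
  -- body, so the answer for a subterm depends on its enclosing context.
  freeVars :: Term -> [String]
  freeVars (Var x)   = [x]
  freeVars (Lam x t) = freeVars t \\ [x]
  freeVars (App f a) = nub (freeVars f ++ freeVars a)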

> I won't avoid advertising some of my personal opinion. You see, trying to 
> separate syntax and semantics in a compiler is only blind devotion to 
> Chomsky's nonsense preachings in linguistics. There is no such thing in the 
> world. There is syntax only if there is semantics. I recall an account of how 
> gcc people had to go back and forth between syntax and semantics. Too many 
> kludges to get it right. The truth is that:
>   a) semantics is as formal as syntax
>   b) syntax in most cases is a shadow of semantics
> Therefore, formalizing both at once is reasonable. Of course, this is the 
> naive explanation and the whole debate goes a lot further but I guess it is 
> enough for an already off-topic discussion.

Since I haven't read Chomsky, it would be hard to prove my blind devotion to
his teachings. ;)

Frankly, I don't see how the syntax vs. semantics question quite fits in here,
but for the record I subscribe to the algebraic school of syntax and
semantics, particularly Lawvere's view, so I also acknowledge a close
relationship between the two. But let's avoid philosophical questions, shall we?

-- 
Frank Atanassow, Information & Computing Sciences, Utrecht University
Padualaan 14, PO Box 80.089, 3508 TB Utrecht, Netherlands
Tel +31 (030) 253-3261 Fax +31 (030) 251-379