[Haskell-cafe] Re: group-by (Was: Nested guards?)

Anthony Clayden anthony_clayden at clear.net.nz
Tue Dec 11 00:18:02 EST 2007


Henning Thielemann <lemming <at> henning-thielemann.de> writes:

> 
> 
> On Fri, 7 Dec 2007, Simon Peyton-Jones wrote:
> 
> > | And I think that the solution is not to make the language larger and
> > | larger
> > | everytime someone wants a feature but to give people the tools to provide
> > | features without language changes.
> >
> > Of course that would be even better!  (Provided of course the resulting
> > programs were comprehensible.)  Haskell is already pretty good in this
> > respect, thanks to type classes, higher order functions, and laziness;
> > that's why it's so good at embedded domain-specific languages.
> 
> When I learned about GROUP BY and HAVING in SQL with its rules about what
> is allowed in GROUP BY and SELECT I considered GROUP BY a terrible hack,
> that was just introduced because the SQL people didn't want to allow types
> different from TABLE, namely lists of tables. I try to convince my data
> base colleagues that GROUP BY can nicely be handled in relational algebra
> by allowing sets of sets and that this is a fine combinatorial approach. I
> [snip]

I agree with Henning that HAVING is a 'terrible hack', but then SQL altogether 
is a terrible hack. I would expect the Haskell approach to be based on the 
much sounder theoretical principles of Relational Algebra, and I applaud that 
Wadler+SPJ's 'Comprehensive Comprehensions' restricts itself to a subset of 
SQL that corresponds to Relational Algebra. In that context, GROUP BY is 
reasonably well-defined as a mapping from a table to a table. (The hack in SQL 
vintage 1975 is in trying to squeeze GROUP BY into the structure of SELECT ...
FROM ... WHERE ...; the mess now can be blamed on trying to preserve backwards
compatibility.)

As that paper points out, HAVING is unnecessary - it's just a filter on the 
result set of group-by. And relational theorists agree that HAVING is
unnecessary (see for example 'The Importance of Column Names', Hugh Darwen
2003 from www.thethirdmanifesto.com).
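
To make the point concrete, here is a minimal sketch in plain Haskell lists. It
does not use the syntax proposed in the paper. The Sale type, the sample rows
and the names totalByRegion and bigRegions are my own illustrative assumptions.
A table is modelled as a list of rows, GROUP BY is a function from a table to a
table, and the HAVING clause turns out to be an ordinary filter over the
grouped result.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- A "table" is modelled as a list of rows (an assumption of this sketch).
type Table row = [row]

-- Hypothetical row type, for illustration only.
data Sale = Sale { region :: String, amount :: Int }
  deriving (Eq, Show)

-- GROUP BY region with SUM(amount): a mapping from a table to a table.
totalByRegion :: Table Sale -> Table (String, Int)
totalByRegion sales =
  [ (region s, sum (map amount g))
  | g@(s : _) <- groupBy ((==) `on` region) (sortOn region sales) ]

-- SELECT region, SUM(amount) ... GROUP BY region HAVING SUM(amount) > 100
-- needs no extra construct: it is a plain filter over the grouped table.
bigRegions :: Table Sale -> Table (String, Int)
bigRegions sales =
  [ row | row@(_, total) <- totalByRegion sales, total > 100 ]

-- Made-up sample data.
sampleSales :: Table Sale
sampleSales =
  [ Sale "north" 80, Sale "south" 30, Sale "north" 50, Sale "east" 120 ]
```

With the sample rows, bigRegions sampleSales keeps only the "east" and "north"
groups, whose totals are 120 and 130.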

It's crucial that in Relational Algebra everything is a table. (See Codd's 12 
rules). The result of GROUP BY we might want to pass to another GROUP BY, or 
JOIN to another table, etc -- or does Henning propose a hierarchy of sets of 
sets ... of tables, presumably with a hierarchy of HAVINGHAVING's?
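
A rough sketch of why closure matters, again in plain Haskell lists rather than
the paper's syntax. The helper groupAgg, the orders table and its column layout
are hypothetical, invented for this illustration. Because the result of one
grouping is itself a flat table, it can be fed straight into a second grouping
with the same operator, with no special type for nested sets.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- A "table" is modelled as a list of rows (an assumption of this sketch).
type Table row = [row]

-- Generic grouping: a key function plus an aggregate over each group.
-- Table in, table out.
groupAgg :: Ord k => (row -> k) -> ([row] -> v) -> Table row -> Table (k, v)
groupAgg key agg rows =
  [ (key r, agg g)
  | g@(r : _) <- groupBy ((==) `on` key) (sortOn key rows) ]

-- Made-up rows of (city, customer, amount).
orders :: Table (String, String, Int)
orders =
  [ ("Oslo", "ann", 10), ("Oslo", "ann", 5), ("Oslo", "bob", 7)
  , ("Rome", "cat", 3) ]

-- First GROUP BY: total amount per (city, customer).
perCustomer :: Table ((String, String), Int)
perCustomer =
  groupAgg (\(c, u, _) -> (c, u)) (sum . map (\(_, _, a) -> a)) orders

-- Second GROUP BY, applied to the result of the first: per city, the
-- number of customers and the largest per-customer total.
perCity :: Table (String, (Int, Int))
perCity =
  groupAgg (fst . fst) (\g -> (length g, maximum (map snd g))) perCustomer
```

Here perCity evaluates to [("Oslo",(2,15)),("Rome",(1,3))]. A join against
another table would consume perCustomer or perCity in the same way, since both
are ordinary tables.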
