[Haskell-cafe] Re: group-by (Was: Nested guards?)
anthony_clayden at clear.net.nz
Tue Dec 11 00:18:02 EST 2007
Henning Thielemann <lemming <at> henning-thielemann.de> writes:
> On Fri, 7 Dec 2007, Simon Peyton-Jones wrote:
> > | And I think that the solution is not to make the language larger and
> > | everytime someone wants a feature but to give people the tools to provide
> > | features without language changes.
> > Of course that would be even better! (Provided of course the resulting
> > programs were comprehensible.) Haskell is already pretty good in this
> > respect, thanks to type classes, higher order functions, and laziness;
> > that's why it's so good at embedded domain-specific languages.
> When I learned about GROUP BY and HAVING in SQL with its rules about what
> is allowed in GROUP BY and SELECT I considered GROUP BY a terrible hack,
> that was just introduced because the SQL people didn't want to allow types
> different from TABLE, namely lists of tables. I try to convince my data
> base colleagues that GROUP BY can nicely be handled in relational algebra
> by allowing sets of sets and that this is a fine combinatorial approach. I
I agree with Henning that HAVING is a 'terrible hack', but then SQL altogether
is a terrible hack. I would expect the Haskell approach to be based on the
much sounder theoretical principles of Relational Algebra, and I applaud that
Wadler+SPJ's 'Comprehensive Comprehensions' restricts itself to a subset of
SQL that corresponds to Relational Algebra. In that context, GROUP BY is
reasonably well-defined as a mapping from a table to a table. (The hack in SQL
vintage 1975 is in trying to squeeze GROUP BY into the structure of SELECT ...
FROM ... WHERE ...; the mess now can be blamed on trying to preserve backwards
compatibility.)
As that paper points out, HAVING is unnecessary - it's just a filter on the
result set of group-by. And relational theorists agree that HAVING is
unnecessary (see for example 'The Importance of Column Names', Hugh Darwen
2003 from www.thethirdmanifesto.com).
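To make the "just a filter" point concrete, here is a minimal sketch in plain
Haskell lists. It is not the syntax from the Comprehensive Comprehensions
paper; the Row type, the staff table and the names groupSum and
havingTotalOver are all made up for illustration.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- Hypothetical row type for illustration: (department, salary).
type Row = (String, Int)

-- GROUP BY department with SUM(salary): a table goes in, a table comes out.
groupSum :: [Row] -> [Row]
groupSum rows =
  [ (dept, sum (map snd grp))
  | grp@((dept, _) : _) <- groupBy ((==) `on` fst) (sortOn fst rows) ]

-- SELECT dept, SUM(salary) ... GROUP BY dept HAVING SUM(salary) > limit
-- written as an ordinary filter over the grouped table.
havingTotalOver :: Int -> [Row] -> [Row]
havingTotalOver limit = filter ((> limit) . snd) . groupSum

-- Hypothetical sample data.
staff :: [Row]
staff = [("ops", 10), ("dev", 30), ("ops", 15), ("dev", 25), ("hr", 5)]
```

Here havingTotalOver 20 staff keeps only the departments whose total exceeds
20. No separate HAVING construct is needed, because the grouped result is
itself just another table.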
It's crucial that in Relational Algebra everything is a table. (See Codd's 12
rules). The result of GROUP BY we might want to pass to another GROUP BY, or
JOIN to another table, etc -- or does Henning propose a hierarchy of sets of
sets ... of tables, presumably with a hierarchy of HAVING HAVING's?
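A rough sketch of why the table-to-table property matters, again in plain
Haskell lists rather than any proposed syntax. The groupByAgg helper, the
sales and managers tables and all field names are invented for this example.
Because each GROUP BY returns a flat table, its result can go straight into
another GROUP BY or a JOIN.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- Hypothetical generic GROUP BY: a key function plus an aggregate applied
-- to each group. The result is again a flat table of (key, aggregate) rows.
groupByAgg :: Ord k => (r -> k) -> ([r] -> a) -> [r] -> [(k, a)]
groupByAgg key agg rows =
  [ (key r, agg grp)
  | grp@(r : _) <- groupBy ((==) `on` key) (sortOn key rows) ]

-- Hypothetical sales table: (region, city, amount).
sales :: [(String, String, Int)]
sales =
  [ ("north", "leeds", 10), ("north", "york", 20), ("north", "leeds", 5)
  , ("south", "bath", 7) ]

-- First GROUP BY: total amount per (region, city).
cityTotals :: [((String, String), Int)]
cityTotals =
  groupByAgg (\(r, c, _) -> (r, c)) (sum . map (\(_, _, a) -> a)) sales

-- Second GROUP BY over the first result: largest city total per region.
regionMax :: [(String, Int)]
regionMax = groupByAgg (fst . fst) (maximum . map snd) cityTotals

-- Hypothetical second table for a join: (region, manager).
managers :: [(String, String)]
managers = [("north", "ann"), ("south", "bob")]

-- JOIN the grouped result to another table with a plain comprehension.
report :: [(String, String, Int)]
report = [ (r, m, t) | (r, t) <- regionMax, (r', m) <- managers, r == r' ]
```

With sets of sets as the result type, the second grouping and the join would
each need their own nested variant. With flat tables the same operators
apply at every step.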