[Haskell-cafe] Diving into the records swamp (possible GSoC project)
AntC
anthony_clayden at clear.net.nz
Sat Apr 27 12:23:46 CEST 2013
> Johan Tibell <johan.tibell <at> gmail.com> writes:
>
> Instead of endorsing one of the listed proposals directly, I will
> emphasize the problem, so we don't lose sight of it. The problem people
> run into *in practice* and complain about in blog posts, on Google+, or
> privately when we chat about Haskell over beer, is that they would like to
> write a record definition like this one:
>
> data Employee = Employee { id :: Int, name :: String }
>
> printId :: Employee -> IO ()
> printId emp = print $ id emp
>
> but since that doesn't work well in Haskell today due to name
> collisions, ...
[I've a bit more to say on that record definition below.]
Thank you Johan, I agree we should keep clear sight of the problem. So
let's be a bit more precise: it's not exactly the record declaration that
causes the name collisions, it's the field selector function that gets
created automatically. (Note that we can use -XDisambiguateRecordFields to
access fields to, errm, disambiguate.)
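To make the point about the selector function concrete, here is a minimal sketch using Johan's declaration as-is. The only addition (an assumption on my part, to get it to compile in a single module) is hiding Prelude's `id'; the value passed in main is made up for illustration.

```haskell
module Main where

-- Hide Prelude's id; otherwise every use of `id` below would be
-- ambiguous between Prelude.id and the generated selector.
import Prelude hiding (id)

data Employee = Employee { id :: Int, name :: String }

-- The declaration above implicitly defines ordinary top-level functions:
--   id   :: Employee -> Int
--   name :: Employee -> String
-- It is these functions, not the record type itself, that collide with
-- other names in scope.

printId :: Employee -> IO ()
printId emp = print $ id emp

main :: IO ()
main = printId (Employee { id = 42, name = "Alice" })  -- prints 42
```

A second record type in the same module with its own `id' field would be rejected, because it would try to define a second top-level `id'.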
So I did put in a separate proposal [3] (and ticket) on that very narrow
issue. (Simon M pointed out that I probably didn't name it very well!)
Even if we do nothing to advance the "records swamp", PLEASE can we
provide a compiler option to suppress that function.
I envisage it might facilitate a 'cottage industry' of Template Haskell
solutions (generating Has instances), which would be a cheap and cheerful
way to experiment in the design space.
[3] http://hackage.haskell.org/trac/ghc/wiki/Records/DeclaredOverloadedRecordFields/NoMonoRecordFields
(There are bound to be some fishhooks, especially around export/import of
names from a module with no selector functions to one that's expecting
them.)
[cont from above]
> ... the best practice today is to instead write something like:
>
> data Employee = Employee { employeeId :: Int, employeeName :: String }
>
> printId :: Employee -> IO ()
> printId emp = print $ employeeId emp
>
> The downsides of the latter have been discussed elsewhere, but briefly
> they are:
>
> * Overly verbose when there's no ambiguity.
> * Ad-hoc prefix is hard to predict (i.e. sometimes abbreviations of the
>   data type name are used).
I don't entirely agree with your analysis.
* fields named `id' or `name' are very likely to clash,
so that's a bad design (_too_ generic).
* If you've normalised your data model [**],
you are very likely to want exactly the same field
in several records
(for example employeeId in EmployeeNameAddress,
and in EmployeePay and in EmployeeTimeSheet.)
[And this use case is what TP/DORF is primarily aimed at.]
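A rough, hand-written sketch of that use case, in the style of the Has instances mentioned earlier. This is not the DORF/TPDORF design itself: the class HasEmployeeId and the constructor layouts are hypothetical, invented for illustration. Positional constructors stand in for record declarations whose selector functions have been suppressed, since today two record types in one module cannot both declare an employeeId field.

```haskell
module Main where

-- Positional constructors are used on purpose: with record syntax, two
-- types declaring an `employeeId` field in one module would be rejected.
data EmployeeNameAddress = EmployeeNameAddress Int String String
data EmployeePay         = EmployeePay Int Double

-- Hypothetical class, one per shared field; a Template Haskell solution
-- might generate something of this shape.
class HasEmployeeId r where
  employeeId :: r -> Int

instance HasEmployeeId EmployeeNameAddress where
  employeeId (EmployeeNameAddress i _ _) = i

instance HasEmployeeId EmployeePay where
  employeeId (EmployeePay i _) = i

-- Works for any record type that has an employeeId.
printId :: HasEmployeeId r => r -> IO ()
printId = print . employeeId

main :: IO ()
main = do
  printId (EmployeeNameAddress 7 "Alice" "1 Main St")
  printId (EmployeePay 7 1000.0)
```

Here the one name employeeId serves both record types, and printId needs no prefix that depends on the type.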
[**] Do I need to explain what data model normalisation is? I fear that
so-called XML 'databases' mean academics don't get taught normalisation any
more(?)
AntC