[Haskell-cafe] Explicitly calling syntactic equality on datatypes

Wed Sep 18 12:20:34 UTC 2019

> Then why not introduce a datatype which guarantees structurally that
> the value is normalised and use its Eq?

There are several reasons. The first two have to do with code clarity.

1. Adding a datatype means that any time I want to use a sum that I
know is already normalized, I have to wrap it in (and pattern match
through) the data constructor. This bothers me independently of the
equality check: one wrapper is fine, two are bearable, but once you
have 5 unavoidable wrappers on everything you want to use, pattern
matching becomes very annoying. Yes, you can make it less bad with
intermediate functions that wrap and unwrap, but this does not always
solve the problem.

2. If normalization is only used for equality (or for equality and a
few other things), then this creates an unnecessary duplication of
types. For example, say I have two sum values and I do not care
whether they have been normalized (in fact, I want them to be
normalized only when it is necessary). At some point, I need to check
their equality. With your approach, I need to build the normalized
version of each and check equality there. Of course I can then
continue using the original values, but the point is that I have had
to create the extra type and explicitly construct it just to check
for equality. I can't just use (==) on the non-normalized sum values
wherever I need it: I have to explicitly transform them into
normalized form first, and at that point the type adds nothing.

3. It reduces reusability. Say that I need to check for equality many
times, or across a collection of elements (e.g. to sort them). I can
normalize them all once at the beginning to avoid normalizing several
times. But then, because they have a different type, I cannot apply
to them the other operations that I have defined for the
non-normalized type. Sure, I can lift those operations to the
normalized type, but even if this is fairly straightforward, it
implies duplicating every function that I could ever want to use on a
pre-normalized sum (which, to be honest, could be basically all of
them). Yes, I can always do clever things like create a type class or
whatnot. But that is precisely my point: I am having to do
complicated things and write a lot of code for something that is very
straightforward: normalize (here, I'll show you how), and then check
for equality.
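
To make the three points concrete, here is a minimal sketch of the
wrapper approach. Everything in it is hypothetical and chosen only for
illustration: Sum is a list of integer terms whose order is
irrelevant, NormSum is the wrapper that promises sorted terms, and the
helper names are made up.

```haskell
import Data.Function (on)
import Data.List (sort)

-- Hypothetical sum type: a list of integer terms, order irrelevant.
-- The derived Eq is purely syntactic: Sum [1,2] /= Sum [2,1].
newtype Sum = Sum [Int] deriving (Eq, Show)

-- Hypothetical wrapper whose invariant is "the terms are sorted".
newtype NormSum = NormSum Sum deriving (Eq, Show)

normalize :: Sum -> NormSum
normalize (Sum ts) = NormSum (Sum (sort ts))

-- Point 1: matching on a normalized sum goes through both constructors.
smallestTerm :: NormSum -> Maybe Int
smallestTerm (NormSum (Sum (t : _))) = Just t
smallestTerm (NormSum (Sum []))      = Nothing

-- Point 2: equality needs an explicit detour through the wrapper type.
sameSum :: Sum -> Sum -> Bool
sameSum = (==) `on` normalize

-- Point 3: an operation written for Sum ...
addTerm :: Int -> Sum -> Sum
addTerm t (Sum ts) = Sum (t : ts)

-- ... must be lifted by hand (unwrap, apply, re-normalize) before it
-- can be used on a NormSum.
addTermNorm :: Int -> NormSum -> NormSum
addTermNorm t (NormSum s) = normalize (addTerm t s)
```

With a single wrapper this is tolerable; the complaint is about what
happens when each of these steps is repeated for many types.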

Of course, normalize on an already normalized sum is expected to be
very quick, so pre-normalizing a sum that we know will be checked for
equality many times is both convenient and easy. But in order to be
able to do that, I have to manually implement syntactic equality on
the side. Something which GHC *already knows how to do*. And as I
said, this is fine for 1 type. Even 3. But for 15 it starts to be a
concern. Of course it's not a major concern, but it's one of those
cases where it seems you could gain a lot from very little.
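
As a sketch of what "on the side" means here, assume a hypothetical
expression type Expr whose normal form is a right-nested sum of sorted
literals. Because the Eq instance is taken by the normalizing
equality, the structural comparison (synEq, a made-up name) has to be
written by hand even though it is exactly what deriving Eq would have
produced.

```haskell
import Data.List (sort)

-- Hypothetical expression type, for illustration only.
data Expr = Lit Int | Add Expr Expr deriving (Show)

-- Hand-written syntactic equality: the same thing `deriving Eq` would
-- generate, but the Eq instance is reserved for the version below.
synEq :: Expr -> Expr -> Bool
synEq (Lit a)   (Lit b)   = a == b
synEq (Add a b) (Add c d) = synEq a c && synEq b d
synEq _         _         = False

-- Collect the literals of an expression, left to right.
terms :: Expr -> [Int]
terms (Lit n)   = [n]
terms (Add a b) = terms a ++ terms b

-- Assumed normal form: sorted literals, nested to the right.
-- `terms` never returns an empty list, so foldr1 is safe here.
normalize :: Expr -> Expr
normalize e = foldr1 Add (map Lit (sort (terms e)))

-- The equality we actually want to use: normalize, then compare
-- syntactically.
instance Eq Expr where
  a == b = normalize a `synEq` normalize b
```

For one small type synEq is three lines; the cost shows up when the
same boilerplate has to be repeated for every type that needs this
treatment.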

Also of course, if efficiency and keeping normalization to the very
minimum are critical, then I would end up with a monad in which the
value may or may not have already been normalized: the first time
normalization is necessary, I store the normalized value and use it
from then on. But this is, again, too much complexity for something
that should be simple.
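
For comparison, a lighter-weight way to get "normalize at most once,
and only if needed" is to lean on laziness instead of a monad. This is
a different technique from the one described above, and all the names
(Sum, Cached, cache) are hypothetical. Note that it still introduces
an extra type, so it does not escape the earlier objections.

```haskell
import Data.List (sort)

-- Hypothetical sum type; its derived Eq is syntactic.
newtype Sum = Sum [Int] deriving (Eq, Show)

normalize :: Sum -> Sum
normalize (Sum ts) = Sum (sort ts)

-- Pairs a value with its normal form. Fields are lazy by default, so
-- the second field is a thunk: normalize runs the first time it is
-- forced, and the result is shared afterwards.
data Cached = Cached
  { original   :: Sum
  , normalForm :: Sum
  }

cache :: Sum -> Cached
cache s = Cached s (normalize s)

-- Equality compares the (lazily computed) normal forms syntactically.
instance Eq Cached where
  a == b = normalForm a == normalForm b
```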

Thanks for the reply anyway,
Juan.
