[Haskell-cafe] Advantages of using qualified names and a uniform naming convention

Henning Thielemann lemming at henning-thielemann.de
Mon Sep 4 05:22:49 EDT 2006

On Mon, 4 Sep 2006, Brian Hulley wrote:

> Firstly, I found the following advice by Henning Thielemann very useful in my
> own code:
>   http://haskell.org/hawiki/UsingQualifiedNames (bottom of the page)
>   In the style of Modula-3 I define one data type or one type
>   class per module. The module is named after the implemented
>   type or class. Then a type is named T, and a type class C.
>   I use them qualified, e.g. Music.T or Collection.C.
>   Similarly, if a type has only one constructor then I call it
>   Cons and use it qualified MidiFile.Cons [I don't agree with
>   this last suggestion].
>   This style also answers the annoying question whether the
>   module name should be in singular or plural form:
>   Always choose singular form!


... and thanks for promoting that naming style. :-)  (I don't want to call
it "my style", but the only Haskell libraries that use it and are used by
me, are those, I have written myself. :-)

> where the only thing I change is that the name of the value constructor for a
> type with only one value constructor should be the same as the name of the
> type constructor eg newtype T a = T (a->Int) which seems to be the normal
> convention anyway and seems better imho than introducing a different
> identifier for the value constructor when the namespaces for values and
> types/classes are already distinct.

I agree. I have chosen Cons, because I consider T as an abbreviation of
'Type', and 'Type' would be not a good name for a constructor. Initially I
used C, but found that this is better for type classes.

> It probably goes without saying that with the use of qualified imports,
> symbols are absolutely gross and should be avoided at all costs. They're
> totally unreadable, not just because they're a squiggly unpronouncable mess,
> but because you need to simulate an operator parser in your head to discover
> what's being applied to what. Someone could even define <+> to bind tighter
> than <*>, so it's not even safe to rely on normal conventions, and different
> modules in a program could use the same symbol in totally different ways with
> different precedences, leading to a real headache and unnecessary bugs when
> jumping between code in an editor. For readable code, plenty of descriptive
> words and parentheses are surely preferable. The only exception I'd make is
> the use of >>=, >> (which is so fundamental the alternative do notation is
> built into the language), ($), ($!), (.), and common arithmetic ops.

I also like to remind a sparingly usage of infix operators, because of
these reasons. I want to add that precedences are not only a problem for
human readers but also for tools. Imagine a source code formatting tool
which respects precedences. It shall format
instead of
 in order to highlight sub-expressions. However, this is only possible if
the tool knows the precedences imported from all modules. That is, a
module alone can be formatted correctly only if all imported modules are
known and the tool must be able to fetch this information from the other
modules (which are possibly present only in compiled form).


> Thus I propose that (null) should actually have been called "isEmpty",
> so that the relationship with the use of (empty) to denote the empty set
> is immediately apparent, and the use of the word "is" would immediately
> tell you that the function is a predicate.

I like that, too!

> It might even be advantageous to reserve more characters for use in
> identifiers (since the infamous ASCII symbols are so abhorrent anyway ;-) ) so
> we could have a similar rule to Scheme, that predicates would end with a
> question mark thus relieving us of the need to decide between "is" and "has"
> (to try and eliminate as much of the messiness and indecision caused by
> natural language as possible), though of course this would be a more long term
> idea eg:
>     -- So related things appear alphabetically together...
>    empty? :: Set a -> Bool
>    empty :: Set a

I think the separation of alpha-numeric characters and other symbols
simplifies things, and shall be preserved.

> It could be argued that it would be more in keeping with left-to-right
> thinking to put the '?' first but then we'd lose the "related things should be
> together in any alphabetical list of functions/values", though such a
> compromise is already necessary when using "isEmpty" rather than "emptyIs"
> which would perhaps just sound too unnatural ;-)

In Mathematica 'is' functions are denoted by trailing 'Q' for 'Query'. May
this be an option?

> Moving on to Data.List, we find a confusion of different spatial, temporal,
> and historical viewpoints jostling valiantly for supremacy in the programmer's
> mind:
>    head tail        -- funny cartoon-like image of a list
>    last init          -- temporal
>    foldl foldr      -- spatial (left to right)
> Despite the fact that (last, init) is somehow the dual of (head, tail), we
> have to switch from a temporal conception of a list to a historical "cartoon"
> conception to move between them. Then we come to foldl, foldr where we think
> in terms of left and right.

Concerning pairs of identifiers, I want to use the opportunity again to
    including some thoughts about Show/Read, where I prefer pairs like
Read/Write, Show/Hide, Parse/Format would be better according to my
limited knowledge of the English language.

> Therefore I propose that for a list there should only be one underlying
> concept, that of a spatial sequence going from left to right, and that all
> functions should be named in accordance with this alone ie:
>    atL atLs        -- similar to (x:xs)

I may mix atLs up with today's 'init', because it sounds like "many
elements beginning from the left".

> A discussion of naming in Data.List would certainly be incomplete without a
> mention of (nub). It must surely rank as one of the most peculiar identifiers
> in the whole history of programming. The documentation thankfully explains
> that it means "essence", but then goes on to say that it is in fact just a
> function for removing duplicates from a list. The mind can only convulse in
> the most tortuous configurations to try and reverse engineer this strange
> correspondence. Would the simple name (removeDuplicates) or even (asSet) not
> be much easier all round?

I vote for 'removeDuplicates'. The name 'nub' might be a reason for that
some Haskell programmers don't know about this useful function.

> The problem here is that these names, presumably both to do with "Car", are
> not going to appear next to each other in any alphabetical listing (if there
> are other names too), whereas:
>    type CarBlue = ...
>    type CarRed = ...
> will. Thus the position of the adjective in natural language (in this case
> English) has to be ignored if you want a programming environment to display
> related things together.

Good point. Natural languages sometimes places things in `wrong order`,
that is less important things first. Maybe you know that in German numbers
are spelled in quite a mixed order, that is hundreds-ones-tens.

> A great thing about Haskell is that it's quite simple to start renaming things
> already without having to change existing modules. In my own code I've simply
> just created new modules for IORef and Unique that re-export the original
> module contents but with the unsuffixed names, then I import from a hierarchy
> optimistically called "Prime" instead of the normal hierarchy :-)

In order to export types with other names, you have to use 'type'
assignments, and a type synonym cannot be an instance of a type class.  
That is these modules cannot replace the original ones. :-(

More information about the Haskell-Cafe mailing list