[Haskell-cafe] Re: Allowing hyphens in identifiers

Thu Dec 17 22:39:21 EST 2009

On Dec 17, 2009, at 4:45 AM, Ben Millwood wrote:
> By the way, I like camelCase because I think that in most cases you
> *don't* want to break identifiers up into their component words

My experience has been that in order to make sense of someone else's
code you *HAVE* to break identifiers into their component words.
With names like (real example) ScatterColorPresetEditor, the eye
*can't* take it in at once, and telling the difference between that
and ScatterColorPresentEditor would be a pain.  Break them up
Ada-style as Scatter_Color_Preset_Editor and
Scatter_Color_Present_Editor and you're away laughing.
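
The break-up is mechanical, for what it's worth.  A minimal Python
sketch, assuming word boundaries fall wherever a lower-case letter
or digit is followed by a capital (ada_style is a made-up name, and
runs of capitals such as acronyms are left alone):

```python
import re

# Hypothetical helper, not from any tool mentioned in this thread.
# Assumption: a word boundary is any place where a lower-case letter
# or digit is immediately followed by an upper-case letter.
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")

def ada_style(name):
    """Insert an underscore at each assumed word boundary."""
    return _BOUNDARY.sub("_", name)

print(ada_style("ScatterColorPresetEditor"))
print(ada_style("ScatterColorPresentEditor"))
```

Names that already contain underscores pass through with their
existing underscores intact.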

Here's another *real* baStudlyCaps identifier taken from a real
(and useful) program someone else wrote:
WEKA_AttributeSelectionEvaluationRanker_SymmetricalUncert

> - you
> read and understand what the function does once, and then you use it
> as a word in its own right.

I defy anyone to recognise
WEKA_AttributeSelectionEvaluationRanker_SymmetricalUncert
as a word in its own right.

> Any resemblance to actual English is
> really just a mnemonic of sorts.

Take a look at another typical example of the genre:
TransformerFactoryConfigurationError
"transformer factory configuration error" -- the resemblance
to actual English is more than accidental.

A sample of a little over 2000 Java class names had this
length distribution (characters in the name: number of names).

  1: 10
  2: 9
  3: 35
  4: 55
  5: 53
  6: 72
  7: 97
  8: 114
  9: 127
 10: 158
 11: 116
 12: 145
 13: 150
 14: 131
 15: 120
 16: 109
 17: 96
 18: 84
 19: 56
 20: 57
 21: 53
 22: 53
 23: 37
 24: 21
 25: 29
 26: 8
 27: 9
 28: 5
 29: 5
 31: 2
 33: 3
 35: 1
 36: 3
 37: 2
 39: 2
 44: 2
 47: 1
 48: 1
 49: 1
 50: 1
 57: 1
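
A tally like this takes only a few lines.  The following Python
sketch is an illustration, not the script that produced the figures
above; length_distribution is a made-up name and the two input
names are simply ones quoted in this message.

```python
from collections import Counter

def length_distribution(names):
    """Map each name length in characters to how many names have it."""
    return Counter(len(name) for name in names)

# Illustrative input only, not the 2000-name sample.
sample = ["ScatterColorPresetEditor", "TransformerFactoryConfigurationError"]

# Print in the same "length: count" layout as the table.
for length, count in sorted(length_distribution(sample).items()):
    print(f"{length:3d}: {count}")
```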

Looking at the number of words in each of these "phrases",
the distribution (words: number of names, share of the sample) is
  1: 334  16.5%
  2: 893  44.1%
  3: 578  28.6%
  4: 161   7.9%
  5:  43   2.1%
  6:  17   0.8%
  7:   8   0.4%
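
How the words get counted depends on the splitting rule.  The
Python sketch below assumes one plausible rule, which is not
necessarily the one behind the figures above: a run of capitals
not followed by a lower-case letter is one word (an acronym),
otherwise a word is an optional capital plus lower-case letters,
and underscores merely separate.  split_words and
word_count_shares are made-up names, and the input list is
illustrative.

```python
import re
from collections import Counter

# Assumed splitting rule; see the paragraph above.
# Alternatives, in order: acronym, ordinary word, digit run.
_WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")

def split_words(name):
    """Split an identifier into its component words."""
    return _WORD.findall(name)

def word_count_shares(names):
    """Map word count to (number of names, percentage of all names)."""
    counts = Counter(len(split_words(n)) for n in names)
    total = sum(counts.values())
    return {k: (c, 100.0 * c / total) for k, c in sorted(counts.items())}

# Illustrative input only, not the 2000-name sample.
names = [
    "ScatterColorPresetEditor",
    "TransformerFactoryConfigurationError",
    "XMLReader",
    "String",
]
for words, (count, share) in word_count_shares(names).items():
    print(f"{words:3d}: {count:4d} {share:5.1f}%")
```

A different rule (say, treating each capital in an acronym as
its own word) would shift the counts, so figures from two tools
are comparable only if the rules match.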

Now on a sample of English text (the WSJ collection),
the commonest word length was 3 letters and the longest
was 26.  These Java class names are clearly quite unlike
the words we're used to reading (which would count against
my ideas about readability except that they _are_ quite
like English _phrases_).

I don't have similar figures handy for Haskell.  You thought
myExamplesLikeThisAreMadeUp?  Nope, things that long and with
that many words in the phrase are _not_ rare in seriously
baStudly code.

If you can see a 57-character 7-word class name as a single
"word", you have very different eyes from me.