[Haskell] haskell communities worthy of academic study?

Claus Reinke claus.reinke at talk21.com
Sat Mar 31 21:28:33 EDT 2007

  Asked how she had come to choose GHC as the topic for her
  award-nominated PhD dissertation, freshly graduated doctor of
  software archeology Simone Tolduso revealed:

  "At first, there were a few small curiosities that triggered my
  interest, like why were darcs patches sent to the cvs-ghc
  mailing list, or why did GHC releases traditionally bundle
  the predecessor of the current Cabal version when the missing
  libraries depended on its successor?  

  But then I looked into the repository, with its layers on layers of
  build systems, source formats, deprecation warnings, directory
  structure fragments, todo logs, broken builds resulting either from
  OS-tools advancing and playing havoc with the built-in assumptions
  of fragile build configurations or from multiple, partially
  completed, mutually incompatible heart-liver-and-lung transplants
  supporting the newest language extensions (which of course were all
  needed to build the compiler branch supporting said features, and
  whose documentation tended to be spread over user manual, API
  comments, mailing list threads, research papers, plus half a dozen
  different Wikis and ticket trackers), supported by often outdated
  documentation in a never-ending variety of formats, and I knew I had
  stumbled onto a goldmine. 
  Not to mention remains of earlier projects (what were fptools, or
  libraries?), a variety of test and compilation languages (including
  Haskell, C, Perl, Python, alongside the usual scripting suspects),
  or the proliferation of sediment layers into user space by the
  simple, but ingenious, means of binary incompatibility. In spite of
  its comparatively small size, the project was beginning to rival the
  complexities of other Microsoft products of the same period.

  In what seems to have been an attempt to push open source ideas to
  their logical conclusion, you actually had to guess at the right
  combination of versions for a number of independently evolving
  toolchains, libraries, OSes, and use those to bootstrap from a
  consistent snapshot of the compiler, library, and sometimes even
  tool sources, or nothing works - a situation which was later
  increasingly exacerbated by the dispersion of the Haskell Cabal
  replacing coordinated releases. Preliminary mining of the relevant
  mailing list and bug tracker archives suggests that binary releases
  were mainly public data points used to indicate intermediate states
  of GHC _not_ suitable for specific applications (apart from the
  obligatory Cabal pre-version lacking the new features needed for
  installing the extra libraries, other examples include versions of
  Data.ByteString _not_ based on the famous paper, _not_ supporting
  essential optimisations, or _not_ supporting API safety fixes). So
  there seemed to be no way to avoid direct access to the source
  repositories with their associated build processes and toolchains.

  And let us not forget that, unlike the programmers at the time, we
  are in the fortunate situation of already having complete
  repositories for the pieces and dependencies involved.  Finding
  matching versions is a non-trivial, but essentially combinatorial
  exercise, while for them, the process of building GHC would often
  have involved developing and submitting the patches that make up our
  repositories of all the pieces of software GHC builds depended on.
  We still haven't found the key that enabled the ancients to navigate
  this labyrinth and to keep their toolchains up to date while still
  making any progress in their daily work, not to mention recording
  such progress via darcs (in itself written in Haskell, and not free
  of troubles). Agent-based simulations of developer communities at
  arbitrary slices through the repositories show the majority of
  agents getting stuck in a recursive cycle of installing, debugging,
  and updating dependency chains without ever reaching a productive
  state, so we do know that we are missing some crucial information.

  Several of my correspondents have come to favour the somewhat
  controversial theory that the general programmer in those days
  must have been substantially more intelligent than people are
  today. And it does make sense, in a way - I mean, if anyone had
  been the slightest bit bothered by all this complexity, surely
  someone would have tried to simplify things?

  Of course, my work has not all been happy progress: for instance,
  while there really was an 'evil mangler', the equally persistent
  rumour that GHC was named after some Scottish town has turned out
  to be a wild goose chase (cf Appendix GC); my colleagues in dirt
  archeology assure me there was no town called 'glorious'. The
  'real' archeologists, as they call themselves, had a field day
  laughing about my gullibility there.  Still, there are so many
  buried treasures in this area - just waiting to be investigated."

  Dr Tolduso is currently working on a follow-on project, "Haskell
  by committee - design and syntax through the ages".

  Dept. of Software Archeology, University of New Atlantis
  (for immediate release)
