[Haskell-beginners] hGetContents and modules architecture idea

Rein Henrichs rein.henrichs at gmail.com
Mon May 8 20:24:32 UTC 2017


Please allow me to provide a little more context to David's fine answer:

baa dg <aquagnu at gmail.com> writes:
> So, is hGetContents legacy? My points were: 1) hGetContents IMHO
> should be in I/O related modules, not in "data definitions" modules.
> Because when somebody creates new modules, defining some new data
> types, he should not think about I/O of these types. Even more, he
> doesn't know what kind of I/O and I/O functions he must "inject"
> there. Is it not a design error in Haskell?

This is an instance of the expression problem[1]. We have a lot of IO
actions that read a Handle and return its contents as represented by
some Haskell data type. The question, then, is where these actions
should live. The three most promising options for organizing them are:

1. Put them all together in a module for doing I/O.
2. Put them together with the data type they return.
3. Factor out common functionality — such as reading a handle — and only
   implement what's different, e.g., coercing a stream of bytes
   (possibly interpreted via some encoding) into a given data type.

The first option is impossible, or at least impractical, because the
collection of data types is large and fully extensible. This module
would need to depend on every package that provides data types, both
those currently existing and any developed in the future. The
maintenance burden on this module would be immense, as would be the
amount of friction introduced into the development of new data types.

The second option is the one we currently use for hGetContents, and it's
the one you dislike (for entirely legitimate reasons). But it's not the
only option provided by the language, or even the only option available
to you today! So I wouldn't say it's a design error of the language per
se (as the expression problem applies to all languages), just a
less-than-ideal choice made early on in the growth of the Haskell
ecosystem.
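
To make the second option concrete, here is a small sketch of how it
looks today: each representation's module exports its own
hGetContents, so the same name appears once per data type. The wrapper
names (readString, readStrict, readLazy) are mine, purely for
illustration; the three hGetContents functions are the real ones from
base and bytestring.

```haskell
module Main where

import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import System.IO (Handle, stdin)
import qualified System.IO

-- String version, exported by System.IO (base).
readString :: Handle -> IO String
readString = System.IO.hGetContents

-- Strict ByteString version, exported next to the strict ByteString type.
readStrict :: Handle -> IO BS.ByteString
readStrict = BS.hGetContents

-- Lazy ByteString version, exported next to the lazy ByteString type.
readLazy :: Handle -> IO BL.ByteString
readLazy = BL.hGetContents

-- Reads all of standard input strictly and prints how many bytes arrived.
main :: IO ()
main = do
  bytes <- readStrict stdin
  print (BS.length bytes)
```

Qualified imports are what keep the three identically named functions
apart at the use site.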

The third option solves the expression problem with type classes[2] and
other methods of abstraction, of which David has already shared a number
of examples. Factoring out the common I/O behavior reduces duplication
and increases modularity. We're still left with an expression problem
(where do the implementations of the type class instances live?), but
it's a more tractable one, and solving it provides more value.
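
As a rough sketch of what the third option can look like (this is not
an existing library API): the class FromBytes, the function hGetAs and
the newtypes Lines and ByteCount are all hypothetical names I made up
for illustration. Reading the handle is written once, and each data
type only says how to build itself from the bytes.

```haskell
module Main where

import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import System.IO (Handle, stdin)

-- Hypothetical class: the only per-type work is converting raw bytes.
class FromBytes a where
  fromBytes :: BS.ByteString -> a

-- The bytes themselves need no conversion.
instance FromBytes BS.ByteString where
  fromBytes = id

-- Hypothetical type: the input split at newline bytes.
newtype Lines = Lines [BS.ByteString]
  deriving (Eq, Show)

instance FromBytes Lines where
  fromBytes = Lines . BC.lines

-- Hypothetical type: just the size of the input in bytes.
newtype ByteCount = ByteCount Int
  deriving (Eq, Show)

instance FromBytes ByteCount where
  fromBytes = ByteCount . BS.length

-- The shared part, written once: read the whole handle strictly,
-- then let the instance chosen by the result type do the conversion.
hGetAs :: FromBytes a => Handle -> IO a
hGetAs h = fromBytes <$> BS.hGetContents h

-- Reads standard input as Lines and prints the number of lines.
main :: IO ()
main = do
  Lines ls <- hGetAs stdin
  print (length ls)
```

In this sketch the instances sit next to the class only because
everything is in one file. In a real package you would still have to
decide whether an instance lives with the class or with the data type.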

—
Rein Henrichs

[1] https://en.wikipedia.org/wiki/Expression_problem
[2] https://userpages.uni-koblenz.de/~laemmel/TheEagle/resources/pdf/xproblem1.pdf

