[Haskell-cafe] Re: File path programme
robert dockins
robdockins at fastmail.fm
Mon Jan 31 09:53:43 EST 2005
I have been ruminating on the various responses my attempted file path
implementation has generated. I have a design beginning to form in the
back of my head which attempts to address the file path problem as I lay
out below. Before I develop it any further, are there any important
considerations I am missing?
Here is my conception of the file name problem:
1) File names are abstract entities. There are a number of ways one
might concretely represent a filename. Among these ways are:
a) A contiguous sequence of octets in memory
(C style string on most modern hardware)
b) A sequence of Unicode codepoints
(Haskell style string)
c) Algebraic datatypes supporting path manipulations
(yet to be developed)
2) We would like these three representations to be isomorphic.
Unfortunately, this cannot be. In particular, there are major issues
with the translations between the (a) and (b) forms given above. One
could imagine that translation issues involving the (c) form are also
possible.
3) Translations between (a) and (b) must be parameterized by a character
encoding. Translations to and from (c) will require some manner of
description of the path syntax, which differs by OS.
4) In practice, the vast majority of file paths are portable between the
various forms; the forms are "nearly" isomorphic, with corner cases
being fairly rare.
5) Translations between the various forms cost compute cycles and
memory, and are not necessarily bijective. Therefore, translations
should occur _only_ if absolutely necessary. In particular, if a file
name passes through a program as a black box (it is not examined or
manipulated) it should undergo no transformation.
6) Different OSes handle file names differently. These differences
should be accounted for, transparently where possible. These
differences, however, should be exposed to developers for whom the
differences matter.
7) Using simple file names should be easy. We don't want developers to
have to worry too much about character encodings, path separators, and
generally bizarre path syntax just to open files. The complexities of
correct file name handling should be hidden from the casual programmer.
However, developers interested in serious
portability/internationalization should be able to get down into the
muck if they need to.
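
To make points 1 through 5 concrete, here is a rough sketch of how the three representations and their translations might look. It is only an illustration under my own assumptions, not a proposed API: every name in it (RawPath, Path, Encoding, Syntax, latin1, posix, parsePath, renderPath) is hypothetical, and the Latin-1 codec and '/' separator are just the simplest stand-ins for "a character encoding" and "a description of the path syntax".

```haskell
import Data.Char (chr, ord)
import Data.List (intercalate)
import Data.Word (Word8)

-- (a) A file name as raw octets (hypothetical type).
newtype RawPath = RawPath [Word8]
  deriving (Eq, Show)

-- (c) A structured path supporting manipulation (hypothetical type).
data Path = Path
  { isAbsolute :: Bool
  , components :: [String]
  } deriving (Eq, Show)

-- Translations between (a) and (b) are parameterized by an encoding.
-- Both directions may fail, so neither is total.
data Encoding = Encoding
  { decode :: [Word8] -> Maybe String
  , encode :: String -> Maybe [Word8]
  }

-- Latin-1: every octet decodes, but only codepoints below 256 encode.
latin1 :: Encoding
latin1 = Encoding
  { decode = Just . map (chr . fromIntegral)
  , encode = traverse toOctet
  }
  where
    toOctet c
      | ord c < 256 = Just (fromIntegral (ord c))
      | otherwise   = Nothing

-- Translations to and from (c) are parameterized by a path syntax.
newtype Syntax = Syntax { separator :: Char }

posix :: Syntax
posix = Syntax '/'

-- Split a string on every occurrence of the separator.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- (b) -> (c): empty components are dropped, so information is lost.
parsePath :: Syntax -> String -> Path
parsePath syn s = Path
  { isAbsolute = take 1 s == [separator syn]
  , components = filter (not . null) (splitOn (separator syn) s)
  }

-- (c) -> (b)
renderPath :: Syntax -> Path -> String
renderPath syn (Path absolute cs) =
  (if absolute then [separator syn] else "") ++ intercalate [separator syn] cs
```

Loaded into GHCi, renderPath posix (parsePath posix "/usr//lib/") gives "/usr/lib" rather than the original string, and encode latin1 "\955" gives Nothing. Both are small instances of the corner cases in points 2 and 5: the round trip is not the identity, which is why a name that merely passes through a program should stay in whatever form it arrived in.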