[Haskell-cafe] Re: File path programme

robert dockins robdockins at fastmail.fm
Mon Jan 31 09:53:43 EST 2005


I have been ruminating on the various responses my attempted file path 
implementation has generated.  A design is beginning to form in the 
back of my head which attempts to address the file path problem as I lay 
it out below. Before I develop it any further, are there any important 
considerations I am missing?

Here is my conception of the file name problem:

1) File names are abstract entities.  There are a number of ways one 
might concretely represent a filename. Among these ways are:

       a) A contiguous sequence of octets in memory
            (C style string on most modern hardware)
       b) A sequence of Unicode code points
            (Haskell style string)
       c) Algebraic datatypes supporting path manipulations
            (yet to be developed)

2) We would like these three representations to be isomorphic. 
Unfortunately, this cannot be.  In particular, there are major issues 
with the translations between the (a) and (b) forms given above.  One 
could imagine that translation issues involving the (c) form are also 
possible.

3) Translations between (a) and (b) must be parameterized by a character 
encoding.  Translations to and from (c) will require some manner of 
description of the path syntax, which differs by OS.

4) In practice, the vast majority of file paths are portable between the 
various forms; the forms are "nearly" isomorphic, with corner cases 
being fairly rare.

5) Translations between the various forms cost compute cycles and 
memory, and are not necessarily bijective.  Therefore, translations 
should occur _only_ if absolutely necessary.  In particular, if a file 
name passes through a program as a black box (it is not examined or 
manipulated) it should undergo no transformation.

6) Different OSes handle file names differently.  These differences 
should be accounted for, transparently where possible.  These 
differences, however, should be exposed to developers for whom the 
differences matter.

7) Using simple file names should be easy.  We don't want developers to 
have to worry too much about character encodings, path separators, and 
generally bizarre path syntax just to open files.  The complexities of 
correct file name handling should be hidden from the casual programmer. 
However, developers interested in serious 
portability/internationalization should be able to get down into the 
muck if they need to.
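
To make points 1, 3 and 5 concrete, here is a rough sketch of how the
three representations and their translations might be typed. This is
not a worked-out design: every name in it (FileName, Path, Encoding,
Syntax, latin1, posix, and the functions) is hypothetical, it uses only
the base library, and the path syntax is reduced to a single separator
character. The point is only that translations take an encoding or a
syntax as a parameter, can fail, and are skipped when the name is
already in the requested form.

```haskell
import Data.Char (chr, ord)
import Data.List (intercalate)
import Data.Word (Word8)

-- Forms (a) and (b): a name remembers the form it arrived in.
-- Hypothetical type, not an existing library API.
data FileName
  = Octets [Word8]   -- (a) raw octets, as handed over by the OS
  | Chars String     -- (b) Unicode code points
  deriving (Eq, Show)

-- Form (c): a structured path, reduced here to a root flag and components.
data Path = Path
  { absolute   :: Bool
  , components :: [String]
  } deriving (Eq, Show)

-- Point 3: (a) <-> (b) is parameterized by an encoding; both ways may fail.
data Encoding = Encoding
  { decodeWith :: [Word8] -> Maybe String
  , encodeWith :: String -> Maybe [Word8]
  }

-- ISO-8859-1: every octet decodes, but only code points below 256 encode.
latin1 :: Encoding
latin1 = Encoding
  { decodeWith = Just . map (chr . fromIntegral)
  , encodeWith = mapM encodeChar
  }
  where
    encodeChar c
      | ord c < 256 = Just (fromIntegral (ord c))
      | otherwise   = Nothing

-- Point 3: (b) <-> (c) is parameterized by a path syntax.
-- Assumption: only the separator is modelled.
newtype Syntax = Syntax { separator :: Char }

posix :: Syntax
posix = Syntax '/'

-- Point 5: no translation happens if the name is already in the wanted form.
toOctets :: Encoding -> FileName -> Maybe [Word8]
toOctets _   (Octets bs) = Just bs
toOctets enc (Chars s)   = encodeWith enc s

toChars :: Encoding -> FileName -> Maybe String
toChars enc (Octets bs) = decodeWith enc bs
toChars _   (Chars s)   = Just s

-- Split on the separator, keeping empty chunks.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- Parsing drops empty components, so "a//b" and "a/b" give the same Path.
parsePath :: Syntax -> String -> Path
parsePath syn s =
  Path (take 1 s == [separator syn])
       (filter (not . null) (splitOn (separator syn) s))

renderPath :: Syntax -> Path -> String
renderPath syn (Path isAbs comps) =
  (if isAbs then [separator syn] else "")
    ++ intercalate [separator syn] comps

main :: IO ()
main = do
  let name = Octets [47, 116, 109, 112, 47, 97]  -- "/tmp/a" in ASCII
  print (toOctets latin1 name)                   -- passes through untouched
  print (fmap (parsePath posix) (toChars latin1 name))
  print (encodeWith latin1 "\955")               -- lambda is not in Latin-1
```

Note that parsePath followed by renderPath is not the identity on
strings (doubled separators collapse), which is one small instance of
the non-bijective translations mentioned in point 5.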




