[Haskell-cafe] Re: Suggested additions to System.FilePath.Posix/Windows

Marcus D. Gabriel marcus at gabriel.name
Mon Sep 21 14:45:19 EDT 2009


Simon Marlow wrote:
> Brandon S. Allbery KF8NH wrote:
>> On Sep 19, 2009, at 07:45 , Duncan Coutts wrote:
>>> On Thu, 2009-09-17 at 11:58 +0200, Marcus D. Gabriel wrote:
>>>>> -- | 'reduceFilePath' returns a pathname that is reduced to canonical
>>>>> -- form equivalent to that of ksh(1), that is, symbolic link names
>>>>> are
>>>>> -- treated literally when finding the directory name.  See @cd -L@ of
>>>>> -- ksh(1).  Specifically, extraneous separators @(\"/\")@, dot
>>>>> -- @(\".\")@, and double-dot @(\"..\")@ directories are removed.
>>>
>>> So it's like the existing System.Directory.canonicalizePath but it's
>>> pure and it does not do anything with symlinks. On the other hand
>>> because it's pure it can do something with non-local paths.
>>>
>>> Is there anything POSIX-specific about this? I don't see it.
>>
>> It's making assumptions about the safety of eliding "..".  (What does
>> \\machine\share\..\ do?)  On the other hand that's also unsafe on
>> POSIX in the presence of symlinks.  In general I consider path
>> cleanup not involving validation against the filesystem to be risky.
>
> I agree; this came up before during the design of System.FilePath, and
> it's why the current library doesn't have a way to remove "..".  The
> docs should probably explain this point, because it's non-obvious that
> you can't just "clean up" a path to remove the ".." and end up with
> something that means the same thing.
>
> Cheers,
>     Simon

A few points to explain my point of view.  It's a little long.

Yes, reduceFilePath is pure and not an IO action, but this was not
important for my application.  What counted was preserving the logical
structure of the path, which was a design choice, and the fact that the
paths in question may no longer exist in the file system.  Thus
canonicalizePath could not help me.  These are the most important
points.  For me, if these two points are not of general interest, then
there should be no System.FilePath.reduceFilePath or equivalent.

The essential POSIX standard (IEEE Std 1003.1) can be found at

<http://www.opengroup.org/onlinepubs/009695399/utilities/cd.html>,

that is, cd - change the working directory.  The key points are in the
OPTIONS section and steps 8 and 9 of the DESCRIPTION section.  I used
ksh(1) as my guide, but bash(1) or dash(1) also work.  See also

<http://www.opengroup.org/onlinepubs/009695399/>,

that is, section 4.11 Pathname Resolution.

So, the function reduceFilePath does not make any assumptions about
eliding "..", it simply attempts to implement the behaviour of cd -L
of a POSIX shell, consistent with Pathname Resolution (section 4.11) minus
the dereferencing of symbolic links.  Whether reduceFilePath does this
correctly or not is another question.  (It does not, sorry about that,
but I have an older version that does, minus the leading double-slash rule.)
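
A minimal sketch of the kind of purely lexical reduction described above,
given only to make the behaviour concrete.  It is not the proposed
reduceFilePath: the names reducePosixPath and splitOn are hypothetical,
and two choices are assumptions, namely that ".." directly under the root
is dropped ("/.." becomes "/") and that exactly two leading slashes are
preserved while three or more collapse to one.  Symbolic links are never
consulted, so the result may name a different object than the input.

```haskell
import Data.List (intercalate)

-- | Purely lexical reduction of a POSIX-style path (hypothetical name).
-- No file system access is performed; symbolic links are treated literally.
reducePosixPath :: FilePath -> FilePath
reducePosixPath path
  | null path = ""
  | otherwise = render (foldl step [] (splitOn '/' rest))
  where
    (slashes, rest) = span (== '/') path

    -- Assumption: exactly two leading slashes are kept, three or more
    -- collapse to a single slash.
    root = case length slashes of
      0 -> ""
      2 -> "//"
      _ -> "/"
    absolute = not (null root)

    -- The accumulator holds the kept components, most recent first.
    step acc ""   = acc                  -- extraneous separator
    step acc "."  = acc                  -- dot component
    step acc ".." = case acc of
      (c : cs) | c /= ".." -> cs         -- cancel the preceding component
      []       | absolute  -> []         -- assumption: "/.." is "/"
      _                    -> ".." : acc -- relative path: keep leading ".."
    step acc c    = c : acc

    render acc = case (root, reverse acc) of
      ("", []) -> "."                    -- a relative path that reduced to nothing
      (_, cs)  -> root ++ intercalate "/" cs

-- | Split a string on a separator character, keeping empty fields.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])        -> [chunk]
  (chunk, _ : rest') -> chunk : splitOn sep rest'
```

With these definitions loaded in GHCi, reducePosixPath "/a/b/../c/./d//"
evaluates to "/a/c/d", and reducePosixPath "a/../.." evaluates to "..".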

It is true that if you just clean up the path, it may no longer
resolve to the same object in the file system as the result of a
call to canonicalizePath would.  However, Python has a library function,
whose name I cannot remember, whose documentation just states that this
may change the meaning of the path; that is, let the programmer beware.

In my case, I verified and resolved the initial inputs from the user so
that either an error message occurred or I could use reduceFilePath with
confidence during processing.  That is to say, the file system
validation was done up front so that I could safely maintain the logical
structure, which was the design choice, and in certain cases continue
processing even if the pathnames no longer referred to anything.

This means that \\machine\share\..\ is \\machine\ logically.  Thus, the
application should either not use reduceFilePath or it should set up
conditions to avoid or catch this case.  The blind or unthinking use of
reduceFilePath is not only risky, it's a mistake.  Just like
unsafePerformIO, let the programmer beware.

If reduceFilePath is useless under Windows but at least makes some kind
of sense, then for me, a debugged, POSIX-compliant
System.FilePath.reduceFilePath would have been nice.  So I propose it.
If it makes absolutely no sense under Windows, then drop it so as to
maintain the interface, which I used wherever I could and appreciated
greatly.

Cheers,
- Marcus



