[Haskell-cafe] RE: ANN: System.FilePath 0.9

Andrew Pimlott andrew at pimlott.net
Wed Jul 26 14:32:59 EDT 2006


On Wed, Jul 26, 2006 at 03:36:13AM +0100, Neil Mitchell wrote:
> >> Its a rats nest to do it properly, but some very basic idea of "does
> >> this path have things which there is no way could possibly be in a
> >> file" - for example c:\|file is a useful thing to have.
> >
> >This seems to encourage the classic mistake of checking "not known bad"
> >rather than "known good".  "known bad" is rarely useful in my
> >experience.  What use case do you have in mind?
> 
> wget on windows saves web pages such as
> "http://www.google.com/index.html?q=haskell" to the file
> "index.html?q=haskell". This just doesn't work, and is the main reason
> I added this in. I don't think it will be a commonly used operation.

Ok, this is a good use case.  What should wget do if "isValid" fails?
Certainly not abort the download.  So isValid alone is no help.  Well,
you have makeValid as well, but this is even more of a rats' nest.
There are a zillion different ways you might want this function to work,
depending on your purposes.  Should makeValid be system-dependent?
Should it be reversible?  How hard should it try to preserve the name
verbatim?  Should it "prettify" legal but unprintable characters?  The
answers are application-dependent.  Also, wget has to worry about not
just whether the filename is valid, but whether it currently exists, and
in that case modify it.  So attempting to provide a generic makeValid is
quixotic and will only lead to misuse.
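
To make the point concrete, here is a minimal sketch of what one application's choice might look like. Everything in it is hypothetical (Policy, windowsish and sanitizeName are made-up names, not part of System.FilePath or any other library), and the list of rejected characters is an assumption about Windows-style targets, not a complete rule set.

```haskell
import Data.Char (isControl)

-- Hypothetical policy record: each field is a decision that belongs to
-- the application, not to a generic path library.
data Policy = Policy
  { isBadChar   :: Char -> Bool  -- which characters to reject
  , replacement :: Char          -- what to substitute for them
  }

-- One possible policy for Windows-style targets (an assumption,
-- not an exhaustive description of what Windows rejects).
windowsish :: Policy
windowsish = Policy
  { isBadChar   = \c -> c `elem` "<>:\"/\\|?*" || isControl c
  , replacement = '_'
  }

-- Rewrite a single file name (not a whole path) under the given policy.
sanitizeName :: Policy -> String -> String
sanitizeName p = map (\c -> if isBadChar p c then replacement p else c)
```

With this policy, sanitizeName windowsish "index.html?q=haskell" yields "index.html_q=haskell". The mapping is lossy ("a?b" and "a*b" collide), is not reversible, and says nothing about a file of that name already existing, so a different application could reasonably want a different answer on every one of those points.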

> >My criticism is that your properties are all specified in terms of
> >string manipulation.  The whole point of paths is that they are
> >interpreted by the system, so if you neglect to say what your operations
> >mean to the system, what have you specified?
> True, but at the same time specifying what something means with
> respect to a filesystem is very hard :) If you had any insight how
> this could be done I'd be interested.

The first step is to think carefully about what operations to provide,
and be conservative.  I think the operations I included in my library
all have pretty clear meanings, though I don't claim to have nailed them
down all the way.  Criticism welcome.

http://haskell.org/pipermail/libraries/2006-February/004890.html

> Hopefully no one will ever use it. Its part of the low level functions
> that the FilePath module builds on. However, pragmatically, someone
> somewhere will have a use for it, and the second they do they'll just
> write '/', and at that point we've lost.

Yes, on one hand you want to be pragmatic.  But IMO this way of
thinking--expose the guts just in case--is the path to madness.  Not to
mention, it clutters the API and makes it less clear how the module is
supposed to be used.  Maybe the "guts" could go into a separate module?

> >> splitFileName :: FilePath -> (String, String)
> >> Split a filename into directory and file.
> >Which directory and which file?
> Ok, thats probably the wrong description. Splits off the last filename
> would be a better description, leaving the rest.

Ok, but now what is "the rest" good for?  And what is the "last
filename" in cases like "/" or ".."?  The conclusion I come to is that
this operation is unsound to begin with, and should not be part of the
API in any form.
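
The edge cases can be seen with a purely textual split at the last '/'. This is a sketch of the obvious string approach, assuming POSIX-style separators; naiveSplit is a hypothetical name and is not the code of System.FilePath.

```haskell
-- Hypothetical stand-in for a textual "directory and file" split:
-- everything up to and including the last '/' goes left, the remainder right.
naiveSplit :: String -> (String, String)
naiveSplit path = (reverse restRev, reverse nameRev)
  where
    -- Scan from the end of the string until the first '/' is found.
    (nameRev, restRev) = break (== '/') (reverse path)
```

Here naiveSplit "a/b" gives ("a/", "b"), which looks reasonable. But naiveSplit "/" gives ("/", ""), a "file" that is empty, and naiveSplit ".." gives ("", ".."), where the "file" is really a directory and "the rest" is empty. Neither result has an obvious meaning to the system.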

> >Also, it looks from this that you treat paths differently depending on
> >whether they end in a separator.  Yet this makes no difference to the
> >system.  That seems wrong to me.
> That was something I thought over quite a while. If the user enters
> "directory/" then they do not mean the file called directory, they
> mean the directory called directory. And in Windows certainly you
> can't open a file called "file/"

Ok, fair, but "dir" and "dir/" are treated identically if dir is a
directory, so it is still confusing for your library to distinguish
them.  Maybe the user needs to indicate whether a path represents a file
or directory?  These matters confuse your specification.  I made the
simplifying approximation that "foo" and "foo/" should be considered
equivalent.  This may not turn out to be the right decision, but at
least it helped me keep the semantics clear.
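
As a rough sketch of that approximation (not the code from my library; dropTrailingSlashes and samePath are hypothetical names, and POSIX-style '/' separators are assumed):

```haskell
-- Hypothetical helper: drop trailing '/' characters, but keep a lone
-- root "/" so that the root directory does not become the empty string.
dropTrailingSlashes :: String -> String
dropTrailingSlashes path
  | null path     = path
  | null stripped = "/"       -- the path consisted only of separators
  | otherwise     = stripped
  where
    stripped = reverse (dropWhile (== '/') (reverse path))

-- Compare two paths under the "foo" == "foo/" approximation.
samePath :: String -> String -> Bool
samePath a b = dropTrailingSlashes a == dropTrailingSlashes b
```

Under this definition samePath "foo" "foo/" is True, and the root is preserved because dropTrailingSlashes "/" is "/". The cost of the approximation is that the "this must be a directory" hint carried by a trailing separator is thrown away.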

> >> getDirectory :: FilePath -> FilePath
> >> Get the directory name, move up one level.
> >What does this mean, in the presence of dots and symlinks?
> It gets a parent directory, there may be one, but the one returned
> will be a parent.

Is "/a" a parent of "/a/.."?  That seems dubious.
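
A small sketch shows the mismatch. It assumes absolute POSIX-style paths and ignores symlinks entirely; segments, textualParent and resolveDots are hypothetical names, not functions from either library.

```haskell
import Data.List (intercalate)

-- Split a POSIX-style path into its non-empty segments.
segments :: String -> [String]
segments s = case break (== '/') s of
  (seg, [])       -> [seg | not (null seg)]
  (seg, _ : rest) -> [seg | not (null seg)] ++ segments rest

-- "Parent" as string manipulation: drop the last segment.
textualParent :: String -> String
textualParent path = '/' : intercalate "/" (dropLast (segments path))
  where
    dropLast xs = take (length xs - 1) xs

-- Lexically resolve "." and ".." in an absolute path.
-- This ignores symlinks, so it is itself only an approximation.
resolveDots :: String -> String
resolveDots path = '/' : intercalate "/" (reverse (foldl step [] (segments path)))
  where
    step acc "."  = acc
    step acc ".." = drop 1 acc   -- ".." at the root stays at the root
    step acc seg  = seg : acc
```

textualParent "/a/.." is "/a", but resolveDots "/a/.." is "/", so the textual "parent" is actually a child of the directory the path names. If "/a" is a symlink, even the lexical answer can be wrong, which is why the operation needs a meaning stated in terms of the system rather than the string.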

> >> equalFilePath :: FilePath -> FilePath -> Bool
> >> Equality of two FilePaths. If you call fullPath first this has a much
> >> better chance of working. Note that this doesn't follow symlinks or
> >> DOSNAM~1s.
> >As you acknowledge, it's a crap-shoot.  So what's the point?
> Its a case of reality, at the moment people use == to test if two file
> paths are equal, at least this is a better test.

Why is it better?

> >I think of that as a separate module, because extensions have no meaning
> >to the system and can be done with portable, functional code, as far as
> >I understand.
> Not really, what about getExtension "file.ext\lump" - the answer is ""
> on windows and ".ext\lump" on Posix.

You would only call the extension functions on a segment name.
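
That is, the system-dependent step is splitting the path into segments; once you hold a single segment, finding its extension is plain string code. A minimal sketch, with segmentExtension as a hypothetical name:

```haskell
-- Hypothetical helper: the extension of a single segment name,
-- including the leading dot, or "" if the segment contains no dot.
-- No separator handling is needed because the input is one segment.
segmentExtension :: String -> String
segmentExtension name =
  case break (== '.') (reverse name) of
    (_, [])      -> ""                    -- no dot anywhere in the segment
    (revExt, _)  -> '.' : reverse revExt  -- text after the last dot
```

On Windows, "file.ext\lump" splits into the segments "file.ext" and "lump", and segmentExtension "lump" is "". On Posix the whole string is one segment, and segmentExtension gives ".ext\lump". The function is the same in both cases; only the segment it is handed differs. Note that this sketch treats a leading dot as starting an extension, so a name like ".bashrc" would need its own policy.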

Andrew

