Filename handling
Graham Klyne
GK at ninebynine.org
Tue Aug 17 10:49:02 EDT 2004
At 14:04 17/08/04 +0100, Simon Marlow wrote:
>On 17 August 2004 12:44, Graham Klyne wrote:
>
> > Anyway, I'd like to see the common library functions provide at least
> > minimal capabilities to allow multi-platform applications to do the
> > right thing when handling filenames. I'm pretty agnostic about what
> > they actually look like, but as an example I've found the Python and/or
> > Java libraries to be pretty usable in this respect.
>
>I think there's general agreement that this would be a good thing, but
>discussion never seems to reach a conclusion. Anyone like to whip up a
>concrete proposal?
This may be a bit radical, but I'll float it anyway:
pathToUri :: String -> String
-- convert filename to a file: URI according to local system conventions
uriToPath :: String -> String
-- convert a file: URI to filename according to local system conventions
Hmmm... to preserve referential transparency, I suppose that should be:
pathToUri :: String -> IO String
uriToPath :: String -> IO String
The rationale here is that these two functions can be used to get any
filename on any system into a form with well-defined syntax and properties
and back again, allowing the other filename processing requirements
(splitting apart, putting together, relative path evaluations, etc.) to be
performed with the common form.
Of course, this doesn't deal with operations that need to actually access
the file system (directory scanning, etc.), but many of these seem pretty
well catered for in any case (cf. Directory library functions).
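As a rough illustration of the pathToUri/uriToPath idea, here is a minimal sketch. It is not an existing library API: the PathStyle type and the names pathToUriWith, uriToPathWith, escapeChar and unescape are hypothetical, and the local convention is passed as an explicit argument (one possible way to keep the functions pure instead of returning IO values). It assumes absolute paths only and leaves non-ASCII characters unencoded, which a real implementation would need to handle.

```haskell
import Data.Char (chr, digitToInt, intToDigit, isAlphaNum, isAscii, isHexDigit, ord, toUpper)
import Data.List (stripPrefix)

-- Hypothetical: which local filename convention to apply.
data PathStyle = Posix | Windows deriving (Eq, Show)

-- Percent-encode an ASCII character unless it is safe in a URI path.
-- Non-ASCII characters are passed through unchanged (a simplification).
escapeChar :: Char -> String
escapeChar c
  | not (isAscii c)                   = [c]
  | isAlphaNum c || c `elem` "-._~/:" = [c]
  | otherwise                         = ['%', hex hi, hex lo]
  where
    (hi, lo) = ord c `divMod` 16
    hex      = toUpper . intToDigit

-- Decode %XX escapes; Nothing if an escape is malformed.
unescape :: String -> Maybe String
unescape ('%' : a : b : rest)
  | isHexDigit a && isHexDigit b =
      (chr (16 * digitToInt a + digitToInt b) :) <$> unescape rest
  | otherwise = Nothing
unescape ('%' : _)  = Nothing
unescape (c : rest) = (c :) <$> unescape rest
unescape []         = Just []

-- Convert an absolute filename to a file: URI under the given convention.
pathToUriWith :: PathStyle -> FilePath -> String
pathToUriWith Posix   p = "file://" ++ concatMap escapeChar p
pathToUriWith Windows p = "file:///" ++ concatMap escapeChar (map slash p)
  where slash c = if c == '\\' then '/' else c

-- Convert a file: URI back to a filename; Nothing if it is not a file: URI.
uriToPathWith :: PathStyle -> String -> Maybe FilePath
uriToPathWith style uri = do
  rest <- stripPrefix "file://" uri
  path <- unescape rest
  pure $ case style of
    Posix   -> path
    Windows -> map backslash (drop 1 path)
  where backslash c = if c == '/' then '\\' else c
```

With these definitions, a path converted by pathToUriWith and back by uriToPathWith under the same PathStyle should come out unchanged, so splitting, joining and relative-path work can all be done on the URI form.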
...
Failing this, I'd say that Isaac's module [1] has some pretty reasonable
functions. I'd pick out:
splitLastComp :: FilePath -> (FilePath,FilePath)
isAbsolute :: FilePath -> Bool
splitExt :: FilePath -> (FilePath, String)
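To make the intended behaviour of these three functions concrete, here is a minimal sketch for '/'-separated paths only. It is not the code from Isaac's module; the details (for example, returning the extension without its dot, and treating a leading-dot name such as ".bashrc" as having no extension) are assumptions made for illustration.

```haskell
-- Split a path into its directory part and its last component.
splitLastComp :: FilePath -> (FilePath, FilePath)
splitLastComp p = case break (== '/') (reverse p) of
  (revName, [])         -> ("", reverse revName)
  (revName, _ : revDir) -> (if null revDir then "/" else reverse revDir, reverse revName)

-- A path is absolute (in this sketch) if it starts with '/'.
isAbsolute :: FilePath -> Bool
isAbsolute p = take 1 p == "/"

-- Split off the extension of the last component, without the dot.
splitExt :: FilePath -> (FilePath, String)
splitExt p = case break (== '.') (reverse name) of
  (revExt, _ : revStem) | not (null revStem) -> (prefix ++ reverse revStem, reverse revExt)
  _                                          -> (p, "")
  where
    name   = reverse (takeWhile (/= '/') (reverse p))
    prefix = take (length p - length name) p
```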
The next function would be useful, but I'd be reluctant to include it until
we're confident of having consistent regex support on all platforms:
matchPath :: String          -- ^RegExp
          -> IO [FilePath]   -- ^IO because it must look to see what exists
An alternative, avoiding regex dependence, might be:
matchPath :: (FilePath -> Bool) -> IO [FilePath]
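A predicate-based matchPath could be sketched as follows. The signature above does not say which directory is scanned, so this sketch assumes the current directory and adds a hypothetical matchPathIn that takes the directory explicitly. It uses getDirectoryContents from System.Directory and drops the "." and ".." entries.

```haskell
import Data.List (isSuffixOf)
import System.Directory (getDirectoryContents)

-- Hypothetical helper: names in 'dir' that satisfy the predicate.
matchPathIn :: FilePath -> (FilePath -> Bool) -> IO [FilePath]
matchPathIn dir keep = filter wanted <$> getDirectoryContents dir
  where wanted name = name `notElem` [".", ".."] && keep name

-- Assumption: with no directory given, scan the current directory.
matchPath :: (FilePath -> Bool) -> IO [FilePath]
matchPath = matchPathIn "."

-- Example use: names ending in ".hs" in the current directory.
haskellSources :: IO [FilePath]
haskellSources = matchPath (".hs" `isSuffixOf`)
```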
And a very important (IMO) function that I don't see in Isaac's module
would be something like:
relativeTo :: FilePath -> FilePath -> FilePath
In my URI processing code, I've also added a complementary function:
relativeFrom :: FilePath -> FilePath -> FilePath
which returns a relative path such that:
(path `relativeFrom` base) `relativeTo` base == path
noting that the result of relativeFrom is not always uniquely
determined. Maybe it's better to leave this out.
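The relationship between the two functions can be illustrated with a small sketch. It is not the URI processing code mentioned above: it assumes '/'-separated paths, treats base as an absolute directory path, and uses hypothetical helpers (segments, normalise, absolute). relativeFrom here returns the shortest answer, which is only one of the possible results.

```haskell
import Data.List (intercalate)

-- Split a '/'-separated path into its non-empty segments.
segments :: FilePath -> [String]
segments p = case break (== '/') p of
  (seg, [])       -> keep seg
  (seg, _ : rest) -> keep seg ++ segments rest
  where keep s = [s | not (null s)]

-- Resolve "." and ".." segments; ".." at the root is dropped.
normalise :: [String] -> [String]
normalise = reverse . foldl step []
  where
    step acc       "."  = acc
    step []        ".." = []
    step (_ : acc) ".." = acc
    step acc       seg  = seg : acc

-- Render segments as an absolute path.
absolute :: [String] -> FilePath
absolute segs = '/' : intercalate "/" segs

-- Resolve 'path' against the directory 'base'.
relativeTo :: FilePath -> FilePath -> FilePath
relativeTo path base
  | take 1 path == "/" = absolute (normalise (segments path))
  | otherwise          = absolute (normalise (segments base ++ segments path))

-- Express the absolute 'path' relative to the directory 'base'.
relativeFrom :: FilePath -> FilePath -> FilePath
relativeFrom path base =
  render (strip (normalise (segments path)) (normalise (segments base)))
  where
    strip (p : ps) (b : bs) | p == b = strip ps bs
    strip ps bs = replicate (length bs) ".." ++ ps
    render []   = "."
    render segs = intercalate "/" segs
```

For example, "/a/b/c" `relativeFrom` "/a/d" gives "../b/c" in this sketch, but "../d/../b/c" resolves against "/a/d" to the same path, which is the non-uniqueness in question.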
I think that a function like:
isDirectory :: FilePath -> IO Bool
may also be needed when performing directory scanning operations.
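A function of this shape already exists in System.Directory as doesDirectoryExist, so isDirectory could be a thin wrapper over it. The sketch below shows that wrapper together with a hypothetical subdirectories helper of the kind a directory scan would need; joining names with (</>) from System.FilePath is an assumption of this sketch.

```haskell
import Control.Monad (filterM)
import System.Directory (doesDirectoryExist, getDirectoryContents)
import System.FilePath ((</>))

-- Thin wrapper over the existing library function.
isDirectory :: FilePath -> IO Bool
isDirectory = doesDirectoryExist

-- Hypothetical helper: the immediate subdirectories of 'dir'.
subdirectories :: FilePath -> IO [FilePath]
subdirectories dir = do
  names <- getDirectoryContents dir
  let entries = [dir </> name | name <- names, name `notElem` [".", ".."]]
  filterM isDirectory entries
```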
[1] http://www.syntaxpolice.org/darcs_repos/OS.Path/Path.hs
...
Some related questions to consider:
- should we take seriously the point I make above about using IO so that
referential transparency is rigorously preserved? If so, all of the above
functions should return IO values, as the result may vary depending on the
environment in which the program runs.
- do we care about legacy operating systems like VAX/VMS? (that would
require version number support, and doesn't work well with interfaces that
assume a single path separator character).
- how does the interface work with forthcoming systems like Microsoft's
Longhorn? I hear that the directory tree concept is being replaced by file
"attributes". Which leads me to think of...
- how does the interface work with WebDAV, which builds a file-system-like
interface over HTTP and adds property lists to the resources identified?
#g
------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact