[Haskell-cafe] RE: ANN: System.FilePath 0.9

Duncan Coutts duncan.coutts at worc.ox.ac.uk
Wed Jul 26 11:34:50 EDT 2006


On Wed, 2006-07-26 at 15:29 +0200, Udo Stenzel wrote:

> > My criticism is that your properties are all specified in terms of
> > string manipulation.
> 
> Exactly.  I believe a FilePath should be an algebraic datatype.
> Most operations on that don't have to be specified, because they are
> simple and have an obvious effect.  Add a system specific parser and a
> system specific renderer, maybe also define a canonical format, and the
> headaches stop.  What's wrong with this?

We've had this discussion before. The main problem is that all the
current IO functions (readFile, etc) use the FilePath type, which is
just a String. So a new path ADT is fine if at the same time we provide
a new IO library. That of course is an ongoing discussion in itself.

So until we have the opportunity to change the FilePath type there does
seem to be value in providing a library that takes some of the
complexity and portability nightmares out of using the existing FilePath
type.

Currently, real programs are doing even less principled hacking with
strings. So an easy-to-use library that we can use now will be a great
improvement even if it's not perfect.

> data FilePath = Absolute RelFilePath | Relative RelFilePath
> data RelFilePath = ThisDirectory 
>                  | File String
>                  | ParentOf RelFilePath 
>                  | String :|: RelFilePath
> 
> parseSystemPath :: String -> Maybe FilePath
> renderSystemPath :: FilePath -> String
> 
> We can even clearly distinguish between the name of a directory in its
> parent and the directory itself.  On Windows, the root directory just
> contains the drive letters and is read-only;
> drive-absolute-but-directory-relative paths are simply ignored (they are
> a dumb idea anyway).  Separator characters are never exposed; all we
> need now is a mapping from Unicode to whatever the system wants.

That's another portability headache - file name string encodings.
Windows and OS X use encodings of Unicode. Unix uses strings of bytes.
They are not fully inter-convertible. On Unix the traditional technique
is to keep a system file name in the original encoding and convert to
Unicode to display to the user, but the Unicode version is never
converted back to a system file name because it doesn't necessarily
convert back to the same sequence of bytes.
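The round-trip problem can be shown in a few lines. This is only a
sketch: it assumes the bytestring and text packages, uses lenient UTF-8
decoding as a stand-in for "convert to Unicode for display", and the
names rawName, displayName and survivesRoundTrip are made up for the
example.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical Unix file name: "caf" followed by the single byte 0xE9
-- ("é" in Latin-1). This byte sequence is not valid UTF-8.
rawName :: B.ByteString
rawName = B.pack [0x63, 0x61, 0x66, 0xE9]

-- Decode for display only; bytes that are not valid UTF-8 are replaced
-- rather than rejected.
displayName :: T.Text
displayName = TE.decodeUtf8With lenientDecode rawName

-- True when decoding for display and encoding again gives back exactly
-- the original bytes.
survivesRoundTrip :: B.ByteString -> Bool
survivesRoundTrip bs =
  TE.encodeUtf8 (TE.decodeUtf8With lenientDecode bs) == bs
```

Under these assumptions survivesRoundTrip rawName is False, which is why
the original bytes have to be kept around for any later system call.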

My point is it's not quite as simple as "just making an ADT".
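For reference, the quoted ADT with a POSIX-only parser and renderer might
look roughly like the sketch below. Several things here are assumptions
and not part of the proposal: the types are renamed Path and RelPath to
avoid clashing with the Prelude's FilePath, ParentOf r is read as "go up
one level, then follow r", a trailing slash is taken to mean the directory
itself, and the function names are invented. It also keeps components as
String, so it does nothing about the encoding issue above.

```haskell
import Data.List (intercalate)

-- Stand-ins for the FilePath / RelFilePath types in the quoted message.
data Path = Absolute RelPath | Relative RelPath
  deriving (Eq, Show)

infixr 5 :|:

data RelPath
  = ThisDirectory          -- the directory itself (end of the path)
  | File String            -- a name inside the directory reached so far
  | ParentOf RelPath       -- assumption: ".." followed by the rest
  | String :|: RelPath     -- descend into a named directory, then the rest
  deriving (Eq, Show)

-- Split on '/', keeping empty chunks so a trailing slash stays visible.
splitOnSlash :: String -> [String]
splitOnSlash s = case break (== '/') s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnSlash rest

-- Parse a POSIX-style path. Rejects the empty string and embedded NULs.
parsePosixPath :: String -> Maybe Path
parsePosixPath [] = Nothing
parsePosixPath s@(c0 : _)
  | '\0' `elem` s = Nothing
  | otherwise     = Just (wrap (build comps))
  where
    wrap      = if c0 == '/' then Absolute else Relative
    chunks    = splitOnSlash s
    -- "foo/" and "foo/." both name the directory itself.
    endsInDir = last chunks `elem` ["", "."]
    comps     = filter (\c -> c /= "" && c /= ".") chunks
    build []          = ThisDirectory
    build (".." : cs) = ParentOf (build cs)
    build [c] | not endsInDir = File c
    build (c : cs)    = c :|: build cs

-- Flatten to components; ThisDirectory becomes an empty final component
-- so that joining with '/' produces a trailing slash.
components :: RelPath -> [String]
components ThisDirectory = [""]
components (File n)      = [n]
components (ParentOf r)  = ".." : components r
components (n :|: r)     = n : components r

renderPosixPath :: Path -> String
renderPosixPath (Absolute r)             = '/' : intercalate "/" (components r)
renderPosixPath (Relative ThisDirectory) = "."
renderPosixPath (Relative r)             = intercalate "/" (components r)
```

With this reading, parsePosixPath "foo" gives Relative (File "foo") while
parsePosixPath "foo/" gives Relative ("foo" :|: ThisDirectory), which is
the name-versus-directory distinction mentioned in the quoted text.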

> > (Consider that "dir c:" lists the current directory on c:, not c:\)
> 
> I'd rather ignore that altogether.  Multiple roots with associated
> "current directories" are just a needless headache.  Even a "current
> directory" is somewhat ill-fitted for a functional language like
> Haskell.

Much of the time it can be ignored. But sometimes programs have to deal
with silly issues like this simply because that is what the OS does: such
a corner case can turn up as input, and the program is expected to handle
it. (Though I admit this is a particularly obscure case.)
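To illustrate the "dir c:" case, here is a small sketch that only
classifies Windows-style path strings. The type WinPathKind and the
function classify are hypothetical names, and UNC paths (\\server\share)
are deliberately not handled.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, toUpper)

-- Hypothetical classification of a Windows-style path string.
data WinPathKind
  = DriveAbsolute Char   -- "c:\foo": from the root of drive C
  | DriveRelative Char   -- "c:" or "c:foo": from the current directory on C
  | CurrentDriveRoot     -- "\foo": from the root of the current drive
  | PlainRelative        -- "foo": from the current directory
  deriving (Eq, Show)

isSep :: Char -> Bool
isSep c = c == '\\' || c == '/'

isDriveLetter :: Char -> Bool
isDriveLetter c = isAsciiLower c || isAsciiUpper c

classify :: String -> WinPathKind
classify (d : ':' : rest)
  | isDriveLetter d =
      case rest of
        (c : _) | isSep c -> DriveAbsolute (toUpper d)
        _                 -> DriveRelative (toUpper d)
classify (c : _)
  | isSep c = CurrentDriveRoot
classify _ = PlainRelative
```

Here classify "c:" is DriveRelative 'C' and classify "c:\\" is
DriveAbsolute 'C', so the two inputs cannot be treated as the same path.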

So in my humble opinion the current discussion of semantics, naming, and
whether operations belong in IO or are pure is worthwhile.

Duncan


