announcing darcs
Alastair Reid
alastair@reid-consulting-uk.ltd.uk
Thu, 10 Apr 2003 11:15:41 +0100
Alastair Reid <alastair@reid-consulting-uk.ltd.uk> writes:
>> [moved from cafe to libraries]
>>
>> For example, on a Unix system, /usr/lib/libcurl.so would be treated
>> something like this:
>>
>> (Just ["/","usr","lib"], "libcurl", Just "so")
Ketil Z Malde <ketil@ii.uib.no> writes:
> Isn't this a SMOP, writing functions:
>
> dirname :: FilePath -> String -- or FilePath?
> basename :: FilePath -> String
> suffix :: FilePath -> String
SMOP == small matter of programming?
Yes, it's pretty easy to do. But that small matter of programming
gets repeated time and time again (with many shortcuts taken which
limit portability or make incorrect assumptions about what are legal
filenames) so I suggest that a high-quality library be added.
I'm sure your functions weren't intended as a final, polished API
(though they look like the GNU make filename API which, since it is
now set in concrete, is as final and polished as it is ever likely to
get) but I'll point out some of the issues in the set of functions you
suggest.
1) What should the functions return when there is no dirname, no
basename or no suffix? An empty string suggests itself but can we
then still distinguish between filenames like "foo." and "foo",
"/foo" and "foo"?
This is why I used 'Maybe' - though maybe I didn't use it enough in
my sketch?
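
To make point 1 concrete, here is a minimal sketch of a suffix splitter that
returns a 'Maybe'. The names splitSuffix and joinSuffix are hypothetical (they
are not taken from any existing library), and the rule "the suffix is whatever
follows the last dot" is an assumption; a real library would have to decide
what to do with names such as ".bashrc".

```haskell
-- Hypothetical helpers, for illustration only.

-- Split a file name (with no directory part) at its last '.'.
-- Nothing  : the name contains no dot at all ("foo")
-- Just ""  : the name ends in a dot          ("foo.")
splitSuffix :: String -> (String, Maybe String)
splitSuffix name =
  case break (== '.') (reverse name) of
    (_, [])               -> (name, Nothing)
    (revSuf, _ : revStem) -> (reverse revStem, Just (reverse revSuf))

-- Inverse of splitSuffix: glue the stem and optional suffix back together.
joinSuffix :: (String, Maybe String) -> String
joinSuffix (stem, Nothing)  = stem
joinSuffix (stem, Just suf) = stem ++ "." ++ suf
```

With this, splitSuffix "foo" gives ("foo", Nothing) and splitSuffix "foo."
gives ("foo", Just ""), so the two names stay distinguishable and joinSuffix
reproduces the original string in both cases.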
2) It's often enough to split the dirname from the basename as you
suggest but I sometimes find myself needing to access a
subdirectory or parent directory. So I write code like:
dirname f ++ "/" ++ subdirname ++ "/" ++ notdir f
or the cryptic
reverse (dropWhile (/= '/') (reverse (dirname f))) ++ notdir f
Both are fixed if there's a way to split the dirname into a list
of directories so that we can add or remove bits at will.
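
As a rough sketch of point 2, the directory part can be kept as a list of
components so that adding a subdirectory or moving to the parent becomes a
list operation. Everything here is an assumption made for illustration: the
names (splitOn, splitDirs, joinDirs, intoSubdir, toParent) are hypothetical,
only '/' is handled as a separator, and repeated slashes are not normalised.

```haskell
import Data.List (intercalate)

-- Split a string on a separator character, keeping empty chunks.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- Directory components plus the final name.
type Split = ([String], String)

-- An absolute path gets "/" as its first directory component.
splitDirs :: FilePath -> Split
splitDirs path = (dirs, last chunks)
  where
    chunks = splitOn '/' path
    dirs = case init chunks of
      ("" : rest) -> "/" : rest   -- leading empty chunk marks the root
      ds          -> ds

joinDirs :: Split -> FilePath
joinDirs (dirs, name) = case dirs of
  ("/" : rest) -> '/' : intercalate "/" (rest ++ [name])
  _            -> intercalate "/" (dirs ++ [name])

-- Move the file into a subdirectory of its current directory.
intoSubdir :: String -> Split -> Split
intoSubdir sub (dirs, name) = (dirs ++ [sub], name)

-- Move the file up one directory; the root and "no directory" are left alone.
toParent :: Split -> Split
toParent (dirs, name)
  | null dirs || dirs == ["/"] = (dirs, name)
  | otherwise                  = (init dirs, name)
```

For example, joinDirs (intoSubdir "old" (splitDirs "/usr/lib/libcurl.so"))
yields "/usr/lib/old/libcurl.so", and joinDirs (toParent (splitDirs
"/usr/lib/libcurl.so")) yields "/usr/libcurl.so".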
3) We need a way to glue the various components back together again to
eliminate those non-portable uses of '++ "/" ++' above.
The obvious thing is to abstract the directory separator (typically
'/' or '\') but then you have to be careful when adding or removing
components from filenames that are relative or absolute, have or
lack a dirname, have or lack a suffix, etc.
I forget all the details of Windows filenames but you may also need
to be careful when dealing with Windows drive letters and SMB mounted
files on Windows.
This is, in part, why I was suggesting that there be a way to parse
FilePaths into a richer structure. My thought was that as well as
having operations to access the components, there would also be
operations to modify the components (cf. record updates) - the idea
being that if you want to change the suffix, you don't have to
figure out all the things you want to remain constant, you just
have to figure out the things you want to change.
(The other reason for suggesting what the internal structure would
be comes from my background in algebraic specification. Given a
structure which is semantically equivalent to a tuple (as I believe
filenames ought to be viewed), we can just say it is equivalent to
a tuple (a model-based specification) or we can give a set of
equations in the algebraic specification style. My experience is
that, in this case, the model-based style scales better (i.e., is
shorter) and is easier to understand (because it exploits existing
understanding/intuition).)
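
The richer structure from point 3 might look something like the sketch below:
a record that mirrors the tuple (Just ["/","usr","lib"], "libcurl", Just "so"),
with a parser, a printer, and a record update to change one component. The
type PathParts and the functions parsePath, showPath and setSuffix are
hypothetical names, only Unix-style '/' separators are handled, and Windows
drive letters and SMB paths are deliberately ignored.

```haskell
-- Hypothetical model of a parsed file path, for illustration only.
data PathParts = PathParts
  { ppDirs   :: Maybe [String]  -- Nothing: no directory part at all
  , ppBase   :: String
  , ppSuffix :: Maybe String    -- Nothing: no dot; Just "": trailing dot
  } deriving (Eq, Show)

-- Split a string on a separator character, keeping empty chunks.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

parsePath :: FilePath -> PathParts
parsePath path = PathParts dirs base suffix
  where
    chunks = splitOn '/' path
    name   = last chunks
    dirs   = case init chunks of
      []          -> Nothing
      ("" : rest) -> Just ("/" : rest)  -- absolute path
      ds          -> Just ds
    -- The suffix is taken from the final name only, after its last dot.
    (base, suffix) = case break (== '.') (reverse name) of
      (_, [])            -> (name, Nothing)
      (revSuf, _ : revB) -> (reverse revB, Just (reverse revSuf))

showPath :: PathParts -> FilePath
showPath (PathParts dirs base suffix) = dirPart ++ base ++ sufPart
  where
    dirPart = case dirs of
      Nothing           -> ""
      Just ("/" : rest) -> "/" ++ concatMap (++ "/") rest
      Just ds           -> concatMap (++ "/") ds
    sufPart = maybe "" ('.' :) suffix

-- Change only the suffix; everything else is carried over unchanged.
setSuffix :: String -> FilePath -> FilePath
setSuffix suf path = showPath ((parsePath path) { ppSuffix = Just suf })
```

Here setSuffix "a" "/usr/lib/libcurl.so" gives "/usr/lib/libcurl.a", and
parsePath keeps "foo" and "/foo", as well as "foo" and "foo.", distinct, so
showPath (parsePath p) returns p for each of them.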
--
Alastair Reid alastair@reid-consulting-uk.ltd.uk
Reid Consulting (UK) Limited http://www.reid-consulting-uk.ltd.uk/alastair/