High-level technique for program options handling

Tomasz Zielonka t.zielonka at students.mimuw.edu.pl
Tue Jan 20 10:41:39 EST 2004


On Mon, Jan 19, 2004 at 02:17:42PM +0000, Alastair Reid wrote:
> On Sunday 18 January 2004 15:42, Tomasz Zielonka wrote:
> > [much explanation of his option processing approach elided]
> 
> Interesting technique - lots of cool ideas there.

Thanks :-)

> I too find getOpts to be a great base but have taken a different
> approach when writing console-mode Unix programs.  Part of my
> approach is implemented (download
> http://www.cs.utah.edu/flux/knit/cmi.html for GPL'ed program and look
> in cmi/src/utils/FluxUtils/Prog.hs) and part is still in my head
> waiting for an excuse to go cleanup the code.

> Some of these tricks can be merged with Tomasz's technique (e.g.,
> replace a call to writeFile with a call to writeOutput) while others
> are orthogonal (e.g., Tomasz deals with arguments one at a time
> whereas some of my tricks look for duplicated arguments or omitted
> arguments).

Both techniques - the traditional approach with sum Flag datatype and my
approach with product Options datatype - can be seen as being dual to
themselves, as you can implement one of top of the other.

Moreover, there is an easy migration path from traditional option
processing to my technique - you just build a list of Flag values as one
of Options fields. This way you can migrate your options one at a time.

    data Flag = Verbose
	      | Version
	      -- | Input (Maybe String) -- this one moved to Options field
	      | Output String
	deriving Show

    data Options = Options
	{ optFlags  :: [Flag] -> [Flag]
	, optInput  :: IO String
	}

    startOptions :: Options
    startOptions = Options
	{ optFlags  = id
	, optInput  = getContents
	}

    options :: [OptDescr (Options -> IO Options)]
    options =
	[ Option "h" ["help"]
	    (NoArg (\opt -> exitHelp))
	    "Show usage info"

	, Option "i" ["input"]
	    (ReqArg
		(\arg opt -> return opt { optInput = readInput arg })
		"FILE")
	    "Input file, - for stdin"

	, Option "o" ["output"]
	    (ReqArg
		(appendFlag . Output)
		"FILE")
	    "Output file, - for stdout"

	, Option "v" ["verbose"]
	    (NoArg
		(appendFlag Verbose))
	    "Be verbose"

	, Option "V" ["version"]
	    (NoArg
		(appendFlag Version))
	    "Print version"
	]
      where
	appendFlag :: Flag -> (Options -> IO Options)
	appendFlag flag opts =
	    return opts { optFlags = (flag :) . (optFlags opts) }

    main = do
	(opts, _) <- parseOptions

	let flags = optFlags opts []

	...

I also thought about nesting one Options datatype in another datatype.
This could be done with a function with type:

    (a -> b) ->
    (a -> b -> a) ->
    [OptDescr (b -> IO b)] ->
	[OptDescr (a -> IO a)]

It would allow to divide options in groups.

> I would be very interested in comments on this code, the Unix idioms they
> implement, Unix idioms omitted, applicability to Windows, MacOS, improving 
> error messages, etc.

I have a question about error reporting. You use 'error' quite often. I
think that this can cause errors to pop up at strange moments during
program evaluation. It this a real problem? I prefer reporting errors
early in the IO monad. I think there is some trade-off involved, but I
can't name it now.

> Here's a list of common Unix idioms and how I implement them:
> 
> 1) Interpreting the filename "-" as stdin or stdout
> 
>    I use this function (and a similar function for reading
>    input). 

>   [Trivial detail: I use pretty printing for all I/O in my programs.]

Interesting approach, you keep compositionality of Doc up to the last
moment. It certainly helps to make program's output look prettier.

>    -- |
>    -- Write to file ("-" means stdout)
>    writeOutput :: FilePath -> Doc -> IO ()
>    writeOutput "-" output = do
>      printDoc PageMode stdout output
>    writeOutput outfile output = do
>      h <- openFile outfile WriteMode
>      printDoc PageMode h output
>      hClose h
>    
> 2) Treating arguments of the form 'VARNAME=VALUE' like 
>    environment variables (cf. GNU make)
> and
> 3) Printing usage info for malformed command lines

Shouldn't be too difficult to adapt to my technique.

I've got to make some name for it :)

> 4) An option can be specified 0 or 1 times:
> 
>   Filter options using this function
> 
>    -- | 
>    -- Extract value from a list of length at most one.
>    uniqueWithDefault :: String -> a -> [a] -> a
>    uniqueWithDefault what d []  = d
>    uniqueWithDefault what d [a] = a
>    uniqueWithDefault what d _   = error $ "At most one " ++ what ++ " may be 
> specified"

I used to ignore superfluous options, but I agree that reporting this
would be nicer.

In my approach you can still use your technique, if you build a list of
values for an option.

    data Options = Options
	{ optOutfile :: [String]
	, ...
	}

    startOpts :: Options
    startOpts = Options
	{ optOutfile = []
	, ...
	}

    options :: [ OptDescr (Options -> IO Options) ]
    options =
       [ Option "o" ["output"]
           (ReqArg
               (\arg opt -> return opt { optOutfile = arg : optOutfile out })
               "FILE")
           "Output file"
       , 
	 ...

You could also use some other Monoid, for example one that explicitly
discourages more than one value:

    data T a = Zero | One a | Many

    instance Monoid (T a) where
	mempty = Zero
	mappend Zero	x    = x
	mappend (One a) Zero = One a
	mappend _	_    = Many

BTW. What would be a good name for T?

> 5) An option must be specified exactly once:

Same as above.

I would love to write more, but I have to go to work (where I mostly
struggle with C++ :( but occasionally manage to smuggle Haskell :))

Best regards,
Tomasz

-- 
.signature: Too many levels of symbolic links


More information about the Haskell mailing list