[Haskell-cafe] A Monad for on-demand file generation?
Derek Elkins
derek.a.elkins at gmail.com
Mon Jun 30 08:08:31 EDT 2008
On Mon, 2008-06-30 at 12:04 +0200, Joachim Breitner wrote:
> Hi,
>
> for an application such as an image gallery generator, that works on a
> bunch of input files (that are assumed to be constant during one run of
> the program) and generates or updates a bunch of output files, I often
> had the problem of manually tracking what input files a certain output
> file depends on, to check the timestamps if it is necessary to re-create
> the file.
>
> I thought a while how to do this with a monad that does the bookkeeping
> for me. Assuming it’s called ODIO (On demand IO), I’d like a piece of
> code like this:
>
>   do file1 <- readFileOD "someInput"
>      file2 <- readFileOD "someOtherInput"
>      writeFileOD "someOutput" (someComplexFunction file1 file2)
>
> only actually read "someInput" and "someOtherInput", do the calculation
> and write the output if these have newer time stamps than the output.
>
> The problem I stumbled over is that the type of >>=,
>   (>>=) :: Monad m => m a -> (a -> m b) -> m b
> means that I cannot „look ahead“ to see which files would be written without
> actually reading the requested file. Of course this is not always
> possible, although I expect this code to be the exception:
>
>   do file1 <- readFileOD "someInput"
>      file2 <- readFileOD "someOtherInput"
>      let filename = decideFileNameBasedOn file2
>      writeFileOD filename (someComplexFunction file1 file2)
>
> But assuming that the input does not change during one run of the
> program, it should be safe to use "unsafeInterleaveIO" to only open and
> read the input when used. Then, the readFileOD could put the timestamp
> of the read file in a Monad-local state and the writeFileOD could, if
> the output is newer than all inputs listed in the state, skip the
> writing and thus the unsafeInterleaveIO’ed file reads are skipped as
> well, if they were not required for deciding the flow of the program.
>
> One nice thing is that the implementation of (>>) knows that files read
> in the first action will not affect files written in the second, so in
> contrast to MonadState, we can forget about them, which I hope leads to
> quite good guesses as to what files are relevant for a certain
> writeFileOD operation. Also, a function
> cacheResultOD :: (Read a, Show a) => FilePath -> a -> ODIO a
> can be used to write an (expensive) intermediate result, such as the
> extracted exif information from a file, to disk, so that it can be used
> without actually re-reading the large image file.
>
> Is that a sane idea?
>
> I’m also considering using this example for a talk about monads at the
> GPN¹ next weekend.
You may want to look at Magnus Carlsson's "Monads for Incremental
Computing" http://citeseer.comp.nus.edu.sg/619122.html
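
For illustration, here is a minimal sketch of the ODIO idea from the quoted
message. It is not taken from Carlsson's paper; it is only a state monad over
IO that remembers the newest modification time among the inputs registered so
far. The names unODIO, runODIO and example, and the body given to
someComplexFunction, are assumptions made for this sketch. It leaves out
cacheResultOD and assumes the directory and time packages are available.

```haskell
import Control.Monad (unless)
import Data.Time.Clock      -- UTCTime
import System.Directory     -- doesFileExist, getModificationTime
import System.IO.Unsafe (unsafeInterleaveIO)

-- | The state is the newest modification time among all input files
-- registered so far (Nothing if no input has been registered yet).
newtype ODIO a = ODIO { unODIO :: Maybe UTCTime -> IO (a, Maybe UTCTime) }

instance Functor ODIO where
  fmap f (ODIO g) = ODIO $ \s -> do
    (a, s') <- g s
    return (f a, s')

instance Applicative ODIO where
  pure a = ODIO $ \s -> return (a, s)
  ODIO f <*> ODIO g = ODIO $ \s -> do
    (h, s')  <- f s
    (a, s'') <- g s'
    return (h a, s'')

instance Monad ODIO where
  ODIO g >>= k = ODIO $ \s -> do
    (a, s') <- g s
    unODIO (k a) s'

-- | Run an action, starting with no registered inputs.
runODIO :: ODIO a -> IO a
runODIO m = fst <$> unODIO m Nothing

-- | Record the input's timestamp eagerly, but defer opening and reading
-- the file until its contents are demanded. This is only safe under the
-- assumption that inputs do not change during the run.
readFileOD :: FilePath -> ODIO String
readFileOD path = ODIO $ \newest -> do
  t        <- getModificationTime path
  contents <- unsafeInterleaveIO (readFile path)
  return (contents, max newest (Just t))

-- | Write the output only if it is missing or not strictly newer than
-- every input registered so far. When the write is skipped, the lazy
-- contents are never forced, so the deferred reads never happen.
writeFileOD :: FilePath -> String -> ODIO ()
writeFileOD path contents = ODIO $ \newest -> do
  exists   <- doesFileExist path
  upToDate <- if exists
                then do t <- getModificationTime path
                        return (Just t > newest)
                else return False
  unless upToDate (writeFile path contents)
  return ((), newest)

-- | Placeholder for the expensive computation in the quoted message.
someComplexFunction :: String -> String -> String
someComplexFunction = (++)

-- | The first snippet from the quoted message, unchanged.
example :: ODIO ()
example = do
  file1 <- readFileOD "someInput"
  file2 <- readFileOD "someOtherInput"
  writeFileOD "someOutput" (someComplexFunction file1 file2)
```

Because the state is threaded through every bind, each writeFileOD is compared
against all inputs read before it. That is more conservative than the (>>)
refinement described in the quote, which this sketch does not attempt.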