[Haskell-cafe] Re: [darcs-users] Iteratees, streams, and mmap
dagit at codersbase.com
Sat Dec 12 19:07:51 EST 2009
On Sat, Dec 12, 2009 at 1:37 PM, Johann Höchtl <johann.hoechtl at gmail.com>wrote:
> On Dec 12, 7:13 pm, Jason Dagit <da... at codersbase.com> wrote:
> > Hi Heinrich,
> > On Sat, Dec 12, 2009 at 2:42 AM, Heinrich Apfelmus <
> > apfel... at quantentunnel.de> wrote:
> > > Jason Dagit wrote:
> > > > My next experiment will be to find ways to express "take this
> > > and
> > > > apply it to a stream without letting the stream leak". One
> implication is
> > > > that gzReadFilePS should not be used outside of a core set of modules
> > > which
> > > > have been auideted to be resource concious. Another implication is
> > > we
> > > > need to be really careful about wether or not we allow returning of
> > > > sequences of patches. Possibly, we need several foldl-like functions
> > > that
> > > > open the stream internally. For example, to process the pending
> maybe we
> > > > should have:
> > > > withPending :: (a -> Patch -> a) -> IO a
> > > > And withPending would start the streaming and make sure that the
> > > > cannot be visible as a data dependency outside of withPending.
> > > Just a small comment on a potential flaw in this scheme and the
> > > observation that even the rank-2 type trick from the ST s monad
> > > wouldn't help.
> > I would say it does help, but it doesn't make it perfect.
> > > Namely, withPending does not guarantee that the stream does not leak,
> > > it only makes it more natural/convenient to formulate one's code so
> > > it doesn't leak. In particular, using (:) as argument pretty much
> > > defeats the whole purpose:
> > Right. And the iteratee library points out that your iteratees have to
> > well-behaved (I think there they say "bounded"). I'm well aware of this
> > issue and thanks for pointing it out for others who are reading along.
> > > withPending (flip (:))
> > > Fortunately, the type system can ensure that the patches don't leak.
> > > withPending :: (forall s. a -> Patch s -> a) -> IO a
> > > Now, a may not mention s and the type checker will reject flip (:)
> > > as argument. See also
> > > Oleg Kiselyov, Chung-chieh Shan.
> > > Lightweight Monadic Regions.
> > > http://www.cs.rutgers.edu/~ccshan/capability/region-io.pdf<http://www.cs.rutgers.edu/%7Eccshan/capability/region-io.pdf>
> > > for an elaboration of this technique.
> > I'm still on the fence as to whether this style of writing it will add
> > greater than the complexity it brings. I am certainly considering it :)
> > The darcs source does other things that are also fairly complex.
> > > However, the line between leaking and not leaking is very thin here. As
> > > soon as we are given for example a function
> > > name :: Patch s -> String
> > > that discards the s , its results can "leak", in the sense that we
> > > could now build a list of names
> > > foo :: IO [String]
> > > foo = withPending . flip $ (:) . name
> > > Even worse, any type a that doesn't have O(1) space usage will "leak"
> > > bar :: IO [()]
> > > bar = withPending . flip $ const (() :)
> > > In other words, exporting only a foldl' -like interface does not
> > > prevent us from writing functions that have O(n) instead of O(1) space
> > > usage. But trying to rectify that with the forall s trick is a doomed
> > > idea, too.
> > I realize it's not perfect, but the problem we have now is that it's too
> > easy to write things that have dismal space usage. If we can't force
> > space usage, how can we make it more natural to have bounded space? Or
> > least a good approximation.
> > It seems that:
> > * foldl'-style helps
> > * rank-n can help
> > * no approach I've seen *forces* the behavior we want
> > * existing code and bug reports demonstrate we need to improve the
> > situation
> > I'm open to suggestions on how to ensure the code has the space behavior
> > want. Lazy IO* and streams of patches is more compositional and natural
> > Haskell programmers, but it seems that it's too hard to ensure the code
> > reasonable space usage. At least where the darcs source is concerned.
> > Therefore, I think the status quo demonstrates that in the darcs source
> > worth experimenting with alternatives to lazy io and streams. In other
> > words, the human effort to make the code behave how we want is currently
> > high and that's the issue I want to address. I don't know how we could
> > it impossible to have space leaks, although that would be interesting.
> As a beginner to Haskell, I am only 1/3 through RWH, those lines scare
> me in a sense to question my effort. I simply can not distinguish if
> this discussion is somewhat pathological in a sense that every access
> to the outside world imposes dangers and an additional exception
> handler here and there and an additional if-statement to handle error
> return codes will suffice.
We're not talking about exception handling :) And yes, Heinrich is talking
about pathological cases.
> Or lazy evaluation, IO monads and the whole story behind
> unsafePerformIO was an additional layer of self-deception and
> unpredictable effects from the outside world and lazy evaluation can
> NEVER be satisfactory handled.
We're not talking about unsafePerformIO either.
The discussion at hand is much simpler. Given a large stream of data, how
should you style your code so that it's easy to reason about the space
usage? This is an important question in every programming language I've
ever used. Also, IO need not enter the discussion except that in darcs the
streams of data come from the disk. I think moving the discussion to
haskell-cafe was a mistake. I said what I said in a very specific context
(that of the darcs developers mailing list), which is likely no longer
Sorry for any distress this has caused you.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Haskell-Cafe