[Haskell-cafe] Patterns for processing large but finite streams
Heinrich Apfelmus
apfelmus at quantentunnel.de
Fri Jul 1 11:34:54 CEST 2011
Eugene Kirpichov wrote:
>>> Plain old lazy lists do not allow me to combine multiple concurrent
>>> computations, e.g. I cannot define average from sum and length.
>
> I meant the average of the whole list - given a sumS and lengthS ("S"
> for "Stream"), write meanS as something like liftS2 (/) sumS lengthS.
>
> Or is that possible with lazy lists too?
>
> (looks like arrows actually - which arrow is appropriate here?)
That's a very good point. Just to clarify for everyone: Eugene wants to
write the function average almost *literally* as
average xs = sum xs / length xs
but he wants the functions sum and length to fuse, so that the input
stream xs is *not* shared as a whole.
I have thought about this problem for a while actually and have observed
the following:
1) You are not looking for a representation of streams, but for a
representation of *functions* on streams. The essence of a function on
streams is its case analysis of the input. Hence, the simplest solution
is to make the case analysis explicit:
data StringTo a = CaseOf a (Char -> StringTo a)
-- function on a stream (here: String)
interpret :: StringTo a -> (String -> a)
interpret (CaseOf nil cons) [] = nil
interpret (CaseOf nil cons) (x:xs) = interpret (cons x) xs
instance Applicative StringTo where
pure a = CaseOf a (const $ pure a)
(CaseOf nil1 cons1) <*> (CaseOf nil2 cons2) =
CaseOf (nil1 $ nil2) (\c -> cons1 c <*> cons2 c)
length = go 0 where go n = CaseOf n (\_ -> go $! n+1)
average = liftA2 (/) sum length
In other words, if you reify case .. of expression , you will be able
to fuse them.
2) If Haskell were to support some kind of evaluation under the lambda
(partial evaluation, head normal form instead of weak head normal form),
it would be unnecessary to make the case expressions implicit. Rather,
the applicative instance could be written as follows
instance Applicative ((->) String) where
pure a = const a
f <*> x = \cs -> case cs of
[] -> f [] $ x []
(c:cs) ->
let f' cs = f (c:cs) -- partial evaluation on this
x' cs = x (c:cs)
in f' `partialseq` x' `partialseq` (f' <*> x') cs
We could simply write
average = liftA2 (/) sum length
and everything would magically fuse.
3) John Hughes has already thought about this problem in his PhD thesis.
:) (but it is not available for download on the internet, unfortunately.
:( ). His solution was a SYNCHLIST primitive in conjunction with some
sort of parallelism PAR. Basically, the SYNCHLIST primitive only allows
simultaneous access to the input stream and the parallelism is used to
make that simultaneity happen.
Best regards,
Heinrich Apfelmus
--
http://apfelmus.nfshost.com
More information about the Haskell-Cafe
mailing list