[Haskell-cafe] How lazy can "peek" be?

Thu Aug 25 02:56:32 EDT 2005

   Hi! One of the nice things of laziness is that it allows us to 
express things concisely. For instance:

func a b =
    let ta = transmogrify a in
    case b of
        B1 -> doSomething
        B2 -> doSomethingElse ta
        B3 -> doAnotherThing ta

   It is clear why having this let-expression is a good thing. We can 
"transmogrify a" only once, and then use the "transmogrified a" only 
when (if) needed. In this case, one branch of the case expression 
doesn't need it, in which case we've effectively paid nothing for it, no 
matter how complex and time/resource-consuming "transmogrify" is.

   So far, so good.

   Now, I need to interoperate with external code using FFI. I have a 
function that gets called from outside, and gets its parameters as a 
pointer to a structure, so:

func p = do
    a <- peekByteOff p 0
    b <- peekByteOff p 4
    case b of
        B1 -> doSomething
        B2 -> doSomethingElse a
        B3 -> doAnotherThing a

   Now I see myself with a little dilemma. This code looks a lot like 
the first version, where instead of "transmogrifying", we just read the 
value from the structure through the use of the pointer. Reading the 
value from memory can't be very time consuming, but it has to cost 
something. If I don't want to pay for that cost, then I have to convert 
the code as in:

func p =
    b <- peekByteOff p 4
    case b of
        B1 -> doSomething
        B2 -> do { a <- peekByteOff p 0; doSomethingElse a }
        B3 -> do { a <- peekByteOff p 0; doAnotherThing a }

   Repetition, repetition, repetition. This does get tedious. Especially 
when the case expression has many branches, and there are many variables 
in the structure that might or might not be needed. So you see my 
dilemma. It's _very_ tempting to do something like:

func p = do
    let a = unsafePerformIO $ peekByteOff p 0
    b <- peekByteOff p 4
    case b of
        B1 -> doSomething
        B2 -> doSomethingElse a
        B3 -> doAnotherThing a

   I mean... the contents of "p" are never going to be modified, so this 
"feels" right on some level. But we're effectively breaking the pureness 
of the language. I mean... this function could be called twice with the 
same _pointer_ but different data stores in the structure, which can 
(and maybe will) cause problems depending on how evaluation proceeds.

   Is there a way to handle this nicely? I recently read about the Clean 
language. It seems like it allows (and relies) in an extension to the 
type system, that allows the program to specify uniqueness of values, so 
that two pointers values might be represented using the same bits, but 
they'd still be considered distinct values. But no such thing in 
Haskell. Maybe a good thing... trying to understand the whole 
explanation of uniqueness in Clean made my head spin. But still...

   Is it known what GHC, for instance, will do to this code when 
optimizing? Might it just do the right thing? Probably the C optimizer 
would take care of it. It feels to me that's my only hope.

   But it'd be great if the "haskell" part of the compiler could take 
care of it natively. This is something that I've been thinking about. I 
mean... the IO monad does seem to impose too much sometimes. Any IO 
action always is assumed to modify the "external world" (affect the 
execution of the actions that come after it), whether it does or not:

do
    peekByteOff p 0 -- totally useless, but it needs to be performed 
anyway before "returning"
    return ()

do
    a <- peekByteOff p 0
    b <- peekByteOff p 4 -- could be done before "a"

   Somehow, it feels like it'd be a good thing to be able to limit the 
scope of IO actions. Going too far would complicate it enormously:

do
    a <- readFile "file"
    b <- peekByteOff p 4 -- could be done in any order, really, but how 
would you express it... maybe "domains of influence", like "filesystem" 
and "memory"... not very clear.

do
    a <- peekByteOff p1 0
    pokeByteOff p2 0 1234 -- could be done in any order, as long as p1 
and p2 don't alias each other (yuck!)

   So I wouldn't want to go that far... yet :). A global thing might be 
an improvement. I'm aware it wouldn't be a monad anymore, but could it 
be something else?

   I hope this is all pointless :-).

JCAB