[Haskell-cafe] showing a user defined type

Ryan Ingram ryani.spam at gmail.com
Thu May 21 01:14:06 EDT 2009


The "unsafe" is that it's possible to write weird_chain_take; nothing
is wrong with chain_take, the unsafeness is that the data structure
admits things that aren't really chains based on passing odd arguments
to "f".

That said, my "Stream" definition wastes time constructing thunks;
building closures isn't free.  The Haskell stream fusion library
solves this problem like this:

data Step s a = Yield a s | Skip s | Done

data Stream a = forall s. MkStream !s (s -> Step s a)
-- "s" is hidden by the forall, the type of the constructor:
-- MkStream :: forall a s. s -> (s -> Step s a) -> Stream a


Now, the function inside the stream is packaged with an existential
"state" type which is opaque from outside the stream:

stream_view :: Stream a -> Maybe (a, Stream a)
stream_view (MkStream st next) = case next st of
    Done -> Nothing
    Skip new_st -> stream_view (MkStream new_st next)
    Yield a new_st -> Just (a, MkStream new_st next)

Notice that you no longer have to construct function closures: the
"advance" function stays the same:

ints_stream :: Int -> Stream Int
ints_stream n0 = Stream n0 (\n -> Yield n (n+1))

The state type in ints_stream is still Int, but there's no way to make
use of that information to "leak" the state representation.  Pretty
cool! :)

A side discussion: why Skip?  Consider implementing filter on streams
without Skip:

broken_filter_stream :: (a -> Bool) -> Stream a -> Stream a
broken_filter_stream p (MkStream st next) = MkStream st go where
    go st = case next st of
        Done -> Done
        Yield a new_st -> if (p a) then (Yield a new_st) else (go new_st)

But this version is recursive.  It turns out if you can make all your
stream functions non-recursive, the optimizer can do some crazy
awesome things with them that it can't do on recursive versions.  With
Skip, it's easy to solve this problem:

filter_stream :: (a -> Bool) -> Stream a -> Stream a
filter_stream p (MkStream st next) = MkStream st go where
    go st = case next st of
        Done -> Done
        Skip new_st -> Skip new_st
        Yield a new_st -> if (p a) then (Yield a new_st) else (Skip new_st)

  -- ryan

On Wed, May 20, 2009 at 6:34 PM, michael rice <nowgate at yahoo.com> wrote:
> Thanks for the extra patience.
>
> What does "unsafe" mean? From my perspective, you had a perfectly good
> chain_take function
>
> fun chain_take(0,_) = nil
> |   chain_take(n,Link(i,f)) = i :: chain_take(n-1,f(i))
>
> - chain_take(5,always1(6));
> val it = [1,1,1,1,1] : int list
>
> Putting aside weird_chain_take for now, what, if anything, is unsafe about
> chain_take?
>
> Michael
>
> --- On Wed, 5/20/09, Ryan Ingram <ryani.spam at gmail.com> wrote:
>
> From: Ryan Ingram <ryani.spam at gmail.com>
> Subject: Re: [Haskell-cafe] showing a user defined type
> To: "michael rice" <nowgate at yahoo.com>
> Cc: "Brandon S. Allbery KF8NH" <allbery at ece.cmu.edu>,
> haskell-cafe at haskell.org
> Date: Wednesday, May 20, 2009, 2:21 PM
>
> Actually, I was saying that "chain" already emulates laziness, just in
> a somewhat unsafe way, as demonstrated by weird_take.  In Haskell
> you'd probably just write
>
>> ints :: Int -> [Int]
>> ints n = [n..]
>
> and be done with it.
>
>   -- ryan
>
> On Wed, May 20, 2009 at 9:41 AM, michael rice <nowgate at yahoo.com> wrote:
>> OK, I think I understand. You were explaining how ML could be made to
>> emulate Haskell laziness using streams, ala Scheme type delayed
>> evaluation,
>> so it's kind of like you were explaining a question I hadn't quite asked
>> yet, which maybe explains my puzzlement, I hope.
>>
>> Also, though my understanding of both Haskell and ML syntax is still
>> rudimentary, I did catch an error in your definition of chain_take: the
>> first arg of the cons should be i rather than n.
>>
>> I'm still going through your code and may have further questions.
>>
>> Thanks for your input.
>>
>> Michael
>>
>>
>> --- On Wed, 5/20/09, Ryan Ingram <ryani.spam at gmail.com> wrote:
>>
>> From: Ryan Ingram <ryani.spam at gmail.com>
>> Subject: Re: [Haskell-cafe] showing a user defined type
>> To: "michael rice" <nowgate at yahoo.com>
>> Cc: "Brandon S. Allbery KF8NH" <allbery at ece.cmu.edu>,
>> haskell-cafe at haskell.org
>> Date: Wednesday, May 20, 2009, 4:12 AM
>>
>> (Apologies for my mutilation of ML syntax, I don't completely know the
>> language)
>>
>> Consider the ML type int list, and this function to build one:
>>
>> broken_repeat :: int -> int list
>> broken_repeat n = Cons(n, broken_repeat(n))
>>
>> This function is recursive, and doesn't terminate; it tries to build
>> an infinite list of ints and your computer runs out of heap and/or
>> stack trying to evaluate it.
>>
>> But the "chain" type doesn't have this problem; you can see it as an
>> int list that gets evaluated "on demand":
>>
>> repeat :: int -> chain
>> repeat(n) = Link(n, repeat)
>>
>> always1 :: int -> chain
>> always1(_) = Link(1, always1)
>>
>> chain_take :: int * chain -> int list
>> chain_take (0,_) = Nil
>> chain_take (n,Link(i,f)) = Cons(n, chain_take(n-1, f(i)))
>>
>> But, nothing in the "chain" type stops you from passing a different
>> value to the function in the link:
>>
>> weird_take :: int * int * chain -> int list
>> weird_take (0,_,_) = Nil
>> weird_take (n,v,Link(i,f)) = Cons(i, weird_take(n-1,v,f(v)))
>>
>> Now, it's possible that chain_take returns the same list for two
>> different "chain" inputs, but weird_take might return different lists
>> depending on how f is implemented.  For example:
>>     chain_take(5, repeat(1)) = [1,1,1,1,1]
>>     chain_take(5, always1(1) = [1,1,1,1,1]
>>
>>     weird_take(5, 2, repeat(1)) = [1,2,2,2,2]
>>     weird_take(5, 2, always1(1)) = [1,1,1,1,1]
>>
>> One way to fix this is to embed the "state" of the chain in the closure
>> itself.
>>
>> So, in ML, the type
>>    unit -> X
>> is commonly called a "thunk"; it can be used to delay computation
>> until later, until it's demanded, just like any lazy value in Haskell.
>>
>> f :: unit -> Int
>> f () = 1
>>
>> This f isn't very useful; it's basically the same as "1".  But
>> consider this type:
>>
>> datatype stream = Stream of (int * (unit -> stream))
>>
>> stream1 () = Stream(1, stream1)
>>
>> stream_take :: int*stream -> int list
>> stream_take(0,_) = Nil
>> stream_take(n,Stream(i,f)) = Cons(i, stream_take(n-1, f()))
>>
>> Now there is no way to pass a different value like we did in
>> weird_take; there's only ().
>>
>> The difference is that the state gets embedded in the closure for the
>> thunk:
>>
>> stream_ints :: int -> (unit -> stream)
>> stream_ints = fun n => fun () => Stream(n, stream_ints(n+1))
>>
>> What you are doing here is encoding laziness; the Haskell version of this
>> type:
>>
>>> data Stream = Stream !Int Stream -- !Int means the Int value is strict
>>> stream1 = Stream 1 stream1
>>> stream_ints n = Stream n (stream_ints(n+1))
>>
>>> stream_take :: Int -> Stream -> [Int]
>>> stream_take 0 _ = []
>>> stream_take n (Stream x xs) = x : stream_take (n-1) xs
>>
>> No extra (\() -> ...) thunk is required, due to laziness :)
>>
>>   -- ryan
>>
>> On Tue, May 19, 2009 at 4:25 PM, michael rice <nowgate at yahoo.com> wrote:
>>> Hi Ryan,
>>>
>>> I'm afraid you've lost me. Maybe if you showed how this would be used in
>>> ML
>>> I would get the picture.
>>>
>>> Michael
>>>
>>> --- On Tue, 5/19/09, Ryan Ingram <ryani.spam at gmail.com> wrote:
>>>
>>> From: Ryan Ingram <ryani.spam at gmail.com>
>>> Subject: Re: [Haskell-cafe] showing a user defined type
>>> To: "michael rice" <nowgate at yahoo.com>
>>> Cc: "Brandon S. Allbery KF8NH" <allbery at ece.cmu.edu>,
>>> haskell-cafe at haskell.org
>>> Date: Tuesday, May 19, 2009, 2:40 PM
>>>
>>> On Tue, May 19, 2009 at 7:07 AM, michael rice <nowgate at yahoo.com> wrote:
>>>> A little further along in "The Little MLer" the ints function is
>>>> replaced
>>>> by
>>>> other functions like primes and fibs, which also return Links:
>>>>
>>>> fun primes(n)
>>>>   = if is_prime(n+1)
>>>>  then Link(n+1,primes)
>>>>  else primes(n+1)
>>>>
>>>> fun fibs(n)(m)
>>>>   = Link(n+m,fibs(m))
>>>>
>>>> which are passed to chain_item:
>>>>
>>>> fun chain_item(n,Link(i,f))
>>>>   = if eq_int(n,1)
>>>>   then i
>>>>   else chain_item(n-1,f(i))
>>>>
>>>> which can be called to request the nth (12th) prime number beginning at
>>>> 1.
>>>>
>>>> - chain_item(12,primes(1));
>>>> GC #0.0.0.1.3.61:   (1 ms)
>>>> val it = 37 : int
>>>> -
>>>>
>>>> So I guess the answer to your question about whether the function is
>>>> ever
>>>> called with a different value may be, yes.
>>>
>>> Actually, it's not calling it with another value; notice that
>>> chain_item calls f(i), with i coming directly from the chain.
>>> Consider this alternate definition:
>>> (I'm not sure the syntax is exactly right, but you get the idea)
>>>
>>> datatype chain =
>>>   Link of (int * ( unit -> chain ))
>>>
>>> fun intsFrom(n) = fun unit => (n, intsFrom (n+1))
>>> fun ints(n) = intsFrom n ()
>>>
>>> Now you *can't* call the function embedded in the link with another
>>> value.
>>>
>>> fun chain_item(n,Link(i,f))
>>>   = if eq_int(n,1)
>>>   then i
>>>   else chain_item(n-1,f unit)
>>>
>>> And this type for "chain" is almost the same as [Int] in Haskell, due
>>> to laziness.
>>>
>>>   -- ryan
>>>
>>>
>>
>>
>
>


More information about the Haskell-Cafe mailing list