[Haskell-cafe] Re: [Haskell] ANN: random-access-list-0.1

Fri Jun 13 07:15:18 EDT 2008

Stephan Friedrichs wrote:
> Isaac Dupree wrote:
>> [...]
>>
>> Great to see it, it deserved implementing, IIRC!  I don't remember 
>> enough about it.. (and don't have Okasaki anywhere handy). Can it be 
>> lazy or infinitely long?
> 
> No, it has to be finite as it's actually a list of complete binary trees 
> whose size depend on the skew binary representation of the list's size. 
> I'll document this.

okay then

>> [...]
>>
>> Is "RandomAccessList" the best name for something that's not O(1), 
>> just O(log n)?  Or just that "RandomAccessList" is just much longer 
>> than "[]"?
> 
> Well Chris Okasaki called them "Skew Binary Random-Access Lists", which 
> is even longer :)

:)

hmm.. IndexableList? (just a thought, not sure whether I like it any better)

>>
>> don't use those unorthodox infix function names.. `cons` is hardly 
>> worse than .:. , `append` or `mappend` than .+. , and .!. than, hmm.. 
>> . Export a ++ and ! (!! ?) if you're really dedicated. But I'd prefer 
>> an `at` that's the same partial indexing operation, rather than the 
>> name .!. (I think this "at" was a new name being put somewhere else? 
>> partly because "!" is trying to be gradually used only to refer to 
>> strictness?)
> 
> Good point!
> 
>>
>> "extractHead" is an ugly name for a nevertheless standardish-meaning 
>> function... what is it usually called? uncons? headTail? 
>> (Data.Sequence, which is meant to be left-right symmetric, calls it 
>> "viewr"... except your version doesn't have the Maybe, it's partial 
>> instead, fails on empty lists)
> 
> Yes, I wasn't happy with that one either. The view-concept of 
> Data.Sequence is a good idea.

yeah, it's a good idea, although I'm not sure how much I like the 
particular syntax of how it's done in Data.Sequence (the view-types' 
constructor names, mostly)

>>
>> For "index", don't use Monad, use Maybe (I think that's what the 
>> recent libraries at haskell.org discussion concluded, in the context of 
>> switching Data.Map back to Maybe).
> 
> I was just copying the idea from Data.Map and it's usually a good thing 
> to have functions as general as possible, or why is it not?

To summarize: Monad isn't the proper abstraction for failable/Maybe. 
Maybe is an algebraic data type that *exactly* represents the spirit of 
what you're trying to do: e.g. Conor McBride said: "Maybe is the most 
general abstraction. Requiring (>>=), or even (<*>) seems excessive. 
What we need is "any f with zero and return", so why not pick the 
canonical, initial, inductively defined such thing?"  In this case the 
typeclass adds no generality to the function's behaviour (Maybe can be 
trivially converted to any other type, with a combinator even).  And the 
confusion that results, when the function is almost always used at type 
Maybe anyway.  If you want to read the whole discussion... if you 
haven't been subscribed to libraries at haskell.org , it's archived:
http://thread.gmane.org/gmane.comp.lang.haskell.libraries/9082

>> Also, Data.List has genericLength etc, to 
> 
> At the moment, I'm using the Int type for size and indexing only for one 
> reason: I haven't found a proper way to generalize it. I'd really like 
> to use the Ix class, but it doesn't provide enough functionality, it 
> only works on fixed-size intervals (i. e. for arrays, which don't change 
> their size, but a list does). Maybe someone has an idea of how to 
> realize lists with a variable starting index and size?

fair enough.  If your implementation only supports sizes up to that of 
Int (which is reasonable for a strict finite type... whereas something 
like ([1..2^34] `genericIndex` (2^33)) can probably complete in a small 
amount of memory and only a moderate amount of time on a modern machine, 
even a 32-bit one, due to laziness and garbage collection)

>> support.  Isn't "index" (like Data.List.genericIndex) supposed to be a 
>> name for a partial operation, not one that returns a Maybe?  Shouldn't 
>> "size" be named "length" (or exported as both names, since e.g. 
>> Data.Map.size, .List.length) (why is it O(1) not O(log n)? Is it 
>> possible for these RandomAccessLists to be longer than maxBound::Int?)?
> 
> The size function is in O(1) because I cache it, otherwise it would be
> 
> size (RandomAccessList xs) = sum (map fst xs)
> 
> which is O(log n). I consider the caching useful, as most applications 
> will check 0 <= i < size quite often.

sounds good

> 
>>
>> for e.g. toList, is the O(n) cost spread over traversing/demanding the 
>> items of the generated list, or all up-front, or somewhere in between?
>>
>> Why is zip slow with unbalanced lists?  Obviously, it could be 
>> implemented O(min(m,n)*(log m + log n)) just indexing each one, which 
>> would be faster for really unbalanced-size lists...  Obviously, I don't 
> 
> If two lists have exactly the same size, all the complete binary trees 
> holding the data have the same size as well. This can be zipped directly 
> and is a bit (~5% in my tests) faster.

okay, that sounds like a fair optimization, since zipping same-size 
lists is a nice thing to do anyway.  But the asymptotic speed ideally 
should still be O(min(m,n)), if possible?

>> understand the implementation.  BTW, looking at the file,
>> data RandomAccessList a
>>         = RandomAccessList {-# UNPACK #-} !Int ![(Int, CBTree a)]
>> Marking a list strict like that doesn't have much effect because it 
>> only makes the first (:) or [] strict. Did you mean for an 
>> element-strict or spine-strict(which would necessarily be 
>> non-infinite) list?
> 
> Oh, you're right, I just meant to mark the Int strict. It obviously was 
> too late yesterday!
> 
> Thanks for this feedback!
> //Stephan
>