[Haskell-cafe] RFC: demanding lazy instances of Data.Binary
duncan.coutts at worc.ox.ac.uk
Tue Nov 20 05:35:21 EST 2007
On Mon, 2007-11-19 at 20:06 -0600, Nicolas Frisby wrote:
> In light of this discussion, I think the "fully spine-strict list
> instance does more good than bad" argument is starting to sound like a
> premature optimization. Consequently, using a newtype to treat the
> necessarily lazy instances as special cases is an inappropriate
> My current opinion: If Data.Binary makes both a fully strict list
> instance (not ) and a fully lazy list instance (this would be the
> default for ) available, then that will also make available all of
> the other intermediate strictness. I'll elaborate that a bit. If the
> user defines a function appSpecificSplit :: MyDataType -> [StrictList
> a], then the user can control the compactness and laziness of the
> serialisation by tuning that splitting function. Neil's 255 schema
> fits as one particular case, the split255 :: [a] -> [StrictList a]
> function. I would hesitate to hard code a number of elements, since it
> certainly depends on the application and only exposing it as a
> parameter maximizes the reusability of the code.
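
For illustration, the splitting idea quoted above might look roughly like the following. This is only a sketch under assumptions: StrictList, fromList, toList and splitEvery are hypothetical names that do not exist in Data.Binary, and the chunk size is a parameter rather than a hard-coded 255.

```haskell
-- Hypothetical sketch: StrictList, fromList, toList and splitEvery are
-- illustrative names, not part of Data.Binary.

-- A list whose element and tail are forced whenever a cell is built,
-- so evaluating a StrictList to WHNF forces its whole spine.
data StrictList a = Nil | Cons !a !(StrictList a)

fromList :: [a] -> StrictList a
fromList = foldr Cons Nil

toList :: StrictList a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

-- Split a lazy list into strict lumps of at most n elements.
-- The outer list stays lazy, so each lump is only built on demand.
splitEvery :: Int -> [a] -> [StrictList a]
splitEvery n xs
  | n <= 0    = error "splitEvery: chunk size must be positive"
  | null xs   = []
  | otherwise = fromList lump : splitEvery n rest
  where
    (lump, rest) = splitAt n xs

-- The 255-element scheme as one particular case.
split255 :: [a] -> [StrictList a]
split255 = splitEvery 255
```

Because only the outer list is lazy, the splitting function alone decides how much of the input is forced at a time.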
Fully lazy is the wrong default here, I think, but fully strict is also
not right. What would fit best with the style of the rest of the
Data.Binary library is to be lazy in a lumpy way. This can give
excellent performance, whereas being fully lazy cannot (because the
chunk size becomes far too small, which increases the overhead).
Has anyone actually said they want the list serialisation to be fully
lazy? Is there a need for anything more than just not being fully
strict? If there is, I don't see it. If it really is needed it can be
added just by flushing after serialising each element.
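
A minimal sketch of what "lazy in a lumpy way" could mean for a list encoder, assuming a wire format of length-prefixed lumps terminated by a zero byte. The name putLumpy and the format itself are assumptions made for illustration and are not the library's actual list instance; only put, putWord8, flush and runPut come from the binary package.

```haskell
import Data.Binary (Binary, put)
import Data.Binary.Put (Put, flush, putWord8, runPut)
import qualified Data.ByteString.Lazy as L
import Data.Word (Word8)

-- Hypothetical encoder: writes the list as lumps of at most n elements
-- (n is clamped to 1..255), each preceded by a one-byte count, with a
-- zero count marking the end. Only one lump's spine is forced at a time.
putLumpy :: Binary a => Int -> [a] -> Put
putLumpy n = go
  where
    lumpSize = max 1 (min 255 n)
    go [] = putWord8 0
    go xs = do
      let (lump, rest) = splitAt lumpSize xs
      putWord8 (fromIntegral (length lump))  -- forces this lump's spine
      mapM_ put lump
      flush                                  -- end the current output chunk
      go rest

-- Three bytes encoded in lumps of two.
example :: L.ByteString
example = runPut (putLumpy 2 [1, 2, 3 :: Word8])
```

Setting the lump size to 1 corresponds to flushing after every element, which is the fully lazy variant mentioned above; larger lumps trade laziness for fewer, bigger chunks.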
> "Reaching for the sky" idea: Does the Put "monad" offer enough
> information for an instance to be able to recognize when it has filled
> a lazy bytestring's first chunk? It could cater its strictness (i.e.
> vary how much of the spine is forced before any output is generated)
> in order to best line up with the chunks of lazy bytestring it is
> producing. This might be trying to fit too much into the interface.
> And it might even make Put an actual monad ;)
That is something I've considered: serialise just as much of the list as
is necessary to fill the remainder of a chunk. Actually we'd always fill
just slightly more than a chunk because we don't know how big each list
element will be; we only know when we've gone over.
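
The overshoot can be seen in a small pure model. This is a sketch only: takeUntilOver is a hypothetical helper, and the size function stands in for "serialise the element and see how many bytes came out", which a real Put-based version would only learn after the fact.

```haskell
-- Hypothetical model of filling the remainder of a chunk.
-- Elements are taken while there is still room left; the element that
-- crosses the limit is included, because we only notice the limit once
-- we have gone over it.
takeUntilOver :: Int -> (a -> Int) -> [a] -> ([a], [a])
takeUntilOver budget size = go budget
  where
    go _ [] = ([], [])
    go remaining (x : xs)
      | remaining <= 0 = ([], x : xs)
      | otherwise =
          let (taken, rest) = go (remaining - size x) xs
          in (x : taken, rest)
```

For example, with 10 bytes of room and elements of 4 bytes each, three elements (12 bytes) are taken before the limit is noticed, and the rest of the list is left unforced for the next chunk.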