Proposal for Data.List.splitBy
Marcus D. Gabriel
marcus at gabriel.name
Sun Jan 18 09:11:22 EST 2009
Duncan Coutts wrote:
> On Sun, 2009-01-18 at 12:02 +0100, Marcus D. Gabriel wrote:
>> Brent Yorgey wrote:
>>>> P2. There should be no information loss, that is, keep the
>>>> delimiters, keep the separators, keep the parts of the original list
>>>> xs that satisfy a predicate p, do not lose information about the
>>>> beginning and the end of the list relative to the first and last
>>>> elements of the list respectively. The user of the function decides
>>>> what to discard.
>>>> P3. A split list should be unsplittable so as to recover the original
>>>> list xs. (I made up the word unsplittable.) (P2 implies P3, but let us
>>>> state this anyway.)
>>> I'm not sure I agree with this.
>> Thanks for stating this. Dropping P3 would change my
>> thinking about this topic, that is, if we drop P3, then
>> I would prefer that no splitter functions are added to
>> Data.List and that it is left as is.
>>> The problem is that much (most?) of
>>> the time, people looking for a split function want to discard
>>> delimiters; for example, if you have a string like "foo;bar;baz" and
>>> you want to split it into ["foo","bar","baz"].
>> I agree with this comment when thinking about strings and what
>> I would do most of the time and from a pragmatic point of view.
> Indeed, the existing Data.List.words is certainly lossy, and
> deliberately so. It's also useful and widely used.
> On the other hand, it is a widely held view that Data.List.lines should
> not be lossy, i.e. that Data.List.unlines . Data.List.lines should be the
> identity. With the current implementations of lines and unlines this is
> not the case, because of the way lines handles a trailing newline.
An argument for not placing any fundamental splitter functions
in Data.List that are lossy if I ever read one.
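To make P2 and P3 concrete, here is a minimal sketch of a lossless splitter. The names splitKeep, unsplit and dropSeparators are hypothetical, chosen only for illustration; this is not the proposed Data.List.splitBy. It tags every run of elements as separator (Left) or non-separator (Right), so nothing is discarded until the caller asks for it.

```haskell
import Data.Function (on)
import Data.List (groupBy)

-- Hypothetical name, for illustration only.
-- Break a list into maximal runs of elements that agree on the predicate p.
-- Left holds a run of separators, Right a run of everything else.
splitKeep :: (a -> Bool) -> [a] -> [Either [a] [a]]
splitKeep p = map tag . groupBy ((==) `on` p)
  where
    -- groupBy never yields an empty group, so the first clause always
    -- matches a run; the guard decides which side it goes on.
    tag run@(x:_) | p x = Left run
    tag run             = Right run

-- P3: concatenating the runs recovers the original list.
unsplit :: [Either [a] [a]] -> [a]
unsplit = concatMap (either id id)

-- The caller explicitly chooses to be lossy.
dropSeparators :: [Either [a] [a]] -> [[a]]
dropSeparators parts = [run | Right run <- parts]
```

With this sketch, dropSeparators (splitKeep (== ';') "foo;bar;baz") gives ["foo","bar","baz"], while unsplit (splitKeep (== ';') xs) == xs holds for any xs, including inputs with leading, trailing or repeated separators.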
The user of these functions should explicitly choose to lose
information. Then the documentation in the Haskell 98 report
might have stated instead something like
    unlines (lines xs) == xs  iff  xs is empty or ends with '\n'
which would at least be up front.
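The behaviour of the trailing newline can be checked directly. The following is a small sketch using only lines and unlines from the Prelude; roundTrips is a hypothetical helper name introduced here for illustration.

```haskell
-- Hypothetical helper: does unlines . lines give back its input?
roundTrips :: String -> Bool
roundTrips xs = unlines (lines xs) == xs

-- Inputs that end with a newline survive the round trip.
terminated :: [String]
terminated = ["a\n", "a\nb\n", "\n"]

-- Inputs whose last line is not terminated gain a '\n' and so do not.
unterminated :: [String]
unterminated = ["a", "a\nb"]
```

Here all roundTrips terminated is True and any roundTrips unterminated is False. The empty string also round-trips, since lines "" is [] and unlines [] is "".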