Proposal for Data.List.splitBy

Marcus D. Gabriel marcus at
Sun Jan 18 09:11:22 EST 2009

Duncan Coutts wrote:
> On Sun, 2009-01-18 at 12:02 +0100, Marcus D. Gabriel wrote:
>> Brent Yorgey wrote:
>>>> P2. There should be no information loss, that is, keep the delimiters,
>>>> keep the separators, keep the parts of the original list xs that satisfy
>>>> a predicate p, do not lose information about the beginning and the end
>>>> of the list relative to the first and last elements of the list
>>>> respectively. The user of the function decides what to discard.
>>>> P3. A split list should be unsplittable so as to recover the original
>>>> list xs. (I made up the word unsplittable.) (P2 implies P3, but let us
>>>> state this anyway.)
>>> I'm not sure I agree with this.
>> Thanks for stating this.  Dropping P3 would change my
>> thinking about this topic, that is, if we drop P3, then
>> I would prefer that no splitter functions are added to
>> Data.List and that it is left as is.
>>> The problem is that much (most?) of
>>> the time, people looking for a split function want to discard
>>> delimiters; for example, if you have a string like "foo;bar;baz" and
>>> you want to split it into ["foo","bar","baz"].
>> I agree with this comment when thinking about strings and what
>> I would do most of the time and from a pragmatic point of view.
> Indeed, the existing Data.List.words is certainly lossy and deliberately
> so. It's also useful and widely used.
> On the other hand it is a widely held view that Data.List.lines should
> not be lossy, i.e. that Data.List.unlines . Data.List.lines should be the
> identity. In the current implementation of unlines . lines it is not the
> case because of the way it handles a trailing newline.
> Duncan

That is an argument, if I ever read one, for not placing any
fundamental splitter functions that are lossy in Data.List.

The user of these functions should explicitly choose to lose
information.  Then the documentation in the Haskell 98 report
might have stated instead something like

  unlines . lines == id iff xs ends with '\n'

which would at least be up front.
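
To make the round-trip point concrete, here is a minimal sketch.
The names roundTrips and linesKeep are hypothetical and are not
part of Data.List; linesKeep is only one possible shape for a
non-lossy splitter, assuming each piece keeps its terminating
'\n' so that concat plays the role of "unsplit".

```haskell
-- Does splitting into lines and joining again give back the input?
-- Uses only the Prelude's lines and unlines.
roundTrips :: String -> Bool
roundTrips xs = unlines (lines xs) == xs

-- Hypothetical lossless variant (an assumption for illustration, not
-- an existing library function): every piece keeps its terminating
-- '\n', so concat . linesKeep is the identity for any input.
linesKeep :: String -> [String]
linesKeep [] = []
linesKeep xs = case break (== '\n') xs of
  (pre, [])        -> [pre]                        -- last piece, no newline
  (pre, nl : rest) -> (pre ++ [nl]) : linesKeep rest
```

In GHCi, roundTrips "a\nb\n" holds while roundTrips "a\nb" does
not, since unlines appends a final '\n'.  The empty string also
round-trips, so the condition above is only approximately an
"iff".  With linesKeep, the user who wants the lossy behaviour
has to ask for it, for example by stripping the '\n' from each
piece.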

- Marcus
