Proposal for Data.List.splitBy

Sun Jan 18 08:46:23 EST 2009

On Sun, 2009-01-18 at 12:02 +0100, Marcus D. Gabriel wrote:
> Brent Yorgey wrote:
> >> P2. There should be no information loss, that is, keep the
> >> delimiters, keep the separators, keep the parts of the original
> >> list xs that satisfy a predicate p, do not lose information about
> >> the beginning and the end of the list relative to the first and
> >> last elements of the list respectively. The user of the function
> >> decides what to discard.
> >>
> >> P3. A split list should be unsplittable so as to recover the original
> >> list xs. (I made up the word unsplittable.) (P2 implies P3, but let us
> >> state this anyway.)
> >
> > I'm not sure I agree with this.
> 
> Thanks for stating this.  Dropping P3 would change my
> thinking about this topic, that is, if we drop P3, then
> I would prefer that no splitter functions are added to
> Data.List and that it is left as is.
> 
> > The problem is that much (most?) of
> > the time, people looking for a split function want to discard
> > delimiters; for example, if you have a string like "foo;bar;baz" and
> > you want to split it into ["foo","bar","baz"].
> 
> I agree with this comment when thinking about strings and what
> I would do most of the time and from a pragmatic point of view.

Indeed, the existing Data.List.words is certainly lossy and deliberately
so. It's also useful and widely used.
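
To make the lossiness concrete, here is a minimal sketch using only the
Prelude's words and unwords. The name wordsRoundTrip and the sample string
are made up for illustration:

```haskell
-- Hypothetical helper: split into words, then join them again.
wordsRoundTrip :: String -> String
wordsRoundTrip = unwords . words

main :: IO ()
main = do
  let s = "foo  bar\tbaz "            -- double space, a tab, trailing space
  print (words s)                     -- ["foo","bar","baz"]
  print (wordsRoundTrip s)            -- "foo bar baz"
  print (wordsRoundTrip s == s)       -- False: the exact whitespace is gone
```

The kind and amount of whitespace between the words cannot be recovered
from the result, which is exactly what most callers of words want.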

On the other hand, it is a widely held view that Data.List.lines should
not be lossy, i.e. that Data.List.unlines . Data.List.lines should be the
identity. With the current implementation of lines and unlines this is not
the case, because of the way a trailing newline is handled.
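
A small sketch of the trailing-newline behaviour, using only the Prelude's
lines and unlines. The name linesRoundTrip is made up for illustration:

```haskell
-- Hypothetical helper: the composition that "should" be the identity.
linesRoundTrip :: String -> String
linesRoundTrip = unlines . lines

main :: IO ()
main = do
  print (linesRoundTrip "a\nb\n")          -- "a\nb\n" (unchanged)
  print (linesRoundTrip "a\nb")            -- "a\nb\n" (a newline was added)
  print (lines "a\nb\n" == lines "a\nb")   -- True: both give ["a","b"]
```

Since lines maps both inputs to the same list, no function applied
afterwards can tell them apart, so the round trip cannot be the identity
on every string.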

Duncan