Make lines stricter to fix a space leak

Daniel Fischer daniel.is.fischer at web.de
Mon Sep 27 09:12:49 EDT 2010


On Monday 27 September 2010 10:54:56, Christian Maeder wrote:
> I wonder if a generic version costs performance?
>
>  lines = breaksBy (== '\n')
>   (or "linesBy" or "splitBy")
>
> Cheers Christian

Gut feeling said it shouldn't and benchmarking supports that.

So, is that operation generic enough to add to the Data.List API?
If yes, the best name has to be found.
On the one hand, splitBy or breaksBy sounds nicer than linesBy, because 
generically, it has nothing to do with lines. On the other hand, neither 
break nor split[At] removes the separators, while lines does.
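
To make the naming question concrete, here is a minimal sketch of what such a generic function could look like. It is an illustration only, not the code that was benchmarked; the name linesBy and the helper lines' are placeholders for whatever the proposal ends up using.

```haskell
-- Hypothetical sketch of a generic 'lines': split a list at every
-- element satisfying the predicate, dropping those elements.
linesBy :: (a -> Bool) -> [a] -> [[a]]
linesBy _ [] = []
linesBy p xs =
  case break p xs of
    (l, rest) ->
      l : case rest of
            []        -> []               -- no separator left, done
            (_ : xs') -> linesBy p xs'    -- drop the separator, go on

-- 'lines' expressed through the generic version (placeholder name).
lines' :: String -> [String]
lines' = linesBy (== '\n')

-- For comparison, break and splitAt keep the separator in the result:
--   break (== '\n') "ab\ncd"     == ("ab", "\ncd")
--   splitAt 2 "ab\ncd"           == ("ab", "\ncd")
-- whereas the sketch above drops it, like lines:
--   linesBy (== '\n') "ab\ncd\n" == ["ab", "cd"]
```

That difference is why a name derived from lines may describe the behaviour better than one derived from break or split.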

Also, there's linesBy in Data.List.Split 
(http://hackage.haskell.org/packages/archive/split/0.1.2.1/doc/html/Data-List-Split.html)
which does exactly that.

Data.List.Split.linesBy is faster, though (for reasonably short lines). 
However, it dies a horrible death (Stack space overflow: current size 
67108864 bytes.) for very long lines, and it is then much slower even if 
you give it enough stack to complete.

So I would say, put the generic version into Data.List as linesBy.
I think that deserves its own proposal.

