Data.List.sortBy creates unnecessary thunks
Siddhanathan Shanmugam
siddhanathan+eml at gmail.com
Sun Mar 26 07:10:54 UTC 2017
Data.List.sortBy uses a DList-like representation to efficiently chunk a
list, but the final conversion from DList to List is not strict. This
leaves unnecessary suspensions while sorting. This was first noticed in
https://github.com/ghc-proposals/ghc-proposals/pull/50, where sorting a
random list with a naive merge sort is faster than the one in base.
Proposing the following change, which should improve performance and space
usage.
```diff
diff --git a/libraries/base/Data/OldList.hs b/libraries/base/Data/OldList.hs
index c3618c42a9..a68c4f42af 100644
--- a/libraries/base/Data/OldList.hs
+++ b/libraries/base/Data/OldList.hs
@@ -851,7 +851,7 @@ sortBy cmp = mergeAll . sequences
ascending a as (b:bs)
| a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
- ascending a as bs = as [a]: sequences bs
+ ascending a as bs = *as [a] `seq`* as [a] : sequences bs
mergeAll [x] = x
mergeAll xs = mergeAll (mergePairs xs)
```
And, as a side note, can we please add more type signatures and comments to
the sort/sortBy function?
Cheers,
Sid
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20170326/eeddc5d3/attachment.html>
More information about the Libraries
mailing list