[Git][ghc/ghc][wip/three-way-merge-sort] Add three way merge to Data.List.sort(By)
Jade (@Jade)
gitlab at gitlab.haskell.org
Mon Jan 8 14:49:43 UTC 2024
Jade pushed to branch wip/three-way-merge-sort at Glasgow Haskell Compiler / GHC
Commits:
832cc041 by Jade at 2024-01-08T15:42:05+01:00
Add three way merge to Data.List.sort(By)
This patch aims to implement an improved version of Data.List.sortBy and therefore Data.List.sort which increases performance by up to 20% on larger lists, but also ~10% on lists of size 100 and smaller.
This approach implements a three-way merge which also aims to reuse comparisons.
Fixes #24280
- - - - -
1 changed file:
- libraries/base/src/Data/OldList.hs
Changes:
=====================================
libraries/base/src/Data/OldList.hs
=====================================
@@ -1636,37 +1636,69 @@ and possibly to bear similarities to a 1982 paper by Richard O'Keefe:
Benchmarks show it to be often 2x the speed of the previous implementation.
Fixes ticket https://gitlab.haskell.org/ghc/ghc/issues/2143
+
+Further improved using a three-way merge, with an additional performance increase of ~20%
+https://gitlab.haskell.org/ghc/ghc/issues/24280
-}
sort = sortBy compare
sortBy cmp = mergeAll . sequences
where
+ x `gt` y = x `cmp` y == GT
+
sequences (a:b:xs)
- | a `cmp` b == GT = descending b [a] xs
- | otherwise = ascending b (a:) xs
+ | a `gt` b = descending b [a] xs
+ | otherwise = ascending b (a:) xs
sequences xs = [xs]
descending a as (b:bs)
- | a `cmp` b == GT = descending b (a:as) bs
- descending a as bs = (a:as): sequences bs
+ | a `gt` b = descending b (a:as) bs
+ descending a as bs = (a:as): sequences bs
ascending a as (b:bs)
- | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs
- ascending a as bs = let !x = as [a]
- in x : sequences bs
+ | not (a `gt` b) = ascending b (\ys -> as (a:ys)) bs
+ ascending a as bs = let !x = as [a]
+ in x : sequences bs
mergeAll [x] = x
mergeAll xs = mergeAll (mergePairs xs)
- mergePairs (a:b:xs) = let !x = merge a b
- in x : mergePairs xs
- mergePairs xs = xs
+ mergePairs [a, b, c] = [merge' a b c]
+ mergePairs (a:b:c:xs) = let !x = merge' a b c
+ in x : mergePairs xs
+ mergePairs [a,b] = [merge a b]
+ mergePairs xs = xs
+
merge as@(a:as') bs@(b:bs')
- | a `cmp` b == GT = b:merge as bs'
- | otherwise = a:merge as' bs
- merge [] bs = bs
- merge as [] = as
+ | a `gt` b = b : merge as bs'
+ | otherwise = a : merge as' bs
+ merge [] bs = bs
+ merge as [] = as
+
+ merge' as@(a:as') bs@(b:bs') cs@(c:cs')
+ | a_gt_b, b_gt_c = c : merge'gt as (b:|bs') cs' -- a > b > c
+ | a_gt_b = b : merge' as bs' cs -- a > b <= c
+ | a_gt_c = c : merge'le (a:|as') bs cs' -- c < a <= b
+ | otherwise = a : merge' as' bs cs -- c >= a <= b
+ where a_gt_b = a `gt` b
+ a_gt_c = a `gt` c
+ b_gt_c = b `gt` c
+ merge' [] bs cs = merge bs cs
+ merge' as [] cs = merge as cs
+ merge' as bs [] = merge as bs
+
+ merge'gt as bs@(b:|bs') cs@(c:cs')
+ | b_gt_c = c : merge'gt as bs cs' -- a > b > c
+ | otherwise = b : merge' as bs' cs -- a > b <= c
+ where b_gt_c = b `gt` c
+ merge'gt as (b:|bs) [] = b : merge as bs
+
+ merge'le as@(a:|as') bs cs@(c:cs')
+ | a_gt_c = c : merge'le as bs cs' -- c < a <= b
+ | otherwise = a : merge' as' bs cs -- c >= a <= b
+ where a_gt_c = a `gt` c
+ merge'le (a:|as) bs [] = a : merge as bs
{-
sortBy cmp l = mergesort cmp l
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/832cc041b599ea80d7fa4c1e48e170c0a0e5536d
--
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/832cc041b599ea80d7fa4c1e48e170c0a0e5536d
You're receiving this email because of your account on gitlab.haskell.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240108/205ef917/attachment-0001.html>
More information about the ghc-commits
mailing list