<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">I would like to propose adding `take`/`drop`/`splitAt` to both `Data.Map` and `Data.Set` as originally requested in:<div class=""><br class=""></div><div class=""><a href="https://github.com/haskell/containers/issues/135" class="">https://github.com/haskell/containers/issues/135</a></div><div class=""><br class=""></div><div class="">The motivation behind this proposal is three-fold:</div><div class=""><br class=""></div><div class="">* for convenience - these functions are commonly used to implement pagination or previews of maps/sets</div><div class="">* for type accuracy - the public API impose an unnecessary `Ord` constraint</div><div class="">* for efficiency - these can be implemented more efficiently using the internal API</div><div class=""><br class=""></div><div class="">Currently the only way you can implement this functionality via the public API is to use `lookupIndex`/`elemAt` + `split`. For example, one way to implement `Data.Set.take` is:<blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><br class=""></blockquote><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class="">take :: Ord a => Int -> Set a -> Set a</div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class="">take n m</div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> | n < 0 = empty</div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> | size m <= n = m</div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> | otherwise = lt</div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> where</div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> (lt, _) = split k m</div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> k = elemAt n m</div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><div style="margin: 0px; line-height: normal;" class="">{-# INLINE take #-}</div></div></blockquote><div class=""><div class=""><br class=""></div><div class="">This implementation incurs an unnecessary `Ord` constraint due to a roundabout way of computing `take`: this extracts the element at the given index and then works backwards from the element’s value to partition the set using O(log N) comparisons. We could eliminate all of the comparisons by using the internal API.</div><div class=""><br class=""></div><div class="">Intuitively, we expect that the performance of `Data.Set.take` would benefit from avoiding those unnecessary comparisons and also avoiding traversing the `Set`’s spine twice. So I tested that hypothesis by implementing `take` via the internal API like this:</div></div></div><div class=""><br class=""></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class="">take :: Int -> Set a -> Set a</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class="">take n0 s0 = go s0 n0</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> where</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> go s@(Bin sz x l r) n =</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> if sz <= n</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> then s</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> else</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> let sl = size l</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> in if n <= sl</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> then go l n</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> else link x l (go r $! n - sl)</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> go Tip _ = Tip</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class="">{-# INLINE take #-}</div></div></blockquote><div class=""><br class=""></div><div class="">I then added the following benchmark to `benchmarks/Set.hs`:</div><div class=""><br class=""></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><b class="">diff --git a/benchmarks/Set.hs b/benchmarks/Set.hs</b></div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><b class="">index 3a6e8aa..03c99fb 100644</b></div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><b class="">--- a/benchmarks/Set.hs</b></div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><b class="">+++ b/benchmarks/Set.hs</b></div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(52, 187, 199);" class="">@@ -31,6 +31,7 @@<span style="font-variant-ligatures: no-common-ligatures; color: #000000" class=""> main = do</span></div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> , bench "union" $ whnf (S.union s_even) s_odd</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> , bench "difference" $ whnf (S.difference s) s_even</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> , bench "intersection" $ whnf (S.intersection s) s_even</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(52, 189, 38);" class="">+ , bench "take" $ whnf (S.take (2^11)) s</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> , bench "fromList" $ whnf S.fromList elems</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> , bench "fromList-desc" $ whnf S.fromList (reverse elems)</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> , bench "fromAscList" $ whnf S.fromAscList elems</div></div></blockquote><div class=""><br class=""></div><div class="">Here is the performance on my machine when implementing `take` via the public API:</div><div class=""><br class=""></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class="">benchmarking take</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class="">time 272.8 ns (266.7 ns .. 278.1 ns)</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""> 0.997 R² (0.996 R² .. 0.998 R²)</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class="">mean 266.3 ns (261.8 ns .. 270.8 ns)</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class="">std dev 15.44 ns (13.26 ns .. 18.95 ns)</div></div><div class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class="">variance introduced by outliers: 75% (severely inflated)</div></div></blockquote><div class=""><br class=""></div><div class=""><div class="">… and the performance improved by 61% from using the internal API:</div></div><div class=""><br class=""></div><blockquote style="margin: 0 0 0 40px; border: none; padding: 0px;" class=""><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo;" class=""><div style="margin: 0px; line-height: normal;" class="">benchmarking take</div><div style="margin: 0px; line-height: normal;" class="">time 169.2 ns (166.1 ns .. 172.6 ns)</div><div style="margin: 0px; line-height: normal;" class=""> 0.997 R² (0.996 R² .. 0.998 R²)</div><div style="margin: 0px; line-height: normal;" class="">mean 172.1 ns (169.4 ns .. 175.4 ns)</div><div style="margin: 0px; line-height: normal;" class="">std dev 10.68 ns (8.420 ns .. 15.34 ns)</div><div style="margin: 0px; line-height: normal;" class="">variance introduced by outliers: 78% (severely inflated)</div></div></blockquote><div class=""><br class=""></div><div class="">… and I’m guessing (but haven’t tested) that the performance gap would only increase the more expensive the comparison function gets.</div><div class=""><br class=""></div><div class="">I haven’t performed comparative performance testing for `drop`/`splitAt` nor have I tested `Map` (because the benchmarks take a while for me to build and run) but I can perform those additional comparisons upon requests if people feel they are necessary.</div><div class=""><br class=""></div><div class="">I haven’t yet written up a full patch since the maintainer asked me to first run this proposal by the libraries mailing list to assess whether it would be wise to expand the `containers` API to include these utilities.</div><div class=""><br class=""></div><div class="">The deadline for discussion is two weeks.</div></body></html>