<html><head></head><body>YMMV, but based on some benchmarking I've done, the default implementation using `compareBy` seems to optimize better than trying to force dictionary unpacking. The only way I've been able to generically sort faster than base's `compareBy` is to use nonempty list structures for runs.—<br>Sent from my phone with K-9 Mail.<br><br><div class="gmail_quote">On 26 March 2023 19:07:03 UTC, Viktor Dukhovni <ietf-dane@dukhovni.org> wrote:<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<pre dir="auto" class="k9mail">On Sun, Mar 26, 2023 at 10:59:13AM -0700, Todd Wilson wrote:<br><br><blockquote class="gmail_quote" style="margin: 0pt 0pt 1ex 0.8ex; border-left: 1px solid #729fcf; padding-left: 1ex;">Is there an overhead penalty to be paid for using functions here, or<br>is it compiled away into Prolog-like tail calls that fill in the<br>missing parts of the data structure?<br></blockquote><br>Efficient implementations of functional langauges make extensive use of<br>tail call optimisation.<br><br>I don't know whether looking at the GHC-generated "Core" for your<br>implementation will answer any of your questions, but here it is:<br><br> Haskell:<br> runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs<br> where<br> run x [] = ([], [])<br> run x (y:ys) = if x <= y<br> then let (us, vs) = run y ys<br> in (y:us, vs)<br> else ([], y:ys)<br><br> Core (with some added comments):<br> -- Unpack the "Ord" dictionary to extract just the required "<="<br> -- function and call the "$wruns" worker ("ww3" is "<=", and "ds"<br> -- is the list to be transformed:<br> --<br> runs :: forall a. Ord a => [a] -> [[a]]<br> runs<br> = \ (@a) ($dOrd :: Ord a) (ds :: [a]) -><br> case $dOrd of { C:Ord ww ww1 ww2 ww3 ww4 ww5 ww6 ww7 -><br> $wruns ww3 ds<br> }<br><br> Rec {<br> $wruns :: forall {a}. (a -> a -> Bool) -> [a] -> [[a]]<br> $wruns<br> = \ (@a) (ww :: a -> a -> Bool) (ds :: [a]) -><br> case ds of {<br> [] -> []; -- runs [] = []<br> : x xs -> -- runs (x:xs) = let (ys, zs) = run x xs in (x:ys) : runs zs<br> let {<br> ds1 :: ([a], [a]) -- A lazily evaluated thunk for (run x xs)<br> ds1<br> = letrec {<br> -- Internal recursion in "run" returns strict unboxed pairs<br> -- (on the stack) avoiding heap or thunk allocation for the tuple.<br> $wrun :: a -> [a] -> (# [a], [a] #)<br> $wrun<br> = \ (x1 :: a) (ds2 :: [a]) -><br> case ds2 of wild1 { -- (y:ys) is "wild1"<br> [] -> (# [], [] #); -- run x [] = ([], [])<br> : y ys -><br> case ww x1 y of { -- x <= y ?<br> False -> (# [], wild1 #); -- else ([], y:ys)<br> True -> -- then let (us, vs) = run y ys in (y:us, vs)<br> let {<br> ds3 :: ([a], [a]) -- A "thunk" for (run y ys) evaluated lazily<br> ds3 = case $wrun y ys of { (# ww1, ww2 #) -> (ww1, ww2) } } in<br> (# : y (case ds3 of { (us, vs) -> us }),<br> case ds3 of { (us, vs) -> vs } #)<br> }<br> }; } in<br> case $wrun x xs of { (# ww1, ww2 #) -> (ww1, ww2) } } in<br> : (: x (case ds1 of { (ys, zs) -> ys }))<br> (case ds1 of { (ys, zs) -> $wruns ww zs })<br> }<br> end Rec }<br><br>So for a non-empty cons-cell (: x ...) the result is a new cons cell:<br><br> : (x : (fst ds1)) (runs (snd ds1))<br> --------- --------------<br><br>in which both underline parts are computed lazily (on demand) from the<br>thunk "ds1":<br><br> λ> head $ head $ runs $ 42 : undefined<br> 42<br><br>When we do want the successor of the first element, we look no futher<br>than necessary:<br><br> λ> head $ runs $ 42 : 0 : undefined<br> [42]<br><br> λ> take 2 $ head $ runs $ 42 : 43 : undefined<br> [42,43]<br><br>Does this help? FWIW, the (simplified) Core output was generated via:<br><br> hscore() {<br> ghc -ddump-simpl -dsuppress-idinfo -dsuppress-coercions -dsuppress-type-applications -dsuppress-uniques -dsuppress-module-prefixes "$@"<br> }<br> hscore -O2 Runs.hs > Runs.core<br><br></pre></blockquote></div></body></html>