[Haskell-cafe] To [] Or Not To []

Tikhon Jelvis tikhon at jelv.is
Mon Mar 13 18:00:46 UTC 2017


Personally, I disagree with the whole premise. Lists are simple and
elegant; you *should* use them most of them time. I'm not saying you should
use lists as the data structure for an in-memory database or whatever, but
that's the point—most applications only have a handful of data structures
that "matter", and lists are great everywhere else.

I actually went through and replaced [] with Vector in all of the types we
parse from JSON at work, some of which get relatively large. It made the
code uglier and didn't meaningfully affect the performance. I undid that
change, even though it's exactly the sort of thing this article recommends.
In this day and age, simple things *scale*. That's enough most of the time;
if you can get away with it you *should*.

The real advantage of lists comes at an intersection of two points: lists
are effective in place of iterators in Haskell and, even misused as data
structures, they're *not that bad* most of the time. This means that a good
80% of the time, the advantage of using a type that's compatible with the
rest of my code and APIs that use lists "correctly" as iterators easily
outweighs any small performance penalty. A list has to get pretty large—or
my usage pattern pretty convoluted—before another type is worth the
complexity.

On Mon, Mar 13, 2017 at 3:43 AM, Johannes Waldmann <
johannes.waldmann at htwk-leipzig.de> wrote:

> Hi Olaf -
>
>
> > I'd be interested whether there is a way to check
> > which of my lists in the source code the compiler managed to "deforest"
> away.
> > Which intermediate files should I look at? What are the tools to inspect?
>
> Good question, and I don't know an easy answer.
>
> The general advice is running ghc with "-ddump-simpl"
> but I find it quite challenging to scan the output.
>
>
> Here is a simple case where it works:
>
> $ cat Fuse.hs
>
> main = print $ sum $ map (^2) [1 .. 1000 :: Int]
>
>
> $ ghc -fforce-recomp -ddump-simpl -O0 Fuse.hs
>
> no optimisation - shows that "main" calls "map" etc.
>
>
> $ ghc -fforce-recomp -ddump-simpl -O2 Fuse.hs
>
> fusion works - nice non-allocating inner loop
> (lists gone, and Int replaced by Int#)
>
> Rec {
> -- RHS size: {terms: 18, types: 3, coercions: 0}
> Main.$wgo [InlPrag=[0], Occ=LoopBreaker]
>   :: GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int#
> [GblId, Arity=2, Caf=NoCafRefs, Str=DmdType <S,1*U><S,U>]
> Main.$wgo =
>   \ (w_s5kV :: GHC.Prim.Int#) (ww_s5kZ :: GHC.Prim.Int#) ->
>     case w_s5kV of wild_Xn {
>       __DEFAULT ->
>         Main.$wgo
>           (GHC.Prim.+# wild_Xn 1#)
>           (GHC.Prim.+# ww_s5kZ (GHC.Prim.*# wild_Xn wild_Xn));
>       1000# -> GHC.Prim.+# ww_s5kZ 1000000#
>     }
> end Rec }
>
> but the challenge is to find the path from "main" to that,
> wading through several other functions that may or may not be related.
>
>
> I can imagine a source annotation like "in the code compiled
> from this function  f,  that constructor  C  should never be called"
> but this is certainly not easy. Do we really mean "never", or do we mean
> "only a bounded number of times" (that is, not in the inner loop).
> Perhaps there is no code for  f  itself, because it gets inlined.
>
> But yes, *some* automated analysis (and human-readable print-out)
> of the code after simplification would be nice.
>
> This could be done as a compiler plug-in?
> https://downloads.haskell.org/~ghc/latest/docs/html/users_
> guide/extending_ghc.html#compiler-plugins
>
>
> - J.
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/haskell-cafe/attachments/20170313/d0526308/attachment.html>


More information about the Haskell-Cafe mailing list