[ghc-steering-committee] Preliminary GHC extensions stats

Eric Seidel eric at seidel.io
Mon Nov 16 14:33:56 UTC 2020


Thanks for the analysis!

We have an extremely long tail of extensions with 0% Proliferation. This is somewhat tangential to the GHC20XX process, but I'm curious how many of these extensions actually have no uses as opposed to being rounded to zero. There might be some opportunities to remove unused extensions that just add to GHC's maintenance burden.
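
To make the "rounded to zero" question concrete, here is a minimal sketch of the arithmetic. It assumes the table rounds to the nearest whole percent; the quoted message does not say how the percentages are rounded, and `shown_percent` is a hypothetical helper, not part of the analysis tool.

```python
PARSED = 12025  # packages parsed, as reported in the quoted message


def shown_percent(count: int, total: int) -> int:
    """Percentage as displayed, ASSUMING round-to-nearest (not confirmed)."""
    return round(100 * count / total)


# Largest number of packages that would still be displayed as 0% Proliferation.
max_hidden = max(n for n in range(PARSED + 1) if shown_percent(n, PARSED) == 0)

print(max_hidden)
```

Under that assumption an extension could be used by a few dozen packages and still show as 0%, so the raw counts are needed to tell unused extensions from rarely used ones. If the tool truncates instead of rounding, the threshold is higher still.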

On Sun, Nov 15, 2020, at 12:19, Joachim Breitner wrote:
> Dear Committee,
> 
> in hopeful anticipation that we can start the GHC2021 process soon,
> and given that the Haskell Survey closes today (so I guess we will get
> that data soon as well), I ran a simple analysis against hackage
> today. I’ll share the data in a proper way when it's time; this is just
> a preview of what I have in mind.
> 
> ## Methodology
> 
> The methodology is:
>  
>  * For each package on hackage, get the latest version
>  * For each such package, try to read out the extensions used
>    in the cabal file and in the modules
>    (This uses https://github.com/kowainik/extensions)
>  * Aggregate these numbers.
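
The extraction step above can be sketched roughly as follows. This is a simplified stand-in, not the kowainik/extensions library that the analysis actually uses: it finds LANGUAGE pragmas with a regular expression and reads a `default-extensions` field with a naive indentation rule. The function names are hypothetical, and real cabal files and modules have cases (conditionals, CPP, common stanzas) that this ignores.

```python
import re

# Matches pragmas such as {-# LANGUAGE LambdaCase, GADTs #-}.
LANGUAGE_PRAGMA = re.compile(
    r"\{-#\s*LANGUAGE\s+(.*?)#-\}", re.IGNORECASE | re.DOTALL
)


def module_extensions(source: str) -> set:
    """Collect extension names from LANGUAGE pragmas in one module's source."""
    exts = set()
    for body in LANGUAGE_PRAGMA.findall(source):
        exts.update(name.strip() for name in body.split(",") if name.strip())
    return exts


def cabal_default_extensions(cabal_text: str) -> set:
    """Collect names listed in default-extensions fields (naive parser)."""
    exts = set()
    in_field = False
    field_indent = 0
    for line in cabal_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("--"):
            continue  # blank lines and comments do not end the field
        indent = len(line) - len(line.lstrip())
        if stripped.lower().startswith("default-extensions:"):
            in_field = True
            field_indent = indent
            rest = stripped.split(":", 1)[1]
        elif in_field and indent > field_indent:
            rest = stripped  # continuation line of the field
        else:
            in_field = False
            continue
        # Entries may be separated by commas, whitespace, or both.
        exts.update(name for name in re.split(r"[,\s]+", rest) if name)
    return exts


example_module = (
    "{-# LANGUAGE LambdaCase, GADTs #-}\n"
    "{-# LANGUAGE CPP #-}\n"
    "module M where\n"
)
example_cabal = (
    "name: demo\n"
    "library\n"
    "  default-extensions: LambdaCase,\n"
    "                      OverloadedStrings\n"
    "  build-depends: base\n"
)

print(sorted(module_extensions(example_module)))
print(sorted(cabal_default_extensions(example_cabal)))
```

Keeping the two sources separate per package matters here, because the metrics defined later distinguish extensions enabled in the cabal file from those enabled only in modules.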
> 
> 
> Out of 12729 packages fetched (4GB), 12025 could be parsed by the
> extensions library (if anyone cares about the missing ones, please
> submit fixes to that library).
> 
> Only 1116 specify extensions in the cabal file using
> default-extensions; the rest use per-file extensions exclusively. This
> is a relatively low number, so barring bugs in the analysis code, this
> means that most developers prefer per-module extensions (I am one of
> those).
> 
> I extracted three metrics, all represented as percentages:
> 
>  * Proliferation:
>    #(packages using that extension)/#(packages parsed)
> 
>    (I’ll reserve the word “Popularity” for the data from the poll.)
> 
>  * Innocuousness
>    #(packages enabling it in .cabal)/#(packages using default-extensions)
> 
>    A high number here indicates that many developers want this on by
>    default in their project, across all modules.
> 
>  * Aloofness
>    #(packages using this, but _only_ in modules, despite using default-extensions)
>       /#(packages using default-extensions)
> 
>    A high number here indicates that, although a developer was in
>    principle happy with putting extensions into the default-extensions
>    field, they did not include this particular one. I take this as an
>    indication that this extension does _not_ make a great on-by-default
>    extension. (Matching my expectation, TemplateHaskell and CPP score
>    the highest here.)
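
As a minimal sketch of how the three metrics above could be computed from per-package data: the `Package` record and `metrics` function are hypothetical names, and the sketch assumes that "using default-extensions" means a non-empty list in the cabal file, which the message does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    cabal_exts: set = field(default_factory=set)   # from default-extensions
    module_exts: set = field(default_factory=set)  # from LANGUAGE pragmas


def metrics(packages, ext):
    """Return (proliferation, innocuousness, aloofness) for ext, in percent."""
    parsed = len(packages)
    # ASSUMPTION: a package "uses default-extensions" if the field is non-empty.
    with_defaults = [p for p in packages if p.cabal_exts]
    using = sum(1 for p in packages if ext in p.cabal_exts or ext in p.module_exts)
    in_cabal = sum(1 for p in with_defaults if ext in p.cabal_exts)
    only_in_modules = sum(
        1 for p in with_defaults
        if ext in p.module_exts and ext not in p.cabal_exts
    )

    def pct(numerator, denominator):
        return 100 * numerator / denominator if denominator else 0.0

    return (
        pct(using, parsed),
        pct(in_cabal, len(with_defaults)),
        pct(only_in_modules, len(with_defaults)),
    )


# Made-up data for illustration only, not taken from the hackage analysis.
packages = [
    Package({"OverloadedStrings"}, {"CPP"}),
    Package({"OverloadedStrings", "LambdaCase"}, set()),
    Package(set(), {"CPP", "OverloadedStrings"}),
    Package(set(), set()),
]

print(metrics(packages, "OverloadedStrings"))
print(metrics(packages, "CPP"))
```

Note that Proliferation is taken over all parsed packages, while Innocuousness and Aloofness share the smaller denominator of packages that use default-extensions, so the three columns are not directly comparable.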
> 
> Yes, I went overboard with finding fun names to describe the metrics.
> Happy to take suggestions for more suitable names, maybe from Richard,
> our master of sophisticated English vocabulary.
> 
> 
> ## Source
> 
> I am using this code:
> 
> https://github.com/nomeata/ghc-proposals-stats/tree/master/ext-stats
> 
> This is proof that I am very much capable of writing horrible Haskell
> code that looks like bad Python code.
> 
> If someone (including anyone outside the committee) feels like
> contributing better code, talk to me!
> Besides just prettier code, the tool could understand and be smart
> about implications between extensions, be clever about extensions that
> are part of the default-language already, and maybe provide nicer
> reporting.
> 
> ## Results
> 
> Eventually I’ll provide a CSV file with all relevant data (hackage
> stats and poll results), and maybe even upload it on some suitable web
> service that allows you to explore such a table easily (suggestions?).
>  
> For now, a preview of the complete output, sorted by Proliferation:
> 
>   Proliferation
>   |   Innocuousness
>   |   |   | Aloofness
>   |   |   |
>  0%  0%  0% NoMonoLocalBinds
>  0%  0%  0% NoRebindableSyntax
>  0%  0%  0% NoMagicHash
>  0%  1%  0% RelaxedPolyRec
>  0%  3%  0% DoAndIfThenElse
>  0%  4%  0% ForeignFunctionInterface
>  0% 12%  0% PatternGuards
>  0% 14%  0% EmptyDataDecls
>  0%  0%  0% ParallelArrays
>  0%  0%  0% NoForeignFunctionInterface
>  0%  0%  0% NoDatatypeContexts
>  0%  0%  0% TransformListComp
>  0%  0%  0% InterruptibleFFI
>  0%  1%  0% MonadFailDesugaring
>  0%  0%  0% NullaryTypeClasses
>  0%  0%  0% UnboxedSums
>  0%  0%  0% CApiFFI
>  0%  0%  0% StaticPointers
>  0%  0%  0% DerivingStrategies
>  0%  1%  0% NamedWildCards
>  0%  0%  0% GHCForeignImportPrim
>  0%  0%  0% TemplateHaskellQuotes
>  0%  0%  0% GADTSyntax
>  0%  0%  0% JavaScriptFFI
>  0%  0%  0% PostfixOperators
>  0%  0%  0% DeriveLift
>  0%  0%  0% NondecreasingIndentation
>  0%  0%  0% Strict
>  0%  1%  0% NumDecimals
>  0%  0%  0% AutoDeriveTypeable
>  0%  1%  1% StrictData
>  0%  0%  0% ConstrainedClassMethods
>  0%  1%  0% DisambiguateRecordFields
>  0%  2%  0% NegativeLiterals
>  0%  2%  0% MonadComprehensions
>  0%  0%  0% UnliftedFFITypes
>  0%  2%  0% OverloadedLabels
>  0%  2%  0% BinaryLiterals
>  0%  2%  0% ApplicativeDo
>  0%  1%  0% MonoLocalBinds
>  0%  0%  1% ExplicitNamespaces
>  0%  1%  1% TypeFamilyDependencies
>  0%  0%  1% UndecidableSuperClasses
>  0%  0%  1% RoleAnnotations
>  1%  0%  1% ExplicitForAll
>  1%  0%  1% ExtendedDefaultRules
>  1%  2%  0% RebindableSyntax
>  1%  3%  0% EmptyCase
>  1%  3%  0% PartialTypeSignatures
>  1%  3%  1% DuplicateRecordFields
>  1%  2%  1% TypeInType
>  1%  2%  1% ImpredicativeTypes
>  1%  2%  0% RecursiveDo
>  1%  0%  2% ImplicitParams
>  1%  0%  2% OverloadedLists
>  1%  0%  1% IncoherentInstances
>  1% 11%  0% LiberalTypeSynonyms
>  1%  2%  2% AllowAmbiguousTypes
>  1%  1%  3% DeriveAnyClass
>  1% 10%  0% ParallelListComp
>  2%  2%  1% PackageImports
>  2%  6%  1% InstanceSigs
>  2% 11%  0% Arrows
>  2%  6%  0% UnicodeSyntax
>  2%  4%  2% PatternSynonyms
>  2%  6%  3% TypeApplications
>  2%  0%  2% OverlappingInstances
>  3%  9%  1% UnboxedTuples
>  3% 15%  1% MultiWayIf
>  3%  6%  3% NamedFieldPuns
>  4% 16%  2% DeriveFoldable
>  4%  7%  3% PolyKinds
>  4% 16%  2% DeriveTraversable
>  4% 10%  3% MagicHash
>  4% 13%  3% NoMonomorphismRestriction
>  4% 18%  4% DefaultSignatures
>  6%  8%  5% ViewPatterns
>  6% 10%  4% KindSignatures
>  6% 15%  6% QuasiQuotes
>  6%  3%  6% ExistentialQuantification
>  7% 30%  2% NoImplicitPrelude
>  7% 25%  6% ConstraintKinds
>  8% 23%  6% DeriveFunctor
>  8% 22%  6% FunctionalDependencies
>  9% 24%  6% StandaloneDeriving
>  9% 24%  6% TupleSections
> 10% 25%  6% DataKinds
> 10%  6%  8% TypeSynonymInstances
> 11% 28%  4% LambdaCase
> 11% 24%  7% GADTs
> 12% 27%  5% TypeOperators
> 12%  6% 15% UndecidableInstances
> 12% 20%  7% BangPatterns
> 14% 27% 10% DeriveGeneric
> 15% 27%  8% RecordWildCards
> 17% 21% 16% TemplateHaskell
> 17% 29% 14% GeneralizedNewtypeDeriving
> 19% 29% 12% RankNTypes
> 20% 26%  8% DeriveDataTypeable
> 20% 31% 11% TypeFamilies
> 21% 34% 11% MultiParamTypeClasses
> 22% 11% 18% CPP
> 25% 37% 13% ScopedTypeVariables
> 26% 42% 14% FlexibleContexts
> 31% 43% 16% FlexibleInstances
> 34% 53% 10% OverloadedStrings
> 
> Enjoy!
> Joachim
> 
> -- 
> Joachim Breitner
>   mail at joachim-breitner.de
>   http://www.joachim-breitner.de/
> 
> 
> _______________________________________________
> ghc-steering-committee mailing list
> ghc-steering-committee at haskell.org
> https://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-steering-committee
>

