[Git][ghc/ghc][wip/az/ghc-cpp] 42 commits: CorePrep: Name `sat` binders more descriptively

Alan Zimmerman (@alanz) gitlab at gitlab.haskell.org
Thu Feb 6 23:25:47 UTC 2025



Alan Zimmerman pushed to branch wip/az/ghc-cpp at Glasgow Haskell Compiler / GHC


Commits:
7cc08550 by Ben Gamari at 2025-02-04T18:34:49-05:00
CorePrep: Name `sat` binders more descriptively

- - - - -
fb40981d by Ben Gamari at 2025-02-04T18:35:26-05:00
ghc-toolchain: Parse i686 triples

This is a moniker used for later 32-bit x86 implementations
(Pentium Pro and later).

Fixes #25691.

- - - - -
02794411 by Cheng Shao at 2025-02-04T18:36:03-05:00
compiler: remove unused assembleOneBCO function

This patch removes the unused assembleOneBCO function from the
bytecode assembler.

- - - - -
db19c8a9 by Matthew Pickering at 2025-02-05T23:16:50-05:00
perf: Replace uses of genericLength with strictGenericLength

genericLength is a recursive function and marked NOINLINE. It is not
going to specialise. In profiles, it can be seen that 3% of total compilation
time when computing bytecode is spent calling this non-specialised
function.

In addition, we can simplify `addListToSS` to avoid traversing the input
list twice and also allocating an intermediate list (after the call to
reverse).

Overall these changes reduce the time spent in 'assembleBCOs' from 5.61s
to 3.88s. Allocations drop from 8GB to 5.3GB.

Fixes #25706
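
For illustration, a strict length of the kind this commit describes could look roughly like the sketch below. The name `strictLength` is hypothetical, and the in-tree `strictGenericLength` may be defined differently; the point is only that a strict left fold keeps the counter evaluated and, being non-recursive at the top level, can be specialised at the call site.

```haskell
import Data.List (foldl', genericLength)

-- Hypothetical stand-in for the helper named in the commit message;
-- the definition used in GHC itself may differ.
strictLength :: Num a => [b] -> a
strictLength = foldl' (\n _ -> n + 1) 0
{-# INLINABLE strictLength #-}

-- The lazy library version, kept only for comparison.
lazyLength :: Num a => [b] -> a
lazyLength = genericLength
```

Both functions return the same count; they differ only in how the accumulator is evaluated and in whether the compiler can specialise them.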

- - - - -
5622a14a by Matthew Pickering at 2025-02-05T23:17:27-05:00
perf: nameToCLabel: Directly manipulate ByteString rather than going via strings

`nameToCLabel` is called from `lookupHsSymbol` many times during
bytecode linking. We can save a lot of allocations and time by directly
manipulating the bytestrings rather than going via intermediate lists.

Before: 2GB allocation, 1.11s
After: 260MB allocation, 375ms

Fixes #25719

-------------------------
Metric Decrease:
    MultiLayerModulesTH_OneShot
-------------------------
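
As a rough sketch of the idea, the two functions below build the same label, once via intermediate `String`s and once by concatenating `ByteString`s directly. The names `viaString` and `direct`, and the `_closure` suffix, are assumptions for illustration; the real `nameToCLabel` and its encoding rules are not reproduced here.

```haskell
import qualified Data.ByteString.Char8 as BS

-- Hypothetical "before": unpack to String, append lists, pack again.
viaString :: BS.ByteString -> BS.ByteString -> BS.ByteString
viaString modName occ =
  BS.pack (BS.unpack modName ++ "_" ++ BS.unpack occ ++ "_closure")

-- Hypothetical "after": concatenate the ByteStrings without
-- materialising any intermediate list of characters.
direct :: BS.ByteString -> BS.ByteString -> BS.ByteString
direct modName occ =
  BS.concat [modName, BS.pack "_", occ, BS.pack "_closure"]
```

The second form allocates one result buffer instead of a list cell per character, which is the kind of saving the figures above describe.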

- - - - -
df49e5c4 by Alan Zimmerman at 2025-02-06T18:12:11+00:00
GHC-CPP: first rough proof of concept

Processes

     #define FOO
     #ifdef FOO
     x = 1
     #endif

Into

    [ITcppIgnored [L loc ITcppDefine]
    ,ITcppIgnored [L loc ITcppIfdef]
    ,ITvarid "x"
    ,ITequal
    ,ITinteger (IL {il_text = SourceText "1", il_neg = False, il_value = 1})
    ,ITcppIgnored [L loc ITcppEndif]
    ,ITeof]

In time, ITcppIgnored will be pushed into a comment
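
The behaviour in this example can be approximated by a small line-based sketch. It is not the lexer-level implementation on this branch: it works on whole lines rather than tokens, drops directive lines instead of emitting `ITcppIgnored`, and every name in it (`St`, `directive`, `step`, `preprocess`) is hypothetical.

```haskell
import Data.Char (isSpace)

-- Defined macro names, plus a stack of "is this scope active?" flags.
data St = St { defined :: [String], scopes :: [Bool] }

-- A line is kept only when every enclosing #ifdef is satisfied.
active :: St -> Bool
active = and . scopes

-- Recognise '#', optional spaces, a keyword and its first argument.
directive :: String -> Maybe (String, String)
directive line = case dropWhile isSpace line of
  '#' : rest ->
    let (kw, arg) = break isSpace (dropWhile isSpace rest)
    in Just (kw, takeWhile (not . isSpace) (dropWhile isSpace arg))
  _ -> Nothing

-- Process one line, accumulating kept lines in reverse order.
step :: (St, [String]) -> String -> (St, [String])
step (st, out) line = case directive line of
  Just ("define", name) | active st -> (st { defined = name : defined st }, out)
  Just ("ifdef", name) -> (st { scopes = (name `elem` defined st) : scopes st }, out)
  Just ("endif", _) -> (st { scopes = drop 1 (scopes st) }, out)
  Just _ -> (st, out) -- other directives are ignored in this sketch
  Nothing
    | active st -> (st, line : out)
    | otherwise -> (st, out)

preprocess :: String -> [String]
preprocess = reverse . snd . foldl step (St [] [], []) . lines
```

Applied to the four input lines above, `preprocess` keeps only `x = 1`.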

- - - - -
b2a302c2 by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Tidy up before re-visiting the continuation mechanic

- - - - -
c64b4f94 by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Switch preprocessor to continuation passing style

Proof of concept, needs tidying up

- - - - -
a5c04c9b by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Small cleanup

- - - - -
cec8c02b by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Get rid of some cruft

- - - - -
a2859c50 by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Starting to integrate.

Need to get the pragma recognised and set

- - - - -
1dc916dc by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Make cppTokens extend to end of line, and process CPP comments

- - - - -
3daf1203 by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Remove unused ITcppDefined

- - - - -
1216a224 by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Allow spaces between # and keyword for preprocessor directive

- - - - -
fe6ef138 by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Process CPP continuation lines

They are emitted as separate ITcppContinue tokens.
Perhaps the processing should be more like a comment, and keep on
going to the end.
BUT, the last line needs to be slurped as a whole.

- - - - -
35a2f2ca by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Accumulate CPP continuations, process when ready

Can be simplified further, we only need one CPP token
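
A minimal sketch of the accumulation step, assuming the input is already split into lines: pieces ending in a backslash are collected until a line without one arrives, and the joined text is then emitted. The function name `joinContinuations` is hypothetical, and unlike the branch it treats every line this way, not only directive lines.

```haskell
-- Join each run of backslash-terminated lines with the line that follows.
joinContinuations :: [String] -> [String]
joinContinuations = go []
  where
    -- 'acc' holds the pieces collected so far, most recent first.
    go acc [] = [concat (reverse acc) | not (null acc)]
    go acc (l : ls)
      | not (null l), last l == '\\' = go (init l : acc) ls
      | otherwise = concat (reverse (l : acc)) : go [] ls
```

The first equation of `go` flushes whatever has been accumulated when the input ends, so a continuation on the final line is not lost.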

- - - - -
3840209f by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Simplify Lexer interface. Only ITcpp

We transfer directive lines through it, then parse them from scratch
in the preprocessor.

- - - - -
f01dc66b by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Deal with directive on last line, with no trailing \n

- - - - -
b818afbf by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Start parsing and processing the directives

- - - - -
f39b8d55 by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Prepare for processing include files

- - - - -
dcaf6ec2 by Alan Zimmerman at 2025-02-06T18:12:11+00:00
Move PpState into PreProcess

And initParserState, initPragState too

- - - - -
c12e6fac by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Process nested include files

Also move PpState out of Lexer.x, so it is easy to evolve it in a ghci
session, loading utils/check-cpp/Main.hs

- - - - -
9088bf5e by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Split into separate files

- - - - -
bcdc3b91 by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Starting on expression parser.

But it hangs. Time for Text.Parsec.Expr

- - - - -
d19fed4a by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Start integrating the ghc-cpp work

From https://github.com/alanz/ghc-cpp

- - - - -
fec96f5d by Alan Zimmerman at 2025-02-06T18:12:12+00:00
WIP

- - - - -
c7860ea4 by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Fixup after rebase

- - - - -
85435784 by Alan Zimmerman at 2025-02-06T18:12:12+00:00
WIP

- - - - -
0002e194 by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Fixup after rebase; all tests now pass

- - - - -
43f024fa by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Change pragma usage to GHC_CPP from GhcCPP

- - - - -
742b1d33 by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Some comments

- - - - -
b7c75bd1 by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Reformat

- - - - -
29c5b9b1 by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Delete unused file

- - - - -
d875742d by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Rename module Parse to ParsePP

- - - - -
cb028d2e by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Clarify naming in the parser

- - - - -
adefda60 by Alan Zimmerman at 2025-02-06T18:12:12+00:00
WIP. Switching to alex/happy to be able to work in-tree

Since Parsec is not available

- - - - -
e3156dba by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Layering is now correct

- GHC lexer, emits CPP tokens
- accumulated in Preprocessor state
- Lexed by CPP lexer, CPP command extracted, tokens concatenated with
  spaces (to get rid of token pasting via comments)
- if directive lexed and parsed by CPP lexer/parser, and evaluated
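
The last layer, evaluating a parsed `#if` condition, might be sketched as follows. The `Expr` type, the `Macros` table and `eval` are all hypothetical and much smaller than a real CPP expression grammar; the one rule taken from C preprocessor semantics is that an undefined identifier evaluates to 0.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical AST for the condition of an #if directive.
data Expr
  = Lit Int
  | Var String
  | Not Expr
  | And Expr Expr
  | Or Expr Expr
  | Ge Expr Expr

-- Hypothetical macro table: macro name to integer value.
type Macros = [(String, Int)]

-- Evaluate to 0 (false) or a non-zero value (true), as in C.
eval :: Macros -> Expr -> Int
eval _ (Lit n) = n
eval ms (Var v) = fromMaybe 0 (lookup v ms) -- undefined names count as 0
eval ms (Not e) = fromBool (eval ms e == 0)
eval ms (And a b) = fromBool (eval ms a /= 0 && eval ms b /= 0)
eval ms (Or a b) = fromBool (eval ms a /= 0 || eval ms b /= 0)
eval ms (Ge a b) = fromBool (eval ms a >= eval ms b)

fromBool :: Bool -> Int
fromBool b = if b then 1 else 0
```

A directive's scope would then be treated as active exactly when `eval` returns a non-zero value.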

- - - - -
a4ccfe9e by Alan Zimmerman at 2025-02-06T18:12:12+00:00
First example working

Loading Example1.hs into ghci, getting the right results

```
{-# LANGUAGE GHC_CPP #-}
module Example1 where

y = 3

x =
  "hello"
  "bye now"

foo = putStrLn x
```

- - - - -
c1739002 by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Rebase, and all tests pass except whitespace for generated parser

- - - - -
21f4e397 by Alan Zimmerman at 2025-02-06T18:12:12+00:00
More plumbing. Ready for testing tomorrow.

- - - - -
febafc74 by Alan Zimmerman at 2025-02-06T18:12:12+00:00
Progress. Renamed module State from Types

And at first blush it seems to handle preprocessor scopes properly.

- - - - -
f2c26471 by Alan Zimmerman at 2025-02-06T23:23:13+00:00
Insert basic GHC version macros into parser

__GLASGOW_HASKELL__
__GLASGOW_HASKELL_FULL_VERSION__
__GLASGOW_HASKELL_PATCHLEVEL1__
__GLASGOW_HASKELL_PATCHLEVEL2__
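
For context, this is how such a macro is typically consulted in source code. The sketch uses the existing `CPP` extension rather than the `GHC_CPP` pragma from this branch, whose behaviour is still in progress; `compilerGeneration` is a hypothetical name. `__GLASGOW_HASKELL__` encodes the major version as `xyy`, so GHC 9.8 is `908`.

```haskell
{-# LANGUAGE CPP #-}

-- Hypothetical example: choose a definition based on the compiler version.
compilerGeneration :: String
#if __GLASGOW_HASKELL__ >= 908
compilerGeneration = "9.8 or later"
#else
compilerGeneration = "older than 9.8"
#endif
```

Only one of the two definitions reaches the parser, which is the behaviour an in-compiler preprocessor would need to reproduce.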

- - - - -


30 changed files:

- compiler/GHC.hs
- compiler/GHC/ByteCode/Asm.hs
- compiler/GHC/ByteCode/Linker.hs
- compiler/GHC/Cmm/Lexer.x
- compiler/GHC/Cmm/Parser.y
- compiler/GHC/Cmm/Parser/Monad.hs
- compiler/GHC/CoreToStg/Prep.hs
- compiler/GHC/Driver/Backpack.hs
- compiler/GHC/Driver/Config/Parser.hs
- compiler/GHC/Driver/Flags.hs
- compiler/GHC/Driver/Main.hs
- compiler/GHC/Driver/Make.hs
- compiler/GHC/Driver/Pipeline/Execute.hs
- compiler/GHC/Driver/Session.hs
- compiler/GHC/Parser.hs-boot
- compiler/GHC/Parser.y
- compiler/GHC/Parser/Annotation.hs
- compiler/GHC/Parser/HaddockLex.x
- compiler/GHC/Parser/Header.hs
- compiler/GHC/Parser/Lexer.x
- compiler/GHC/Parser/PostProcess.hs
- compiler/GHC/Parser/PostProcess/Haddock.hs
- + compiler/GHC/Parser/PreProcess.hs
- + compiler/GHC/Parser/PreProcess/Eval.hs
- + compiler/GHC/Parser/PreProcess/Lexer.x
- + compiler/GHC/Parser/PreProcess/Macro.hs
- + compiler/GHC/Parser/PreProcess/ParsePP.hs
- + compiler/GHC/Parser/PreProcess/Parser.y
- + compiler/GHC/Parser/PreProcess/ParserM.hs
- + compiler/GHC/Parser/PreProcess/State.hs


The diff was not included because it is too large.


View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/0137afecab0854cb1cf89bef22940f28191845fc...f2c26471cb9ae42192d7ee70c66a4251070817c3
