From klebinger.andreas at gmx.at Wed Jul 2 20:42:18 2025
From: klebinger.andreas at gmx.at (Andreas Klebinger)
Date: Wed, 2 Jul 2025 22:42:18 +0200
Subject: CI hiccups
Message-ID:

Hello devs,

There are currently issues with the perf-notes repository token which
cause CI pipelines to fail.
If you see failures related to git being unable to fetch something, this
is the reason.

We hope to fix this soon.

From matthewtpickering at gmail.com Thu Jul 3 14:16:35 2025
From: matthewtpickering at gmail.com (Matthew Pickering)
Date: Thu, 3 Jul 2025 15:16:35 +0100
Subject: CI hiccups
In-Reply-To:
References:
Message-ID:

I fixed the issue now. Andreas will update the documentation so that
it works next time.

Cheers,

Matt

On Wed, Jul 2, 2025 at 9:42 PM Andreas Klebinger via ghc-devs wrote:
>
> Hello devs,
>
> There are currently issues with the perf-notes repository token which
> cause CI pipelines to fail.
> If you see failures related to git being unable to fetch something, this
> is the reason.
>
> We hope to fix this soon.
>
> _______________________________________________
> ghc-devs mailing list
> ghc-devs at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

From harendra.kumar at gmail.com Mon Jul 7 00:12:30 2025
From: harendra.kumar at gmail.com (Harendra Kumar)
Date: Mon, 7 Jul 2025 05:42:30 +0530
Subject: openFile gives "file is locked" error on Linux when creating a non-existing file
In-Reply-To:
References:
Message-ID:

It is likely this GHC issue:
https://gitlab.haskell.org/ghc/ghc/-/issues/18832 .

On Sat, 28 Jun 2025 at 17:56, Harendra Kumar wrote:
>
> We instrumented the GHC locking mechanism as suggested by Viktor and
> deployed the instrumented GHC in CI. We got our first crash after 4 months!
>
> Summary from the preliminary investigation of the crash:
>
> * The problem is not filesystem dependent or inode reuse related.
> * After understanding it better, now I am able to reproduce it on the
> local machine.
> * The problem may be related to hClose getting interrupted by an async
> exception.
>
> The GHC panic message from the instrumented code is as follows:
>
> 2025-06-27T09:11:41.8177372Z lockFile: first lock: pid 21804, tid
> 21806, id 17 dev 2065 ino 262152 write 1
> 2025-06-27T09:11:41.8178053Z FileSystem.Event: internal error:
> close: lock exists: pid 21804, tid 21806, fd 17
> 2025-06-27T09:11:41.8178492Z
> 2025-06-27T09:11:41.8178604Z (GHC version 9.2.8 for x86_64_unknown_linux)
> 2025-06-27T09:11:41.8179160Z Please report this as a GHC bug:
> https://www.haskell.org/ghc/reportabug
>
> The message is coming from the "close" system call wrapper that I
> created. It means we are closing the fd without releasing the lock first.
>
> Based on the test logs I figured that the "close" call in the panic message is
> happening from this code:
>
> createFile :: FilePath -> FilePath -> IO ()
> createFile file parent = openFile (parent </> file) WriteMode >>= hClose
>
> Now the interesting thing is that in this particular test, we are
> running two threads. In one thread, we are watching for file system
> events related to this file, and in the other thread we are running
> the createFile code snippet given above. As soon as the file gets
> created we receive a file CREATED event in the first thread and we use
> throwTo to send a ThreadAbort async exception to the createFile thread to
> kill it.
>
> My theory is that if the exception is delivered at a certain point
> during hClose it returns without releasing the lock but closes the
> file. To test that theory I wrote the createFile thread code as shown
> below, so that I keep doing open and close in a loop, to make it more
> likely to interrupt hClose by the exception:
>
> createFile :: FilePath -> FilePath -> IO ()
> createFile file parent = go
>   where
>     go =
>       do
>         h <- openFile (parent </> file) WriteMode
>         hClose h
>         go
>
> Voila!
> With this code I am able to reproduce it locally now, though it
> takes many runs (in a loop) and more than a few minutes to reproduce.
>
> Any ideas, whether the problem is in hClose or it may be something else?
> Next, I will try a code review of hClose and instrumenting further to narrow
> down the problem.
>
> Thanks,
> Harendra
>
> On Sat, 16 Nov 2024 at 13:02, Viktor Dukhovni wrote:
> >
> > On Fri, Nov 15, 2024 at 06:45:40PM +0530, Harendra Kumar wrote:
> >
> > > Coming back to this issue after a break. I reviewed the code carefully
> > > and I cannot find anything where we are doing something in the
> > > application code that affects the RTS locking mechanism. Let me walk
> > > through the steps of the test up to failure and what we are doing in
> > > the code. The test output is like this:
> >
> > It is indeed not immediately clear where in your code or in some
> > dependency (including base, GHC, ...) a descriptor that contributes to
> > the RTS file reader/writer count (indexed by (dev, ino)) might be closed
> > without adjusting the count by calling the RTS `unlockFile()` function
> > (via GHC.IO.FD.release).
> >
> > It may be worth noting that GHC does not *reliably* prevent simultaneous
> > open handles for the same underlying file, because handles returned by
> > hDuplicate do not contribute to the count:
> >
> > demo.hs:
> >     import GHC.IO.Handle (hDuplicate)
> >     import System.IO
> >
> >     main :: IO ()
> >     main = do
> >         fh1 <- dupOpen "/tmp/foo"
> >         fh2 <- dupOpen "/tmp/foo"
> >         writeNow fh1 "abc\n"
> >         writeNow fh2 "def\n"
> >         readFile "/tmp/foo" >>= putStr
> >         hClose fh1
> >         hClose fh2
> >       where
> >         -- Look Mom, no lock!
> >         dupOpen path = do
> >             fh <- openFile path WriteMode
> >             hDuplicate fh <* hClose fh
> >
> >         writeNow fh s = hPutStr fh s >> hFlush fh
> >
> > $ ghc -O demo.hs
> > [1 of 2] Compiling Main ( demo.hs, demo.o )
> > [2 of 2] Linking demo
> > $ ./demo
> > def
> >
> > I am not sure that Haskell really should be holding the application's
> > hand in this area, corrupting output files by concurrent writers can
> > just as easily happen by running two independent processes. But letting
> > go of this guardrail would IIRC be a deviation from the Haskell report,
> > and there are likely applications that depend on this (and don't use
> > hDuplicate or equivalent to break the reader/writer tracking).
> >
> > --
> > Viktor.

From simon.peytonjones at gmail.com Fri Jul 11 09:33:08 2025
From: simon.peytonjones at gmail.com (Simon Peyton Jones)
Date: Fri, 11 Jul 2025 10:33:08 +0100
Subject: Reproducing a build failure
Message-ID:

Friends

In !14410, specifically on Fedora, I get this build failure. Some kind
of crash in some invocation of Haddock to make the docs for ghc-internal.

Question: how can I reproduce this failure on my own machine? It builds
fine for me.

Thanks

Simon

Error when running Shake build system:
  at want, called at src/Main.hs:127:44 in main:Main
* Depends on: binary-dist
  at need, called at src/Rules/BinaryDist.hs:311:13 in main:Rules.BinaryDist
* Depends on: binary-dist-dir
  at need, called at src/Rules/BinaryDist.hs:168:9 in main:Rules.BinaryDist
* Depends on: docs
  at need, called at src/Rules/Documentation.hs:137:9 in main:Rules.Documentation
* Depends on: _build/doc/html/index.html
  at need, called at src/Rules/Documentation.hs:214:9 in main:Rules.Documentation
* Depends on: _build/doc/html/libraries/index.html
  at need, called at src/Rules/Documentation.hs:251:9 in main:Rules.Documentation
* Depends on: _build/doc/html/libraries/ghc-internal-9.1400.0-9f99/ghc-internal.haddock
  at cmd, called at src/Builder.hs:404:5 in main:Builder
* Raised the exception:
Development.Shake.cmd, system command failed
Command line: _build/stage1/bin/haddock --verbosity=0 -B_build/stage1/lib --lib=_build/stage1/lib --odir=_build/doc/html/libraries/ghc-internal-9.1400.0-9f99 --dump-interface=_build/doc/html/libraries/ghc-internal-9.1400.0-9f99/ghc-internal.haddock --html '--title=ghc-internal-9.1400.0: Basic libraries' --prologue=_build/doc/html/libraries/ghc-internal-9.1400.0-9f99/haddock-prologue.txt --optghc=-D__HADDOCK_VERSION__=2300 --hide=GHC.Internal.Data.Typeable.Internal --hide=GHC.Internal.IO.Handle.Lock.Common --hide=GHC.Internal.IO.Handle.Lock.Flock --hide=GHC.Internal.IO.Handle.Lock.LinuxOFD --hide=GHC.Internal.IO.Handle.Lock.NoOp --hide=GHC.Internal.IO.Handle.Lock.Windows --hide=GHC.Internal.StaticPtr.Internal --hide=GHC.Internal.Event.Arr --hide=GHC.Internal.Event.Array --hide=GHC.Internal.Event.Internal --hide=GHC.Internal.Event.Internal.Types --hide=GHC.Internal.Event.IntTable --hide=GHC.Internal.Event.IntVar --hide=GHC.Internal.Event.PSQ --hide=GHC.Internal.Event.Unique --hide=GHC.Internal.Unicode.Bits --hide=GHC.Internal.Unicode.Char.DerivedCoreProperties
--hide=GHC.Internal.Unicode.Char.UnicodeData.GeneralCategory --hide=GHC.Internal.Unicode.Char.UnicodeData.SimpleLowerCaseMapping --hide=GHC.Internal.Unicode.Char.UnicodeData.SimpleTitleCaseMapping --hide=GHC.Internal.Unicode.Char.UnicodeData.SimpleUpperCaseMapping --hide=GHC.Internal.Unicode.Version --hide=GHC.Internal.System.Environment.ExecutablePath --hide=GHC.Internal.Bignum.Backend.GMP --hide=GHC.Internal.Event.Control --hide=GHC.Internal.Event.EPoll --hide=GHC.Internal.Event.KQueue --hide=GHC.Internal.Event.Manager --hide=GHC.Internal.Event.Poll --hide=GHC.Internal.Event.Thread --hide=GHC.Internal.Event.TimerManager --optghc=-hisuf --optghc=dyn_hi --optghc=-osuf --optghc=dyn_o --optghc=-hcsuf --optghc=dyn_hc --optghc=-fPIC --optghc=-dynamic --optghc=-hide-all-packages --optghc=-no-user-package-db '--optghc=-package-env -' '--optghc=-this-unit-id ghc-internal-9.1400.0-9f99' '--optghc=-this-package-name ghc-internal' '--optghc=-package-id rts-1.0.3' --optghc=-i --optghc=-i/builds/ghc/ghc/_build/stage1/libraries/ghc-internal/build --optghc=-i/builds/ghc/ghc/_build/stage1/libraries/ghc-internal/build/autogen --optghc=-i/builds/ghc/ghc/libraries/ghc-internal/src --optghc=-Irts/include --optghc=-I_build/stage1/libraries/ghc-internal/build --optghc=-I_build/stage1/libraries/ghc-internal/build/include --optghc=-Ilibraries/ghc-internal/include --optghc=-I/builds/ghc/ghc/rts/include --optghc=-I/builds/ghc/ghc/_build/stage1/rts/build/include --optghc=-optP-include --optghc=-optP_build/stage1/libraries/ghc-internal/build/autogen/cabal_macros.h --optghc=-optP-DBIGNUM_GMP --optghc=-outputdir --optghc=_build/stage1/libraries/ghc-internal/build --optghc=-this-unit-id --optghc=ghc-internal --optghc=-Wcompat --optghc=-Wnoncanonical-monad-instances --optghc=-XHaskell2010 --optghc=-XNoImplicitPrelude --optghc=-no-global-package-db --optghc=-package-db=/builds/ghc/ghc/_build/stage1/inplace/package.conf.d --optghc=-ghcversion-file=rts/include/ghcversion.h 
--optghc=-ghcversion-file=rts/include/ghcversion.h --optghc=-this-unit-id --optghc=ghc-internal --optghc=-Wcompat --optghc=-Wnoncanonical-monad-instances --optghc=-XHaskell2010 --optghc=-XNoImplicitPrelude --optghc=-no-global-package-db --optghc=-package-db=/builds/ghc/ghc/_build/stage1/inplace/package.conf.d --optghc=-ghcversion-file=rts/include/ghcversion.h --optghc=-ghcversion-file=rts/include/ghcversion.h --optghc=-Wno-deprecated-flags --optghc=-Wno-trustworthy-safe +RTS -t_build/stage1/haddock-timing-files/ghc-internal.t --machine-readable -RTS --hyperlinked-source --hoogle --quickjump @/tmp/extra-file-54706448418404-4067-11690
Exit code: 1
Stderr:
: error:
    panic! (the 'impossible' happened)
      GHC version 9.14.0.20250710:
        idDetails
      a_alNe
      Call stack:
          CallStack (from HasCallStack):
            pprPanic, called at compiler/GHC/Types/Var.hs:1079:43 in ghc-9.14.0.20250710-7d78:GHC.Types.Var

From facundo.dominguez at tweag.io Mon Jul 14 12:28:42 2025
From: facundo.dominguez at tweag.io (Facundo Domínguez)
Date: Mon, 14 Jul 2025 09:28:42 -0300
Subject: CStub in inline-java
Message-ID:

Dear devs,

I upgraded inline-java [1] to use GHC 9.10.2 recently. inline-java has a
GHC plugin responsible for embedding JVM bytecode in Haskell modules.
This message is to consult on the most appropriate way to do it.

The byte code is placed in a C literal array of bytes, and the C code is
added in mg_foreign together with C constructor functions that run when
loading the module to add the bytecode to a global bytecode table [2].
The global bytecode table is then fed to the JVM when initializing
inline-java.

The GHC API in 9.10.2 requires creating a CStub which does have a field
to indicate constructor functions [3], but I wouldn't know how to create
the CLabels it needs. So instead, I just put the constructor functions in
the getCStub field, and set getInitializers to an empty list.
The constructor functions seem to run at initialization just the same.

Should inline-java do this differently?

Thanks in advance!
Facundo

[1] https://github.com/tweag/inline-java
[2] https://github.com/tweag/inline-java/blob/c6b590c29ef9190164dec3a235b8e386c537bd29/src/Language/Java/Inline/Plugin.hs#L87
[3] https://gitlab.haskell.org/ghc/ghc/-/blob/3c37d30b07fc85fe09452f4ce250aec42cb1d2e4/compiler/GHC/Types/ForeignStubs.hs#L23

data CStub = CStub { getCStub :: SDoc
                   , getInitializers :: [CLabel]
                     -- ^ Initializers to be run at startup
                     -- See Note [Initializers and finalizers in Cmm] in
                     -- "GHC.Cmm.InitFini".
                   , getFinalizers :: [CLabel]
                     -- ^ Finalizers to be run at shutdown
                   }

--
All views and opinions expressed in this email message are the personal
opinions of the author and do not represent those of the organization or
its customers.

From klebinger.andreas at gmx.at Tue Jul 15 16:47:32 2025
From: klebinger.andreas at gmx.at (Andreas Klebinger)
Date: Tue, 15 Jul 2025 18:47:32 +0200
Subject: GHC LTS Releases
Message-ID: <228789da-caf2-4c9c-ab82-66736f6da4a9@gmx.at>

I'm pleased to announce LTS plans for GHC

Read all the details here:
https://www.haskell.org/ghc/blog/20250702-ghc-release-schedules.html

There is also a discourse thread if you have feedback:
https://discourse.haskell.org/t/ghc-lts-releases-the-glasgow-haskell-compiler/12469

Best Wishes
Andreas

From matthewtpickering at gmail.com Fri Jul 18 08:50:49 2025
From: matthewtpickering at gmail.com (Matthew Pickering)
Date: Fri, 18 Jul 2025 09:50:49 +0100
Subject: Improving the GHC proposals process
Message-ID:

Hi all,

It seems to me that the general opinion is that GHC developers are
reluctant to take part in the GHC proposals process.

This is an unfortunate state of affairs; the proposals process should be
one serving developers rather than a hindrance.
Therefore I have started a discussion on the ghc-proposals issue tracker
about some ways in which it might be made better.

https://github.com/ghc-proposals/ghc-proposals/issues/706

Please leave your thoughts there, on how you feel about the process in
general or concretely about the suggestions I have made.

Cheers,

Matt

From haskell at stefan-klinger.de Sun Jul 20 19:12:20 2025
From: haskell at stefan-klinger.de (Stefan Klinger)
Date: Sun, 20 Jul 2025 21:12:20 +0200
Subject: Correct parsers for bounded integral values
Message-ID:

Hello,

I'd like to bring to your attention a discussion that I have started
over at Haskell-cafe [1]. I was complaining about the silent overflow
of parsers for bounded integers:

    > read "298" :: Word8
    42

I find this unsatisfying, and I have demonstrated a solution [2] that
seems correct and performant.

To avoid cross posting, I'd rather point you to the discussion on the
Haskell-cafe mailing list [1].

I would like to help get this into GHC and also other parsing
libraries. Not all libraries suffer from this issue, but some do,
including GHC and its `base` package.

How can I attempt this? Should I file a bug report? Or try to follow
the instructions at ghc.dev [3] and submit a pull request? I would
start tinkering at [this point][4] as soon as I manage to compile GHC.

Would you want to share any advice beforehand? I'd really appreciate
to be stopped before wasting effort ;-) Also, I probably only can work
on this on weekends.

Cheers
Stefan


[1]: https://mail.haskell.org/pipermail/haskell-cafe/2025-July/137134.html
[2]: https://github.com/s5k6/robust-int
[3]: https://ghc.dev/
[4]: https://hackage.haskell.org/package/ghc-internal-9.1201.0/docs/src/GHC.Internal.Read.html#line-586


--
Stefan Klinger, Ph.D. -- computer scientist              o/X
http://stefan-klinger.de                                 /\/
https://github.com/s5k6                                    \
I prefer receiving plain text messages, not exceeding 32kB.
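
The wrap-around shown in the message above, and one simple way to reject
out-of-range input, can be sketched as follows. This is only an
illustration under stated assumptions: `readBounded` and `wrapped` are
hypothetical names, not functions in `base`, and this is not the code
from the robust-int repository [2]. The sketch parses through the
unbounded `Integer` type first, so unlike a digit-by-digit parser it
does not run in constant space.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Word (Word8)
import Text.Read (readMaybe)

-- The behaviour complained about: 298 does not fit into a Word8,
-- and `read` silently wraps it to 298 `mod` 256.
wrapped :: Word8
wrapped = read "298"

-- Hypothetical helper (an assumption for illustration, not part of base):
-- parse as an unbounded Integer, then check the result against the
-- bounds of the target type before converting.
readBounded :: forall a. (Bounded a, Integral a) => String -> Maybe a
readBounded s = do
  n <- readMaybe s :: Maybe Integer
  if n >= toInteger (minBound :: a) && n <= toInteger (maxBound :: a)
    then Just (fromInteger n)
    else Nothing
```

In GHCi, `readBounded "255" :: Maybe Word8` yields `Just 255`, while
`readBounded "298" :: Maybe Word8` yields `Nothing` instead of wrapping.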

From rodrigo.m.mesquita at gmail.com Sun Jul 20 19:34:50 2025
From: rodrigo.m.mesquita at gmail.com (Rodrigo Mesquita)
Date: Sun, 20 Jul 2025 20:34:50 +0100
Subject: Correct parsers for bounded integral values
In-Reply-To:
References:
Message-ID: <7A193809-A28D-4DA9-9CE8-02CE6C6C6752@gmail.com>

Hi Stefan, this looks like great (and well explained) work!

To modify the behavior of existing libraries you need to tackle the
change per library and upstream it by opening a ticket on the associated
issue tracker and discussing with the developers/maintainers.

For `base` itself there is a standardized process to introduce changes to
the behavior of the standard library. It’s called a Core Libraries
Committee (CLC) Proposal[1]. Since you have already written up very
precisely the problem, existing behavior, and proposed change, you should
be able to copy it relatively verbatim to a ticket on the CLC to begin an
official discussion.

That said, it is very good you have posted it to haskell-cafe and to
ghc-devs, as that hopefully makes a wider audience interested in your
proposal and gives them a chance to discuss it publicly before starting
the formal process of a proposal.

Regarding the implementation for `base` specifically, you can indeed open
a bug report to discuss with ghc and core libraries devs the issue and
implementation; that said, the final decision regarding changes to `base`
lies with the CLC and has to go through the proposal process. Typically
these kinds of tickets also serve to track the upstream CLC proposal once
it’s open.

Submitting a MR sounds like a good idea as well. You may find it
interesting to get GHC compiling, your change applied, and see how CI
responds to it :). The place you pointed at in the `base` module looks
about right.

By the way, if you find any trouble with compiling GHC or applying your
change, do reach out and we can help you get it working. You can ping me
(@alt-romes) on your Merge Request, or e.g. shoot an email here.

I don’t have a strong opinion regarding `read` for `Word8`, so I hope
other devs may comment on the contents of your proposal more
specifically.

Good luck!
Rodrigo Mesquita

[1]: https://github.com/haskell/core-libraries-committee

> On 20 Jul 2025, at 20:12, Stefan Klinger wrote:
>
> Hello,
>
> I'd like to bring to your attention a discussion that I have started
> over at Haskell-cafe [1]. I was complaining about the silent overflow
> of parsers for bounded integers:
>
>> read "298" :: Word8
> 42
>
> I find this unsatisfying, and I have demonstrated a solution [2] that
> seems correct and performant.
>
> To avoid cross posting, I'd rather point you to the discussion on the
> Haskell-cafe mailing list [1].
>
> I would like to help get this into GHC and also other parsing
> libraries. Not all libraries suffer from this issue, but some do,
> including GHC and its `base` package.
>
> How can I attempt this? Should I file a bug report? Or try to follow
> the instructions at ghc.dev [3] and submit a pull request? I would
> start tinkering at [this point][4] as soon as I manage to compile GHC.
>
> Would you want to share any advice beforehand? I'd really appreciate
> to be stopped before wasting effort ;-) Also, I probably only can work
> on this on weekends.
>
> Cheers
> Stefan
>
>
> [1]: https://mail.haskell.org/pipermail/haskell-cafe/2025-July/137134.html
> [2]: https://github.com/s5k6/robust-int
> [3]: https://ghc.dev/
> [4]: https://hackage.haskell.org/package/ghc-internal-9.1201.0/docs/src/GHC.Internal.Read.html#line-586
>
>
> --
> Stefan Klinger, Ph.D. -- computer scientist              o/X
> http://stefan-klinger.de                                 /\/
> https://github.com/s5k6                                    \
> I prefer receiving plain text messages, not exceeding 32kB.

From ietf-dane at dukhovni.org Sun Jul 20 20:08:59 2025
From: ietf-dane at dukhovni.org (Viktor Dukhovni)
Date: Mon, 21 Jul 2025 06:08:59 +1000
Subject: Correct parsers for bounded integral values
In-Reply-To:
References:
Message-ID:

On Sun, Jul 20, 2025 at 09:12:20PM +0200, Stefan Klinger wrote:

> I'd like to bring to your attention a discussion that I have started
> over at Haskell-cafe [1]. I was complaining about the silent overflow
> of parsers for bounded integers:
>
> > read "298" :: Word8
> 42

FWIW, there haven't AFAIK been any complaints about ByteString's readInt,
readWord, readInteger, readNatural and various sized variants having
overflow checks. But these have always been more like `reads` than
`read`, returning `Maybe (a, ByteString)`, so perhaps somewhat more
oriented towards detecting unexpected excess input, as well as for
some time now range overflow. So there's some precedent for overflow
checking, but...

It is also fair to point out that once an Int or other bounded integral
type is read, arithmetic with that type (addition, subtraction and
multiplication) silently overflows. And so silent overflow in `read`
is not inconsistent with the type's semantics.

If converting strings to numbers is in support of string-oriented
network protocols (e.g. the SIZE ESMTP extension), then one really
should make an effort to avoid silent overflow, but in that context the
various ByteString read methods are already available.

That said, if various middleware libraries hide overflows, because under
the covers they're using `read`, that could be a problem, so we do want
the ecosystem at large to make sensible choices about when silent
overflow may or may not be appropriate.
Perhaps that means having both wrapping and overflow-checked
implementations available, and clear docs with each about its behaviour
and the corresponding alternative.

> I find this unsatisfying, and I have demonstrated a solution [2] that
> seems correct and performant.

A few quick observations about [2]:

  - It disallows explicit leading "+" (just like "read", but perhaps
    that should be tolerated).

  - It disallows multiple leading zeros, perhaps these should be
    tolerated.

  - It disallows "-0", perhaps these should be tolerated, as well
    as "-0000", "-000001", ... (With lazy ByteStrings, which might
    never terminate, there is a generous, but sensible limit on
    the number of leading zeros allowed).

  - One way to avoid difficulties with handling negative minBound is
    to parse signed values via the corresponding unsigned type, which
    can accommodate `-minBound` as a positive value, and then negate
    the final result. This makes possible sharing the low-level
    digit-by-digit code between the positive and negative cases.

If parsing of Integer and Natural is also in scope, I would expect that
it avoids doing multi-precision arithmetic for each digit, parsing
groups of digits into ~Word sized blocks, and merge the blocks
hierarchically with only a logarithmic number of MP multiplies.

--
    Viktor.

From haskell at stefan-klinger.de Mon Jul 21 15:56:46 2025
From: haskell at stefan-klinger.de (Stefan Klinger)
Date: Mon, 21 Jul 2025 17:56:46 +0200
Subject: Correct parsers for bounded integral values
In-Reply-To:
References:
Message-ID:

Thanks for the encouragement Rodrigo! I'll follow the process and
hope to open a ticket soon.


Viktor Dukhovni (2025-Jul-21, excerpt):
> It is also fair to point out that once an Int or other bounded integral
> type is read, arithmetic with that type (addition, subtraction and
> multiplication) silently overflows. And so silent overflow in `read`
> is not inconsistent with the type's semantics.

I see parsing as a boundary between an outside world (throwing text at
me) and an inside world, where I have programmed some algorithm. As
the programmer, it is my responsibility to ensure that the types are
chosen so that the algorithm works correctly, ideally on any accepted
input, i.e., I have to guarantee that no inadvertent overflow happens
in this inside world. However, calculating away based on
misinterpreted input will lead to invalid results.


Viktor Dukhovni (2025-Jul-21, excerpt):
> That said, if various middleware libraries hide overflows, because under
> the covers they're using `read`, that could be a problem, so we do want
> the ecosystem at large to make sensible choices about when silent
> overflow may or may not be appropriate. Perhaps that means having
> both wrapping and overflow-checked implementations available, and
> clear docs with each about its behaviour and the corresponding
> alternative.

I did not realise this clearly enough before, but have elaborated a
bit on Haskell-cafe [1]. We do have unbounded `read :: String ->
Integer` and silently overflowing `fromInteger :: Integer -> Word8`,
which can be combined if overflow is desired. This follows the idea
to be explicit about dangerous things. In addition, we have `read ::
String -> Word8` and company, which I'd like to fix.

> A few quick observations about [2]:

Thank you =)

> - It disallows explicit leading "+" (just like "read", but perhaps
> that should be tolerated).

Yes, it probably should not be that strict. For my own projects I
assumed it easier to make it more forgiving later, than the other way
round. There really should be consensus on whether or not leading `+`
or `0` should be allowed. But these are fixes to make towards the
end, I guess.

> - It disallows multiple leading zeros, perhaps these should be
> tolerated.
>
> - It disallows "-0", perhaps these should be tolerated, as well
> as "-0000", "-000001", ...
> (With lazy ByteStrings, which might
> never terminate, there is a generous, but sensible limit on
> the number of leading zeros allowed).

I ruled this out because I wanted a simple guarantee for termination.
Your idea of “generous, but sensible” sounds compelling: the leading
`0`s can be consumed in constant space, so we need not keep them.

> - One way to avoid difficulties with handling negative minBound is
> to parse signed values via the corresponding unsigned type, which
> can accommodate `-minBound` as a positive value, and then negate
> the final result. This makes possible sharing the low-level
> digit-by-digit code between the positive and negative cases.

How do you mean? I did not get this “accommodate `-minBound` as a
positive value” right, my initial approach to use

    char '-' >> negate <$> parseUnsigned (negate minBound)

fails, exactly because the negation of the lower bound may not be
(read: is usually not) within the upper bound, and thus wraps around,
e.g., incorrectly `negate (minBound :: Int8)` → `-128` due to the
upper bound of `127`.


Viktor Dukhovni (2025-Jul-21, excerpt):
> If parsing of Integer and Natural is also in scope […]

No, not at all. I have no reservations against `read` for the
unbounded types. That should be left alone.

Cheers
Stefan


[1]: https://mail.haskell.org/pipermail/haskell-cafe/2025-July/137162.html
[2]: https://github.com/s5k6/robust-int


--
Stefan Klinger, Ph.D. -- computer scientist              o/X
http://stefan-klinger.de                                 /\/
https://github.com/s5k6                                    \
I prefer receiving plain text messages, not exceeding 32kB.
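
The wrap-around of `negate minBound` described above, and the idea of
carrying the magnitude of the bound in an unbounded type while tracking
the sign separately, can be sketched as follows. This is a minimal
illustration, not the robust-int code: `magnitude`, `parseSigned` and
`checked` are hypothetical names, the sketch assumes
`minBound <= 0 <= maxBound` for the target type, and it accumulates the
digits in a `Natural`, so it is not constant-space.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Char (digitToInt, isDigit)
import Data.Int (Int8)
import Numeric.Natural (Natural)

-- Negating within the bounded type wraps: this is still -128.
wrappedNegation :: Int8
wrappedNegation = negate minBound

-- Absolute value of a bound, computed in Integer so it cannot wrap.
magnitude :: Integral a => a -> Natural
magnitude = fromInteger . abs . toInteger

-- Hypothetical parser: optional '-', then ASCII digits. The magnitude is
-- compared against the bound matching the sign before converting.
-- Assumes minBound <= 0 <= maxBound for the target type.
parseSigned :: forall a. (Bounded a, Integral a) => String -> Maybe a
parseSigned ('-' : ds) = checked (magnitude (minBound :: a)) negate ds
parseSigned ds         = checked (magnitude (maxBound :: a)) id ds

-- Shared digit handling for both signs.
checked :: Integral a => Natural -> (Integer -> Integer) -> String -> Maybe a
checked limit sign ds
  | null ds || not (all isDigit ds) = Nothing
  | n > limit                       = Nothing
  | otherwise                       = Just (fromInteger (sign (toInteger n)))
  where
    n = foldl (\acc d -> acc * 10 + fromIntegral (digitToInt d)) 0 ds :: Natural
```

With these definitions, `parseSigned "-128" :: Maybe Int8` is accepted,
while `"-129"` and `"128"` are rejected rather than wrapped.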

From ietf-dane at dukhovni.org Mon Jul 21 18:01:25 2025
From: ietf-dane at dukhovni.org (Viktor Dukhovni)
Date: Tue, 22 Jul 2025 04:01:25 +1000
Subject: Correct parsers for bounded integral values
In-Reply-To:
References:
Message-ID:

On Mon, Jul 21, 2025 at 05:56:46PM +0200, Stefan Klinger wrote:

> > - One way to avoid difficulties with handling negative minBound is
> > to parse signed values via the corresponding unsigned type, which
> > can accommodate `-minBound` as a positive value, and then negate
> > the final result. This makes possible sharing the low-level
> > digit-by-digit code between the positive and negative cases.
>
> How do you mean? I did not get this “accommodate `-minBound` as a
> positive value” right, my initial approach to use
>
>     char '-' >> negate <$> parseUnsigned (negate minBound)
>
> fails, exactly because the negation of the lower bound may not be
> (read: is usually not) within the upper bound, and thus wraps around,
> e.g., incorrectly `negate (minBound :: Int8)` → `-128` due to the
> upper bound of `127`.

Timing is everything, you're trying to do the negation while the value
is still signed, instead it is necessary to convert it to a Natural
while also keeping track of the sign!

    {-# LANGUAGE RequiredTypeArguments #-}

    import Numeric.Natural (Natural)

    -- Ideally, compute both the sign and the absolute value in one go.

    minBoundAbs :: forall a -> (Bounded a, Integral a) => Natural
    minBoundAbs a = fromIntegral @Integer $ abs $ fromIntegral @a minBound

    maxBoundAbs :: forall a -> (Bounded a, Integral a) => Natural
    maxBoundAbs a = fromIntegral @Integer $ abs $ fromIntegral @a maxBound

    -- The required type argument is bound first, then the value.
    isNegative :: forall a -> (Bounded a, Integral a) => a -> Bool
    isNegative a x = fromIntegral @a @Integer x < 0

That said, the Bytestring code does not do that, rather it simply knows
that the absolute values of minBound and maxBound of the 8, 16, 32 and
64 bit signed integral types differ only in their last digit, and the
last digit of minBound is always 8, while for maxBound it is always 7.
For the unsigned types the last digit is always 5.

Since you're writing polymorphic parser code, rather than separate
logical functions for each of the fundamental integral types, you'll
need to take the high road and compute the absolute value of each bound
as above while keeping track of its sign.

To handle arbitrary Bounded, Integral types, the logic gets a bit more
complex, because the upper bound could also be negative, or the lower
bound could be non-negative, requiring some care to keep the following
overflow-safe:

    safeRead :: (Bounded a, Integral a) => String -> a
    safeReads :: (Bounded a, Integral a) => String -> [(a, String)]

--
    Viktor.

From klebinger.andreas at gmx.at Mon Jul 21 18:54:06 2025
From: klebinger.andreas at gmx.at (Andreas Klebinger)
Date: Mon, 21 Jul 2025 20:54:06 +0200
Subject: Correct parsers for bounded integral values
In-Reply-To:
References:
Message-ID: <231ada30-c81e-4544-8f16-d5b8dfcb2f29@gmx.at>

For base introducing a new function `readBoundedNum :: (Bounded a, Num
a) => String -> a` or similar seems very reasonable to me.
Changing "read" to throw an exception or similar after decades less so.


On 20/07/2025 22:08, Viktor Dukhovni wrote:
> On Sun, Jul 20, 2025 at 09:12:20PM +0200, Stefan Klinger wrote:
>
>> I'd like to bring to your attention a discussion that I have started
>> over at Haskell-cafe [1]. I was complaining about the silent overflow
>> of parsers for bounded integers:
>>
>> > read "298" :: Word8
>> 42
> FWIW, there haven't AFAIK been any complaints about ByteString's readInt,
> readWord, readInteger, readNatural and various sized variants having
> overflow checks. But these have always been more like `reads` than
> `read`, returning `Maybe (a, ByteString)`, so perhaps somewhat more
> oriented towards detecting unexpected excess input, as well as for
> some time now range overflow. So there's some precedent for overflow
> checking, but...
>
> It is also fair to point out that once an Int or other bounded integral
> type is read, arithmetic with that type (addition, subtraction and
> multiplication) silently overflows. And so silent overflow in `read`
> is not inconsistent with the type's semantics.
>
> If converting strings to numbers is in support of string-oriented
> network protocols (e.g. the SIZE ESMTP extension), then one really
> should make an effort to avoid silent overflow, but in that context the
> various ByteString read methods are already available.
>
> That said, if various middleware libraries hide overflows, because under
> the covers they're using `read`, that could be a problem, so we do want
> the ecosystem at large to make sensible choices about when silent
> overflow may or may not be appropriate. Perhaps that means having
> both wrapping and overflow-checked implementations available, and
> clear docs with each about its behaviour and the corresponding
> alternative.
>
>> I find this unsatisfying, and I have demonstrated a solution [2] that
>> seems correct and performant.
> A few quick observations about [2]:
>
> - It disallows explicit leading "+" (just like "read", but perhaps
> that should be tolerated).
>
> - It disallows multiple leading zeros, perhaps these should be
> tolerated.
>
> - It disallows "-0", perhaps these should be tolerated, as well
> as "-0000", "-000001", ... (With lazy ByteStrings, which might
> never terminate, there is a generous, but sensible limit on
> the number of leading zeros allowed).
>
> - One way to avoid difficulties with handling negative minBound is
> to parse signed values via the corresponding unsigned type, which
> can accommodate `-minBound` as a positive value, and then negate
> the final result. This makes possible sharing the low-level
> digit-by-digit code between the positive and negative cases.
> > If parsing of Integer and Natual is also in scope, I would expect that > it avoids doing multi-precision arithmetic for each digit, parsing > groups of digits into ~Word sized blocks, and merge the blocks > hierarchically with only a logarithmic number of MP multiplies. > From ben at well-typed.com Tue Jul 22 13:01:06 2025 From: ben at well-typed.com (Ben Gamari) Date: Tue, 22 Jul 2025 09:01:06 -0400 Subject: GHC 9.14.1 release status Message-ID: <87ms8wcrch.fsf@smart-cactus.org> Hi all, Last week GHC 9.14 officially forked. Over the past week I have been working to get the branch into a releasable state. Unfortunately, there are currently issues with the Windows release target that which are preventing preparation of the first alpha. I hope to have this resolved by the end of this week, allowing our first pre-release in the middle of next week. Ultimately, it is not unlikely that the release schedule will slip a bit as a result of these Windows issues. We will keep the lists informed as the situation develops. Cheers, - Ben -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 255 bytes Desc: not available URL: From iavor.diatchki at gmail.com Tue Jul 22 15:31:25 2025 From: iavor.diatchki at gmail.com (Iavor Diatchki) Date: Tue, 22 Jul 2025 08:31:25 -0700 Subject: Correct parsers for bounded integral values In-Reply-To: <231ada30-c81e-4544-8f16-d5b8dfcb2f29@gmx.at> References: <231ada30-c81e-4544-8f16-d5b8dfcb2f29@gmx.at> Message-ID: Hello, I also think that the instances for the bounded types are pretty unfortunate, but the change might have unintended consequences. I am not particularly opposed to it though. One thing to consider, though, is that it might be more productive to change the other parsing libraries (parsec, etc). For example, I almost never use ReadP for actual parsing that requires validation: it is slow and has no error reporting. 
I've only really used it for quick and dirty serialization in
combination with Show, and there the problem is less likely to happen.

Cheers,
Iavor
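
A minimal sketch of the overflow-checked reading discussed in this thread, for illustration only. The names `readBoundedMaybe`, `readInRange`, `splitSign` and `magnitude` are hypothetical and are not part of base or bytestring. Unlike the suggestion of going through the corresponding unsigned type, this sketch accumulates in Integer and stops as soon as the magnitude exceeds the larger of the two bounds, so the intermediate value never grows far past the bound. It also assumes that an explicit leading "+", leading zeros and "-0" are accepted, which the thread only raises as possibilities.

```haskell
import Data.Char (digitToInt, isDigit)
import Data.Int (Int8)
import Data.Word (Word8)

-- | Hypothetical helper: read an optionally signed decimal literal,
-- returning Nothing on malformed input or when the value does not fit
-- into the result type (instead of silently wrapping like 'read').
readBoundedMaybe :: (Bounded a, Integral a) => String -> Maybe a
readBoundedMaybe = go minBound maxBound
  where
    -- The two bounds are passed in only to pin down the result type.
    go :: Integral b => b -> b -> String -> Maybe b
    go lo hi s = fromInteger <$> readInRange (toInteger lo) (toInteger hi) s

-- | Parse a signed decimal and check it against inclusive bounds.
readInRange :: Integer -> Integer -> String -> Maybe Integer
readInRange lo hi s = do
  let (sign, digits) = splitSign s
  -- Early exit: no valid value has a magnitude above max |lo| |hi|.
  m <- magnitude (max (abs lo) (abs hi)) digits
  let n = sign * m
  if lo <= n && n <= hi then Just n else Nothing

-- | Strip one optional leading sign (assumption: "+" is tolerated).
splitSign :: String -> (Integer, String)
splitSign ('-' : ds) = (-1, ds)
splitSign ('+' : ds) = (1, ds)
splitSign ds = (1, ds)

-- | Digit-by-digit accumulation that gives up once the limit is passed.
-- Rejects the empty string and any non-digit character.
magnitude :: Integer -> String -> Maybe Integer
magnitude limit ds
  | null ds = Nothing
  | otherwise = step 0 ds
  where
    step acc [] = Just acc
    step acc (c : cs)
      | not (isDigit c) = Nothing
      | acc' > limit = Nothing
      | otherwise = step acc' cs
      where
        acc' = acc * 10 + toInteger (digitToInt c)

-- A few sample inputs at two concrete types.
examplesWord8 :: [Maybe Word8]
examplesWord8 = map readBoundedMaybe ["255", "298", "+007", "-0", "12x"]

examplesInt8 :: [Maybe Int8]
examplesInt8 = map readBoundedMaybe ["-128", "128", "-000001"]
```

With these definitions `readBoundedMaybe "298" :: Maybe Word8` is `Nothing`, whereas `read "298" :: Word8` wraps to 42 as quoted in the thread. Doing the range check in Integer keeps the code short; sharing a fixed-width unsigned accumulator between the positive and negative cases, as suggested in the thread, would avoid Integer entirely at the cost of more careful bound handling.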

From diegorosario2013 at gmail.com Mon Jul 28 00:16:04 2025
From: diegorosario2013 at gmail.com (Diego Antonio Rosario Palomino)
Date: Sun, 27 Jul 2025 19:16:04 -0500
Subject: Proposal: Roundtrip serialization of Cmm (parser-compatible pretty-printer output)
Message-ID:

Hello GHC devs,

I'm currently working on Cmm documentation and tooling improvements as
part of my Google Summer of Code project. One of my core goals is to
make Cmm roundtrip serializable.

Right now, the in-memory Cmm data structure (generated
programmatically, e.g., from STG via GHC) can be pretty-printed, and
Cmm can also be parsed. However, the pretty-printed version is not
compatible with the parser. That is, we cannot take the output of the
pretty printer and feed it directly back into the parser.

Example:

Parseable version:

    sum {
      cr:
        bits64 x;
        x = R1 + R2;
        R1 = x;
        jump %ENTRY_CODE(Sp(0))[R1];
    }

Pretty-printed version:

    sum() { // []
            { info_tbls: []
              stack_info: arg_space: 8
            }
        {offset
          cf: // global
              _ce::I64 = R1 + R2;
              R1 = _ce::I64;
              call (I64[Sp + 0 * 8])(R1) args: 8, res: 0, upd: 8;
        }
    }

Another example:

Parseable version:

    simple_sum_4 { // [R2, R1]
      cr: // global
        bits64 _cq;
        _cq = R2;
        bits64 _cp;
        _cp = R1;
        R1 = _cq + _cp;
        jump (bits64[Sp])[R1];
    }

Pretty-printed version:

    simple_sum_4() { // []
            { info_tbls: []
              stack_info: arg_space: 8
            }
        {offset
          cs: // global
              _cq::I64 = R2;
              _cr::I64 = R1;
              R1 = _cq::I64 + _cr::I64;
              call (I64[Sp])(R1) args: 8, res: 0, upd: 8;
        }
    }

While it’s possible to write parseable Cmm that resembles the
pretty-printed version (and hence the internal ADT), they don’t fully
match, mainly because the parser inserts inferred fields using
convenience functions.

Proposal:

To make roundtrip serialization possible, I propose supporting a new
syntax that matches the pretty printer output exactly.

There are a couple of design options:

   1. Create a separate parser that accepts the pretty-printed syntax.
      Files could then use either the current parser or the new strict
      one.

   2. Extend the current parser with a dedicated block syntax like:

          low_level_unwrapped {
            ...
          }

This second option is the one my mentor recommends, as it may better
reflect GHC developers' preferences. In this mode, the parser would not
insert any inferred data and would expect the input to match the
pretty-printed form exactly.

This would enable a true roundtrip:

   - Compile Haskell to Cmm (in-memory AST)
   - Pretty-print and write it to disk (wrapped in
     low_level_unwrapped { ... })
   - Later read it back using the parser and continue with codegen

Optional future direction:

As a side note: currently the parser has both a “high-level” and a
“low-level” mode. The low-level mode resembles the AST more closely but
still inserts some inferred data.

If we introduce this new “exact” low-level form, it's possible the
existing low-level mode could become redundant. We might then have:

   - High-level syntax
   - New low-level (exact)
   - And possibly deprecate the current low-level variant

I’d be interested in your thoughts on whether that direction makes
sense.

Serialization libraries?

One technically possible (but likely unacceptable) alternative would be
to derive serialization via a library like aeson. That would enable
serializing and deserializing the Cmm AST directly. However, I
understand that aeson adds a large dependency footprint, and likely
wouldn't be suitable for inclusion in GHC.

Final question:

Lastly, I’ve heard that parts of the Cmm pipeline may currently be
under refactoring. If that’s the case, could you point me to which
parts (parser, pretty printer, internal representation, etc.) are being
modified? I’d like to align my efforts accordingly and avoid conflicts.

Thanks very much for your time and input! I'm happy to iterate on this
based on your feedback.

Best regards,
Diego Antonio Rosario Palomino
GSoC 2025 – Cmm Documentation & Tooling
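
The roundtrip requirement in the proposal above can be stated as an executable property: parsing the pretty-printed form must give back exactly the value that was printed. The following is only a sketch on a toy statement language, assuming nothing about GHC's real Cmm types; `Expr`, `Stmt`, `pretty`, `parseStmt` and `roundTrips` are all hypothetical names, and the parser uses ReadP from base purely for brevity.

```haskell
import Data.Char (isAlpha, isAlphaNum, isDigit)
import Text.ParserCombinators.ReadP
  ( ReadP, char, eof, munch, munch1, option, readP_to_S
  , satisfy, skipSpaces, string, (<++) )

-- Toy AST, NOT real Cmm: registers, integer literals and addition.
data Expr = Reg String | Lit Integer | Add Expr Expr
  deriving (Eq, Show)

data Stmt = Assign String Expr
  deriving (Eq, Show)

-- Pretty-printer. Additions are always parenthesised so that the
-- printed text determines the tree shape uniquely.
pretty :: Stmt -> String
pretty (Assign r e) = r ++ " = " ++ prettyExpr e ++ ";"

prettyExpr :: Expr -> String
prettyExpr (Reg r) = r
prettyExpr (Lit n) = show n
prettyExpr (Add a b) = "(" ++ prettyExpr a ++ " + " ++ prettyExpr b ++ ")"

-- Parser for exactly the syntax that 'pretty' emits.
stmtP :: ReadP Stmt
stmtP = do
  r <- nameP
  skipSpaces
  _ <- char '='
  skipSpaces
  e <- exprP
  _ <- char ';'
  pure (Assign r e)

-- Names start with a letter, followed by letters or digits.
nameP :: ReadP String
nameP = (:) <$> satisfy isAlpha <*> munch isAlphaNum

exprP :: ReadP Expr
exprP = addP <++ litP <++ (Reg <$> nameP)
  where
    litP = do
      sign <- option id (negate <$ char '-')
      ds <- munch1 isDigit
      pure (Lit (sign (read ds)))
    addP = do
      _ <- char '('
      a <- exprP
      _ <- string " + "
      b <- exprP
      _ <- char ')'
      pure (Add a b)

-- Accept only a single parse that consumes the whole input.
parseStmt :: String -> Maybe Stmt
parseStmt s = case [x | (x, "") <- readP_to_S (stmtP <* eof) s] of
  [x] -> Just x
  _ -> Nothing

-- The roundtrip property: parse . pretty == Just.
roundTrips :: Stmt -> Bool
roundTrips s = parseStmt (pretty s) == Just s

-- Sample statements (names must satisfy 'nameP' for the property to hold).
samples :: [Stmt]
samples =
  [ Assign "R1" (Add (Reg "R1") (Reg "R2"))
  , Assign "x" (Add (Lit (-3)) (Add (Reg "R2") (Lit 8)))
  , Assign "R1" (Reg "x")
  ]
```

A regression test in this style would run `all roundTrips samples` over a corpus of generated values. For real Cmm the hard part is the one the proposal identifies: every field of the in-memory representation needs a printed form, otherwise the `Eq` comparison after re-parsing fails.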

From klebinger.andreas at gmx.at Mon Jul 28 07:46:21 2025
From: klebinger.andreas at gmx.at (Andreas Klebinger)
Date: Mon, 28 Jul 2025 09:46:21 +0200
Subject: Proposal: Roundtrip serialization of Cmm (parser-compatible pretty-printer output)
In-Reply-To:
References:
Message-ID:

The idea of making Cmm roundtripable comes up every now and then.
While the ability to feed dump output to GHC for debugging or similar
purposes is useful, in the end we always ended up prioritizing one of
the many other things that needed doing.

Or in other words, making Cmm (more) roundtripable seems inherently
useful. However, it's questionable how much it is worth breaking things
like .cmm code that exists in libraries for it. So if you want to work
towards this it should be with the goal to avoid breakage.

There are likely also a lot of corner cases to consider, which might
make this more complicated than it sounds. Ultimately this is up to you
and your mentor. But if I understand correctly you have about 5 weeks
left for GSoC, so getting full Cmm roundtrip ability into a state where
it can be merged into GHC during that time might be too optimistic,
depending on your Haskell/parser/GHC experience.

As a GHC maintainer, the most useful thing for us would therefore be
incremental patches which take Cmm closer to being roundtripable. That
would allow you to get at least some work that benefits the GHC project
into the tree even if you end up not making it all the way to full
roundtrip capability.

On the pure technical aspects:
-------------

> Create a separate parser ...

1. Creating a separate parser is not viable, at least not if you want
the GHC team to maintain it. It would likely bitrot and break on the
next change to Cmm and only cause increased maintenance overhead.

> Extend the current parser with a dedicated block

Having blocks a la C seems fine. Your suggestion seems different
however. It's unclear from your example how those blocks would work
exactly.
Is `low_level_unwrapped` a label? If so, can we goto it? Is it a
keyword? Something else entirely?

If the main issue is the "offset" string in the generated case I'm fine
with deleting that from the pretty printer. I'm not sure that does
anything of value so removing it from the output seems fine. (See
pprCmmGraph).

> If we introduce this new “exact” low-level form, it's possible the
> existing low-level mode could become redundant. We might then have:

What changes are you planning that make the new parser/syntax
incompatible with the old one? Can't you just modify the current
parser, maybe with some slight changes to the pretty printer, in a way
that makes it mostly backwards compatible?

> aeson adds a large dependency footprint, and likely wouldn't be
> suitable for inclusion in GHC.

Yes, aeson seems unsuitable.

> Lastly, I’ve heard that parts of the Cmm pipeline may currently be
> under refactoring.

This is the first time I hear of this, so I wonder where this
information came from? There could always be changes to those sorts of
things, because at the end of the day they are compiler internals. But
I'm not aware of any big planned changes in the near future.

Cheers
Andreas

From simon.peytonjones at gmail.com Mon Jul 28 08:38:51 2025
From: simon.peytonjones at gmail.com (Simon Peyton Jones)
Date: Mon, 28 Jul 2025 09:38:51 +0100
Subject: Proposal: Roundtrip serialization of Cmm (parser-compatible pretty-printer output)
In-Reply-To:
References:
Message-ID:

Diego

Like Andreas says, in general being able to parse the output that GHC
itself produces would be a good idea. A few thoughts

   - Do you have any use-cases in mind? Suppose you were 100%
     successful -- would anyone use it?

   - You need a compelling reason to change the input language
     (understood by the parser) since libraries may include .cmm files,
     which will break. (It'd be interesting to audit Hackage to see how
     many libraries do include such .cmm files.)

   - Rather than change the language understood by the parser, would it
     not be easier to change the language spat out by the
     pretty-printer to be compatible with the parser?

Simon

From hecate at glitchbra.in Mon Jul 28 16:03:53 2025
From: hecate at glitchbra.in (Hécate)
Date: Mon, 28 Jul 2025 18:03:53 +0200
Subject: Proposal: Roundtrip serialization of Cmm (parser-compatible pretty-printer output)
In-Reply-To:
References:
Message-ID: <5a92bd6d-23b7-44db-977d-35b21c2047a9@glitchbra.in>

Hi Diego,

Thank you very much for your work in this direction, it's sorely
needed.

I'm all for having proper roundtrip correctness for Cmm, but I am not
sure altering the parser is the way to go. In my opinion, GHC should
produce valid textual Cmm that can be ingested by the parser as it is
today.

Have a nice day,
Hécate

--
Hécate ✨
🐦: @TechnoEmpress
IRC: Hecate
WWW: https://glitchbra.in
RUN: BSD

From diegorosario2013 at gmail.com Mon Jul 28 18:26:31 2025
From: diegorosario2013 at gmail.com (Diego Antonio Rosario Palomino)
Date: Mon, 28 Jul 2025 13:26:31 -0500
Subject: Fwd: Proposal: Roundtrip serialization of Cmm (parser-compatible pretty-printer output)
In-Reply-To:
References: <5a92bd6d-23b7-44db-977d-35b21c2047a9@glitchbra.in>
Message-ID:

---------- Forwarded message ---------
From: Diego Antonio Rosario Palomino
Date: Mon, 28 Jul 2025 at 12:56 p.m.
Subject: Re: Proposal: Roundtrip serialization of Cmm (parser-compatible pretty-printer output)
To: Hécate

Hello all,

Thank you for the thoughtful responses so far, and thank you Simon for
summarizing Andreas's comments.

> Do you have any use-cases in mind? Suppose you were 100% successful --
> would anyone use it?

Yes: my mentor, Csaba Hruska, would. He's currently working on a custom
STG optimizer that uses experimental techniques to enable whole-program
optimizations for Haskell code. The intended pipeline is:

    GHC STG -> custom optimizer -> textual Cmm -> code generation

However, the current parseable Cmm is not sufficient for his use case,
because it cannot represent everything the Cmm AST can express.

Beyond this specific use case, achieving roundtrip serializability for
Cmm could make it a viable alternative to LLVM for Haskell projects.
Native code generation via Cmm is much faster than through LLVM. And
while outputting LLVM from Cmm currently produces less performant code
than directly targeting LLVM, I believe the inefficiencies could be
fixed relatively easily. Enabling such improvements is part of the
motivation for my documentation work: to help developers understand and
work with Cmm and its infrastructure.

> You need a compelling reason to change the input language (understood
> by the parser) since libraries may include .cmm files, which will
> break. (It'd be interesting to audit Hackage to see how many libraries
> do include such .cmm files.)

To clarify, this proposal would not break backwards compatibility.
There are two implementation paths:

   1. Introduce a second parser that accepts a syntax 100% identical to
      the pretty printer output.

   2. Extend the current parser by adding a mode (or block) that uses a
      distinct keyword (e.g., low_level_unwrapped) to indicate: "expect
      exact syntax, no convenience fills."

In either case, existing .cmm files would continue to be supported
as-is. The current parser wouldn't need features removed or changed;
the new syntax would only add capabilities.

> It's unclear from your example how those blocks would work exactly. Is
> low_level_unwrapped a label? If so can we goto it? Is it a keyword?
> Something else entirely?  (Andreas)

Apologies for the confusion. I’m not well-versed in the formal
terminology. To clarify: low_level_unwrapped (or very_low_level, or
another name) would be a keyword or syntactic construct that tells the
parser to interpret the contents of the block { ... } using a syntax
identical to what the pretty printer emits.

For example:

    function1 { }          // existing low-level syntax
    function2() { }        // existing high-level syntax
    very_low_level { ... } // new mode: code with exact pretty-printed
                           // syntax inside the block

> Rather than change the language understood by the parser, would it not
> be easier to change the language spat out by the pretty-printer to be
> compatible with the parser?

Unfortunately, that’s not a practical path forward. At the start of the
project, Csaba (my mentor) recommended leaving the parser mostly
untouched and focusing instead on extending the pretty printer.
However, we’ve realized that the differences between the parser and the
pretty printer are not trivial.

The parser, even in its current “low-level” mode, inserts inferred data
via convenience functions.
It *abstracts part of the structure*, meaning we cannot fully recover the original Cmm ADT just by parsing. In other words, *modifying the pretty printer to match the parser would require it to lose information* — which I strongly oppose. If Cmm is generated programmatically, the pretty-printed version would lack structural information present in the internal data structure. And parseable Cmm would still be *incapable of expressing all features of the AST*. I hope that also addresses your concern, Hécate. This GSoC project runs until *November 10th*. I was granted extra time since, unlike most participants, I’m not working through summer vacation — I’m in the Southern Hemisphere. (Also, I realize I previously used the wrong project name in this thread — the correct title of my GSoC project is *“Documenting and improving Cmm.”*) Regarding the risk of *bitrot* in a new parser or new syntax mode: one possible mitigation would be to add *regression tests* that check whether parsing a file and pretty-printing it results in compatible output. On a related note, I’ve noticed that *some Cmm examples in the documentation and even in source code comments are incorrect or outdated*. Part of my work includes identifying and correcting these inconsistencies. Thanks again to everyone for your time and input — I greatly appreciate the discussion and feedback. Best regards, *Diego Antonio Rosario Palomino* GSoC 2025 – Documenting and improving Cmm On Mon, 28 Jul 2025 at 11:04 a.m., Hécate via ghc-devs (ghc-devs at haskell.org) wrote: > Hi Diego, > > Thank you very much for your work in this direction, it's sorely needed. > > I'm all for having proper roundtrip correctness for Cmm, but I am not sure > altering the parser is the way to go. > In my opinion, GHC should produce valid textual Cmm, that can be ingested > by the parser as it is today.
> > Have a nice day, > Hécate > Le 28/07/2025 à 02:16, Diego Antonio Rosario Palomino a écrit : > > Hello GHC devs, > > I'm currently working on Cmm documentation and tooling improvements as > part of my Google Summer of Code project. One of my core goals is to make > Cmm roundtrip serializable. > > Right now, the in-memory Cmm data structure—generated programmatically > (e.g., from STG via GHC)—can be pretty-printed, and Cmm can also be parsed. > However, the pretty-printed version is not compatible with the parser. That > is, we cannot take the output of the pretty printer and feed it directly > back into the parser. > > Example: > > Parseable version: > > sum { > cr: > bits64 x; > x = R1 + R2; > R1 = x; > jump %ENTRY_CODE(Sp(0))[R1]; > } > > Pretty-printed version: > > sum() { // [] > { info_tbls: [] > stack_info: arg_space: 8 > } > {offset > cf: // global > _ce::I64 = R1 + R2; > R1 = _ce::I64; > call (I64[Sp + 0 * 8])(R1) args: 8, res: 0, upd: 8; > } > } > > Another example: > > Parseable version: > > simple_sum_4 { // [R2, R1] > cr: // global > bits64 _cq; > _cq = R2; > bits64 _cp; > _cp = R1; > R1 = _cq + _cp; > jump (bits64[Sp])[R1]; > } > > Pretty-printed version: > > simple_sum_4() { // [] > { info_tbls: [] > stack_info: arg_space: 8 > } > {offset > cs: // global > _cq::I64 = R2; > _cr::I64 = R1; > R1 = _cq::I64 + _cr::I64; > call (I64[Sp])(R1) args: 8, res: 0, upd: 8; > } > } > > While it’s possible to write parseable Cmm that resembles the > pretty-printed version (and hence the internal ADT), they don’t fully > match—mainly because the parser inserts inferred fields using convenience > functions. > > Proposal: > > To make roundtrip serialization possible, I propose supporting a new > syntax that matches the pretty printer output exactly. > > There are a couple of design options: > > 1. > > Create a separate parser that accepts the pretty-printed syntax. Files > could then use either the current parser or the new strict one. > 2. 
> > Extend the current parser with a dedicated block syntax like: > > low_level_unwrapped { > ... > } > > This second option is the one my mentor recommends, as it may better > reflect GHC developers' preferences. In this mode, the parser would not > insert any inferred data and would expect the input to match the > pretty-printed form exactly. > > This would enable a true roundtrip: > > - > > Compile Haskell to Cmm (in-memory AST) > - > > Pretty-print and write it to disk (wrapped in low_level_unwrapped { > ... }) > - > > Later read it back using the parser and continue with codegen > > Optional future direction: > > As a side note: currently the parser has both a “high-level” and a > “low-level” mode. The low-level mode resembles the AST more closely but > still inserts some inferred data. > > If we introduce this new “exact” low-level form, it's possible the > existing low-level mode could become redundant. We might then have: > > - > > High-level syntax > - > > New low-level (exact) > - > > And possibly deprecate the current low-level variant > > I’d be interested in your thoughts on whether that direction makes sense. > > Serialization libraries? > > One technically possible—but likely unacceptable—alternative would be to > derive serialization via a library like aeson. That would enable > serializing and deserializing the Cmm AST directly. However, I understand > that aeson adds a large dependency footprint, and likely wouldn't be > suitable for inclusion in GHC. > > Final question: > > Lastly—I’ve heard that parts of the Cmm pipeline may currently be under > refactoring. If that’s the case, could you point me to which parts (parser, > pretty printer, internal representation, etc.) are being modified? I’d like > to align my efforts accordingly and avoid conflicts. > > Thanks very much for your time and input! I'm happy to iterate on this > based on your feedback. 
> > Best regards, > Diego Antonio Rosario Palomino > GSoC 2025 – Cmm Documentation & Tooling > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -- > Hécate ✨ > 🐦: @TechnoEmpress > IRC: Hecate > WWW: https://glitchbra.in > RUN: BSD > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben at smart-cactus.org Mon Jul 28 18:55:19 2025 From: ben at smart-cactus.org (Ben Gamari) Date: Mon, 28 Jul 2025 14:55:19 -0400 Subject: Fwd: Proposal: Roundtrip serialization of Cmm (parser-compatible pretty-printer output) In-Reply-To: References: <5a92bd6d-23b7-44db-977d-35b21c2047a9@glitchbra.in> Message-ID: <87ecu0b0x8.fsf@smart-cactus.org> Diego Antonio Rosario Palomino writes: > ---------- Forwarded message --------- > From: Diego Antonio Rosario Palomino > Date: Mon, 28 Jul 2025 at 12:56 p.m. > Subject: Re: Proposal: Roundtrip serialization of Cmm (parser-compatible > pretty-printer output) > To: Hécate > > > Hello all, > > Thank you for the thoughtful responses so far, and thank you Simon for > summarizing Andreas's comments. > Hi Diego, In the future it would make things easier if you could use one of the common email quoting conventions (i.e. starting lines with >). It is otherwise a bit hard to distinguish your replies from the questions you are responding to. > > *"Do you have any use-cases in mind? Suppose you were 100% successful — > > would anyone use it?"* > > Yes — my mentor, *Csaba Hruska*, would. He's currently working on a custom > STG optimizer that uses experimental techniques to enable whole-program > optimizations for Haskell code.
The intended pipeline is: > > > *GHC STG → custom optimizer → textual Cmm → code generation* > > However, the current *parseable* Cmm is not sufficient for his use case, > because it *cannot represent everything the Cmm AST can express*. > > Beyond this specific use case, achieving *roundtrip serializability* for > Cmm could make it a *viable alternative to LLVM* for Haskell projects. > Native code generation via Cmm is much faster than through LLVM. And while > outputting LLVM from Cmm currently produces *less performant* code than > directly targetting LLVM, I believe the inefficiencies could be fixed > relatively easily. Enabling such improvements is part of the motivation for > my documentation work — to help developers understand and work with Cmm and > its infrastructure. > > > *"You need a compelling reason to change the input language (understood by > > the parser) since libraries may include .cmm files, which will break. (It'd > > be interesting to audit Hackage to see how many libraries do include such > > .cmm files.)"* > > To clarify, this proposal would *not* break backwards compatibility. There > are two implementation paths: > > 1. Introduce a *second parser* that accepts a syntax 100% identical to the > pretty printer output. > > 2. Extend the *current parser* by adding a mode (or block) that uses a > distinct keyword (e.g., low_level_unwrapped) to indicate: "expect exact > syntax, no convenience fills." > > In either case, existing .cmm files would continue to be supported as-is. > The current parser wouldn't need features removed or changed — the new > syntax would *only add capabilities*. > Duplicating the parser seems like a very heavy cost to pay here. Do we have a concrete list of places where the parsed grammar differs from that which is produced? I feel it might be useful to get a sense of how much divergence there is before we entertain such drastic steps. > > *"It’s unclear from your example how those blocks would work exactly. 
Is > > low_level_unwrapped a label? If so can we goto it? Is it a keyword? > > Something else entirely?"* — Andreas > > Apologies for the confusion — I’m not well-versed in the formal terminology. > > To clarify: low_level_unwrapped (or very_low_level, or another name) would > be a *keyword or syntactic construct* that tells the parser to interpret > the contents of the block { ... } using a syntax *identical to what the > pretty printer emits*. > > For example: > > function1 { } // existing low-level syntax > function2() { } // existing high-level syntax > > very_low_level { ... } // new mode: code with exact pretty-printed > syntax inside the block > > > *"Rather than change the language understood by the parser, would it not be > > easier to change the language spat out by the pretty-printer to be > > compatible with the parser?"* > > Unfortunately, that’s not a practical path forward. > > At the start of the project, Csaba (my mentor) recommended leaving the > parser mostly untouched and focusing instead on extending the pretty > printer. However, we’ve realized that the differences between the parser > and the pretty printer are not trivial. The parser — even in its current > “low-level” mode — *inserts inferred data* via convenience functions. > It *abstracts part of the structure*, meaning we cannot fully recover > the original Cmm ADT just by parsing. > Sure, but instead of adding a whole new branch to the grammar, why don't we start by enumerating the specific places where the Cmm parser elaborates. We can then introduce specific productions to allow expression of those particular cases. Ideally the existing productions would be special cases of the new, more expressive productions. > In other words, *modifying the pretty printer to match the parser would > require it to lose information* — which I strongly oppose. If Cmm is > generated programmatically, the pretty-printed version would lack > structural information present in the internal data structure. 
And > parseable Cmm would still be *incapable of expressing all features of the > AST*. > > I hope that also addresses your concern, Hécate. > > This GSoC project runs until *November 10th*. I was granted extra time > since, unlike most participants, I’m not working through summer vacation — > I’m in the Southern Hemisphere. > > (Also, I realize I previously used the wrong project name in this thread — > the correct title of my GSoC project is *“Documenting and improving Cmm.”*) > > Regarding the risk of *bitrot* in a new parser or new syntax mode: one > possible mitigation would be to add *regression tests* that check whether > parsing a file and pretty-printing it results in compatible output. > Yes, this would alert us of some cases of bitrot (specifically, those cases that we think to test, although that set can be very large with property testing). Nevertheless, fixing it still requires effort and maintenance effort is something that we must weigh. > On a related note, I’ve noticed that *some Cmm examples in the > documentation and even in source code comments are incorrect or outdated*. > Part of my work includes identifying and correcting these inconsistencies. That is great. Do open merge requests as you find these. It would be great to get these into the tree now rather than build up a large backlog for review at the end of the project. Cheers, - Ben -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 255 bytes Desc: not available URL: From hecate at glitchbra.in Mon Jul 28 21:30:47 2025 From: hecate at glitchbra.in (=?UTF-8?Q?H=C3=A9cate?=) Date: Mon, 28 Jul 2025 23:30:47 +0200 Subject: Fwd: Proposal: Roundtrip serialization of Cmm (parser-compatible pretty-printer output) In-Reply-To: References: <5a92bd6d-23b7-44db-977d-35b21c2047a9@glitchbra.in> Message-ID: <34b02fa9-e7e4-4e17-ae99-011cf736b184@glitchbra.in> Thanks a lot Diego, that indeed addresses my concerns. 
:) Le 28/07/2025 à 20:26, Diego Antonio Rosario Palomino a écrit : > > > ---------- Forwarded message --------- > De: *Diego Antonio Rosario Palomino* > Date: lun, 28 jul 2025 a la(s) 12:56 p.m. > Subject: Re: Proposal: Roundtrip serialization of Cmm > (parser-compatible pretty-printer output) > To: Hécate > > > Hello all, > > Thank you for the thoughtful responses so far, and thank you Simon for > summarizing Andreas's comments. > > /"Do you have any use-cases in mind? Suppose you were 100% > successful — would anyone use it?"/ > > Yes — my mentor, *Csaba Hruska*, would. He's currently working on a > custom STG optimizer that uses experimental techniques to enable > whole-program optimizations for Haskell code. The intended pipeline is: > > *GHC STG → custom optimizer → textual Cmm → code generation* > > However, the current /parseable/ Cmm is not sufficient for his use > case, because it *cannot represent everything the Cmm AST can express*. > > Beyond this specific use case, achieving *roundtrip serializability* > for Cmm could make it a *viable alternative to LLVM* for Haskell > projects. Native code generation via Cmm is much faster than through > LLVM. And while outputting LLVM from Cmm currently produces /less > performant/ code than directly targetting LLVM, I believe the > inefficiencies could be fixed relatively easily. Enabling such > improvements is part of the motivation for my documentation work — to > help developers understand and work with Cmm and its infrastructure. > > /"You need a compelling reason to change the input language > (understood by the parser) since libraries may include .cmm files, > which will break. (It'd be interesting to audit Hackage to see how > many libraries do include such .cmm files.)"/ > > To clarify, this proposal would *not* break backwards compatibility. > There are two implementation paths: > > 1. > > Introduce a *second parser* that accepts a syntax 100% identical > to the pretty printer output. > > 2. 
> > Extend the *current parser* by adding a mode (or block) that uses > a distinct keyword (e.g., |low_level_unwrapped|) to indicate: > "expect exact syntax, no convenience fills." > > In either case, existing |.cmm| files would continue to be supported > as-is. The current parser wouldn't need features removed or changed — > the new syntax would *only add capabilities*. > > /"It’s unclear from your example how those blocks would work > exactly. Is |low_level_unwrapped| a label? If so can we |goto| it? > Is it a keyword? Something else entirely?"/ — Andreas > > Apologies for the confusion — I’m not well-versed in the formal > terminology. > > To clarify: |low_level_unwrapped| (or |very_low_level|, or another > name) would be a *keyword or syntactic construct* that tells the > parser to interpret the contents of the block |{ ... }| using a syntax > *identical to what the pretty printer emits*. > > For example: > > |function1 { } // existing low-level syntax function2() { } // > existing high-level syntax very_low_level { ... } // new mode: code > with exact pretty-printed syntax inside the block | > > /"Rather than change the language understood by the parser, would > it not be easier to change the language spat out by the > pretty-printer to be compatible with the parser?"/ > > Unfortunately, that’s not a practical path forward. > > At the start of the project, Csaba (my mentor) recommended leaving the > parser mostly untouched and focusing instead on extending the pretty > printer. However, we’ve realized that the differences between the > parser and the pretty printer are not trivial. The parser — even in > its current “low-level” mode — *inserts inferred data* via convenience > functions. It *abstracts part of the structure*, meaning we cannot > fully recover the original Cmm ADT just by parsing. > > In other words, *modifying the pretty printer to match the parser > would require it to /lose information/* — which I strongly oppose. 
If > Cmm is generated programmatically, the pretty-printed version would > lack structural information present in the internal data structure. > And parseable Cmm would still be *incapable of expressing all features > of the AST*. > > I hope that also addresses your concern, Hécate. > > This GSoC project runs until *November 10th*. I was granted extra time > since, unlike most participants, I’m not working through summer > vacation — I’m in the Southern Hemisphere. > > (Also, I realize I previously used the wrong project name in this > thread — the correct title of my GSoC project is *“Documenting and > improving Cmm.”*) > > Regarding the risk of *bitrot* in a new parser or new syntax mode: one > possible mitigation would be to add *regression tests* that check > whether parsing a file and pretty-printing it results in compatible > output. > > On a related note, I’ve noticed that *some Cmm examples in the > documentation and even in source code comments are incorrect or > outdated*. Part of my work includes identifying and correcting these > inconsistencies. > > Thanks again to everyone for your time and input — I greatly > appreciate the discussion and feedback. > > Best regards, > *Diego Antonio Rosario Palomino* > GSoC 2025 – Documenting and improving Cmm > > > El lun, 28 jul 2025 a la(s) 11:04 a.m., Hécate via ghc-devs > (ghc-devs at haskell.org) escribió: > > Hi Diego, > > Thank you very much for your work in this direction, it's sorely > needed. > > I'm all for having proper roundtrip correctness for Cmm, but I am > not sure altering the parser is the way to go. > In my opinion, GHC should produce valid textual Cmm, that can be > ingested by the parser at it is today. > > Have a nice day, > Hécate > > Le 28/07/2025 à 02:16, Diego Antonio Rosario Palomino a écrit : >> >> Hello GHC devs, >> >> I'm currently working on Cmm documentation and tooling >> improvements as part of my Google Summer of Code project. 
One of >> my core goals is to make Cmm roundtrip serializable. >> >> Right now, the in-memory Cmm data structure—generated >> programmatically (e.g., from STG via GHC)—can be pretty-printed, >> and Cmm can also be parsed. However, the pretty-printed version >> is not compatible with the parser. That is, we cannot take the >> output of the pretty printer and feed it directly back into the >> parser. >> >> Example: >> >> Parseable version: >> >> |sum { cr: bits64 x; x = R1 + R2; R1 = x; jump >> %ENTRY_CODE(Sp(0))[R1]; } | >> >> Pretty-printed version: >> >> |sum() { // [] { info_tbls: [] stack_info: arg_space: 8 } {offset >> cf: // global _ce::I64 = R1 + R2; R1 = _ce::I64; call (I64[Sp + 0 >> * 8])(R1) args: 8, res: 0, upd: 8; } } | >> >> Another example: >> >> Parseable version: >> >> |simple_sum_4 { // [R2, R1] cr: // global bits64 _cq; _cq = R2; >> bits64 _cp; _cp = R1; R1 = _cq + _cp; jump (bits64[Sp])[R1]; } | >> >> Pretty-printed version: >> >> |simple_sum_4() { // [] { info_tbls: [] stack_info: arg_space: 8 >> } {offset cs: // global _cq::I64 = R2; _cr::I64 = R1; R1 = >> _cq::I64 + _cr::I64; call (I64[Sp])(R1) args: 8, res: 0, upd: 8; } } | >> >> While it’s possible to write parseable Cmm that resembles the >> pretty-printed version (and hence the internal ADT), they don’t >> fully match—mainly because the parser inserts inferred fields >> using convenience functions. >> >> Proposal: >> >> To make roundtrip serialization possible, I propose supporting a >> new syntax that matches the pretty printer output exactly. >> >> There are a couple of design options: >> >> 1. >> >> Create a separate parser that accepts the pretty-printed >> syntax. Files could then use either the current parser or the >> new strict one. >> >> 2. >> >> Extend the current parser with a dedicated block syntax like: >> >> |low_level_unwrapped { ... } | >> >> This second option is the one my mentor recommends, as it may >> better reflect GHC developers' preferences. 
In this mode, the >> parser would not insert any inferred data and would expect the >> input to match the pretty-printed form exactly. >> >> This would enable a true roundtrip: >> >> * >> >> Compile Haskell to Cmm (in-memory AST) >> >> * >> >> Pretty-print and write it to disk (wrapped in >> low_level_unwrapped { ... }) >> >> * >> >> Later read it back using the parser and continue with codegen >> >> Optional future direction: >> >> As a side note: currently the parser has both a “high-level” and >> a “low-level” mode. The low-level mode resembles the AST more >> closely but still inserts some inferred data. >> >> If we introduce this new “exact” low-level form, it's possible >> the existing low-level mode could become redundant. We might then >> have: >> >> * >> >> High-level syntax >> >> * >> >> New low-level (exact) >> >> * >> >> And possibly deprecate the current low-level variant >> >> I’d be interested in your thoughts on whether that direction >> makes sense. >> >> Serialization libraries? >> >> One technically possible—but likely unacceptable—alternative >> would be to derive serialization via a library like |aeson|. That >> would enable serializing and deserializing the Cmm AST directly. >> However, I understand that |aeson| adds a large dependency >> footprint, and likely wouldn't be suitable for inclusion in GHC. >> >> Final question: >> >> Lastly—I’ve heard that parts of the Cmm pipeline may currently be >> under refactoring. If that’s the case, could you point me to >> which parts (parser, pretty printer, internal representation, >> etc.) are being modified? I’d like to align my efforts >> accordingly and avoid conflicts. >> >> Thanks very much for your time and input! I'm happy to iterate on >> this based on your feedback. 
>> >> Best regards, >> Diego Antonio Rosario Palomino >> GSoC 2025 – Cmm Documentation & Tooling >> >> >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > -- > Hécate ✨ > 🐦: @TechnoEmpress > IRC: Hecate > WWW:https://glitchbra.in > RUN: BSD > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs -- Hécate ✨ 🐦: @TechnoEmpress IRC: Hecate WWW:https://glitchbra.in RUN: BSD -------------- next part -------------- An HTML attachment was scrubbed... URL: From diegorosario2013 at gmail.com Tue Jul 29 00:39:40 2025 From: diegorosario2013 at gmail.com (Diego Antonio Rosario Palomino) Date: Mon, 28 Jul 2025 19:39:40 -0500 Subject: Proposal: Roundtrip serialization of Cmm (parser-compatible pretty-printer output) Message-ID: >Hi Diego, >In the future it would make things easier if you could use one of the >common email quoting conventions (i.e. starting lines with >). It is >otherwise a bit hard to distinguish your replies from the questions >you are responding to. I am sorry, Ben Gamari. I am not used to working in mailing lists. I also messed up the formatting in my last reply. I accidentally created a second topic (but this comment uses the original topic). Please tell me if this comment uses an appropriate format so I can proceed to answer the rest of your reply. Btw, some months back my first topic on this mailing list just linked to the corresponding Discourse post: https://discourse.haskell.org/t/gsoc-2025-documenting-and-improving-cmm/11870/15 Would it be acceptable to use Discourse for my next topic regarding this project? (Maybe with a link here, on this mailing list.) 
Diego Antonio Rosario Palomino -------------- next part -------------- An HTML attachment was scrubbed... URL: From zubin at well-typed.com Tue Jul 29 03:19:05 2025 From: zubin at well-typed.com (Zubin Duggal) Date: Tue, 29 Jul 2025 08:49:05 +0530 Subject: GHC 9.10.3-rc1 is now available Message-ID: The GHC developers are very pleased to announce the availability of the release candidate for GHC 9.10.3. Binary distributions, source distributions, and documentation are available at [downloads.haskell.org][] and via [GHCup](https://www.haskell.org/ghcup/). GHC 9.10.3 is a bug-fix release fixing over 50 issues of a variety of severities and scopes. A full accounting of these fixes can be found in the [release notes][]. As always, GHC's release status, including planned future releases, can be found on the GHC Wiki [status][]. This release candidate will have a two-week testing period. If all goes well, the final release will be available the week of 11 August 2025. We would like to thank Well-Typed, Tweag I/O, Juspay, QBayLogic, Channable, Serokell, SimSpace, the Haskell Foundation, and other anonymous contributors whose ongoing financial and in-kind support has facilitated GHC maintenance and release management over the years. Finally, this release would not have been possible without the hundreds of open-source contributors whose work comprises this release. As always, do give this release a try and open a [ticket][] if you see anything amiss. [release notes]: https://gitlab.haskell.org/ghc/ghc/-/blob/ghc-9.10/docs/users_guide/9.10.3-notes.rst?ref_type=heads&plain=1 [status]: https://gitlab.haskell.org/ghc/ghc/-/wikis/GHC-status [downloads.haskell.org]: https://downloads.haskell.org/ghc/9.10.3-rc1 [ticket]: https://gitlab.haskell.org/ghc/ghc/-/issues/new -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From simon.peytonjones at gmail.com Wed Jul 30 08:14:49 2025 From: simon.peytonjones at gmail.com (Simon Peyton Jones) Date: Wed, 30 Jul 2025 09:14:49 +0100 Subject: GHC/ | Failed pipeline for wip/T23109 | a90f83d4 In-Reply-To: <6889d2541ad9b_2a3739c5bfc342dc@nixos.mail> References: <6889d2541ad9b_2a3739c5bfc342dc@nixos.mail> Message-ID: Dear GHC devs This mr !10479 seems Utterly Broken. Looks as if something is wrong with CI. Can anyone help? https://gitlab.haskell.org/ghc/ghc/-/merge_requests/10479 Thanks Simon On Wed, 30 Jul 2025 at 09:05, GitLab wrote: > [image: GitLab] > [image: ✖] Pipeline #113076 has failed! > > Project Glasgow Haskell Compiler / GHC > > Branch > wip/T23109 > Commit > a90f83d4 > > in !10479 > Fix mergo bugs > Commit Author > Simon Peyton Jones > > Pipeline #113076 > triggered by Simon Peyton Jones > had 7 failed jobs > Failed jobs > [image: ✖] tool-lint lint-submods > > [image: ✖] tool-lint lint-linters > > [image: ✖] tool-lint typecheck-testsuite > > [image: ✖] packaging project-version > > [image: ✖] tool-lint lint-testsuite > > [image: ✖] tool-lint ghc-linters > > [image: ✖] tool-lint lint-author > > [image: GitLab] > You're receiving this email because of your account on gitlab.haskell.org. > Manage all notifications > · Help > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bryan at haskell.foundation Wed Jul 30 11:08:08 2025 From: bryan at haskell.foundation (Bryan Richter) Date: Wed, 30 Jul 2025 14:08:08 +0300 Subject: GHC/ | Failed pipeline for wip/T23109 | a90f83d4 In-Reply-To: References: <6889d2541ad9b_2a3739c5bfc342dc@nixos.mail> Message-ID: It looks like most lint jobs are failing because they want to run on a Docker image that is missing from the GitLab registry. I know Ben talked about having some scheme for expiring images. I don't think that has been put into place, however. 
There could also just be fallout from the server migration, even though that took place a few months ago. I have seen more of these types of failures since then. Since I haven't touched the ci-images repo recently, somebody else will need to look at this. fwiw it does look like a primary victim is the "OpenCape lint runner", as seen here: https://grafana.gitlab.haskell.org/d/167r9v6nk/ci-spurious-failures?orgId=2&from=now-30d&to=now&var-types=$__all&var-runners=OpenCape%20lint%20runner&var-jobs=$__all&refresh=15m&timezone=browser On Wed, 30 Jul 2025 at 11:15, Simon Peyton Jones < simon.peytonjones at gmail.com> wrote: > Dear GHC devs > > This mr !10479 seems Utterly Broken. Looks as if something is wrong with > CI. Can anyone help? > > https://gitlab.haskell.org/ghc/ghc/-/merge_requests/10479 > > Thanks > > Simon > > On Wed, 30 Jul 2025 at 09:05, GitLab wrote: > >> [image: GitLab] >> [image: ✖] Pipeline #113076 has failed! >> >> Project Glasgow Haskell Compiler / GHC >> >> Branch >> wip/T23109 >> Commit >> a90f83d4 >> >> in !10479 >> Fix mergo bugs >> Commit Author >> Simon Peyton Jones >> >> Pipeline #113076 >> triggered by Simon Peyton Jones >> had 7 failed jobs >> Failed jobs >> [image: ✖] tool-lint lint-submods >> >> [image: ✖] tool-lint lint-linters >> >> [image: ✖] tool-lint typecheck-testsuite >> >> [image: ✖] packaging project-version >> >> [image: ✖] tool-lint lint-testsuite >> >> [image: ✖] tool-lint ghc-linters >> >> [image: ✖] tool-lint lint-author >> >> [image: GitLab] >> You're receiving this email because of your account on gitlab.haskell.org. >> Manage all notifications >> · Help >> >> > _______________________________________________ > ghc-devs mailing list > ghc-devs at haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bryan at haskell.foundation Wed Jul 30 11:09:13 2025 From: bryan at haskell.foundation (Bryan Richter) Date: Wed, 30 Jul 2025 14:09:13 +0300 Subject: GHC/ | Failed pipeline for wip/T23109 | a90f83d4 In-Reply-To: References: <6889d2541ad9b_2a3739c5bfc342dc@nixos.mail> Message-ID: I'm loath to pause a lint runner, but I will do so in this case, to see if the problem is localized to that system. On Wed, 30 Jul 2025 at 14:08, Bryan Richter wrote: > It looks like most lint jobs are failing because they want to run on a > Docker image that is missing from the GitLab registry. I know Ben talked > about having some scheme for expiring images. I don't think that has been > put into place, however. > > There could also just be fallout from the server migration, even though > that took place a few months ago. I have seen more of these types of > failures since then. > > Since I haven't touched the ci-images repo recently, somebody else will > need to look at this. > > fwiw it does look like a primary victim is the "OpenCape lint runner", as > seen here: > https://grafana.gitlab.haskell.org/d/167r9v6nk/ci-spurious-failures?orgId=2&from=now-30d&to=now&var-types=$__all&var-runners=OpenCape%20lint%20runner&var-jobs=$__all&refresh=15m&timezone=browser > > On Wed, 30 Jul 2025 at 11:15, Simon Peyton Jones < > simon.peytonjones at gmail.com> wrote: > >> Dear GHC devs >> >> This mr !10479 seems Utterly Broken. Looks as if something is wrong >> with CI. Can anyone help? >> >> https://gitlab.haskell.org/ghc/ghc/-/merge_requests/10479 >> >> Thanks >> >> Simon >> >> On Wed, 30 Jul 2025 at 09:05, GitLab wrote: >> >>> [image: GitLab] >>> [image: ✖] Pipeline #113076 has failed!
>>> >>> Project Glasgow Haskell Compiler / GHC >>> >>> Branch >>> wip/T23109 >>> Commit >>> a90f83d4 >>> >>> in !10479 >>> Fix mergo bugs >>> Commit Author >>> Simon Peyton Jones >>> >>> Pipeline #113076 >>> triggered by Simon Peyton Jones >>> had 7 failed jobs >>> Failed jobs >>> [image: ✖] tool-lint lint-submods >>> >>> [image: ✖] tool-lint lint-linters >>> >>> [image: ✖] tool-lint typecheck-testsuite >>> >>> [image: ✖] packaging project-version >>> >>> [image: ✖] tool-lint lint-testsuite >>> >>> [image: ✖] tool-lint ghc-linters >>> >>> [image: ✖] tool-lint lint-author >>> >>> [image: GitLab] >>> You're receiving this email because of your account on >>> gitlab.haskell.org. Manage all notifications >>> · Help >>> >>> >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs at haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bryan at haskell.foundation Wed Jul 30 11:11:49 2025 From: bryan at haskell.foundation (Bryan Richter) Date: Wed, 30 Jul 2025 14:11:49 +0300 Subject: GHC/ | Failed pipeline for wip/T23109 | a90f83d4 In-Reply-To: References: <6889d2541ad9b_2a3739c5bfc342dc@nixos.mail> Message-ID: (Sorry for spamming emails) I should also mention that the large number of "spurious failures" on the linked dashboard is, itself, a spurious result. These errors are *not* spurious, but they are being treated as such. That causes up to 10 needless retries per failure. That's something I have on my list to fix. Still, even after dividing by 10, there are a lot of failures. On Wed, 30 Jul 2025 at 14:09, Bryan Richter wrote: > I'm loath to pause a lint runner, but I will do so in this case, to see > if the problem is localized to that system.
> > On Wed, 30 Jul 2025 at 14:08, Bryan Richter > wrote: > >> It looks like most lint jobs are failing because they want to run on a >> Docker image that is missing from the GitLab registry. I know Ben talked >> about having some scheme for expiring images. I don't think that has been >> put into place, however. >> >> There could also just be fallout from the server migration, even though >> that took place a few months ago. I have seen more of these types of >> failures since then. >> >> Since I haven't touched the ci-images repo recently, somebody else will >> need to look at this. >> >> fwiw it does look like a primary victim is the "OpenCape lint runner", as >> seen here: >> https://grafana.gitlab.haskell.org/d/167r9v6nk/ci-spurious-failures?orgId=2&from=now-30d&to=now&var-types=$__all&var-runners=OpenCape%20lint%20runner&var-jobs=$__all&refresh=15m&timezone=browser >> >> On Wed, 30 Jul 2025 at 11:15, Simon Peyton Jones < >> simon.peytonjones at gmail.com> wrote: >> >>> Dear GHC devs >>> >>> This mr !10479 seems Utterly Broken. Looks as if something is wrong >>> with CI. Can anyone help? >>> >>> https://gitlab.haskell.org/ghc/ghc/-/merge_requests/10479 >>> >>> Thanks >>> >>> Simon >>> >>> On Wed, 30 Jul 2025 at 09:05, GitLab wrote: >>> >>>> [image: GitLab] >>>> [image: ✖] Pipeline #113076 has failed! 
>>>> >>>> Project Glasgow Haskell Compiler / GHC >>>> >>>> Branch >>>> wip/T23109 >>>> Commit >>>> a90f83d4 >>>> >>>> in !10479 >>>> Fix mergo bugs >>>> Commit Author >>>> Simon Peyton Jones >>>> >>>> Pipeline #113076 >>>> triggered by Simon >>>> Peyton Jones >>>> had 7 failed jobs >>>> Failed jobs >>>> [image: ✖] tool-lint lint-submods >>>> >>>> [image: ✖] tool-lint lint-linters >>>> >>>> [image: ✖] tool-lint typecheck-testsuite >>>> >>>> [image: ✖] packaging project-version >>>> >>>> [image: ✖] tool-lint lint-testsuite >>>> >>>> [image: ✖] tool-lint ghc-linters >>>> >>>> [image: ✖] tool-lint lint-author >>>> >>>> [image: GitLab] >>>> You're receiving this email because of your account on >>>> gitlab.haskell.org. Manage all notifications >>>> · Help >>>> >>>> >>> _______________________________________________ >>> ghc-devs mailing list >>> ghc-devs at haskell.org >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From juhpetersen at gmail.com Wed Jul 30 11:21:32 2025 From: juhpetersen at gmail.com (Jens Petersen) Date: Wed, 30 Jul 2025 19:21:32 +0800 Subject: GHC 9.10.3-rc1 is now available In-Reply-To: References: Message-ID: Thanks! I did test builds for Fedora Rawhide and EPEL 10: https://koji.fedoraproject.org/koji/taskinfo?taskID=135425601 (quick rawhide) https://koji.fedoraproject.org/koji/taskinfo?taskID=135469895 (perf rawhide) https://koji.fedoraproject.org/koji/taskinfo?taskID=135480867 (quick epel10) If anyone wants to actually install and test them you can use `koji-tool install ` within 2 weeks. Jens -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From simon.peytonjones at gmail.com Thu Jul 31 14:57:10 2025 From: simon.peytonjones at gmail.com (Simon Peyton Jones) Date: Thu, 31 Jul 2025 15:57:10 +0100 Subject: GHC/ | Failed pipeline for wip/T23109 | a90f83d4 In-Reply-To: References: <6889d2541ad9b_2a3739c5bfc342dc@nixos.mail> Message-ID: It seems to have fixed itself. Works now. Simon On Wed, 30 Jul 2025 at 12:12, Bryan Richter wrote: > (Sorry for spamming emails) > > I should also mention that the large number of "spurious failures" on the > linked dashboard is, itself, a spurious result. These errors are *not* > spurious, but they are being treated as such. That causes up to 10 needless > retries per failure. That's something I have on my list to fix. > > Still, even after dividing by 10, there are a lot of failures. > > On Wed, 30 Jul 2025 at 14:09, Bryan Richter > wrote: > >> I'm loath to pause a lint runner, but I will do so in this case, to see >> if the problem is localized to that system. >> >> On Wed, 30 Jul 2025 at 14:08, Bryan Richter >> wrote: >> >>> It looks like most lint jobs are failing because they want to run on a >>> Docker image that is missing from the GitLab registry. I know Ben talked >>> about having some scheme for expiring images. I don't think that has been >>> put into place, however. >>> >>> There could also just be fallout from the server migration, even though >>> that took place a few months ago. I have seen more of these types of >>> failures since then. >>> >>> Since I haven't touched the ci-images repo recently, somebody else will >>> need to look at this.
>>> >>> fwiw it does look like a primary victim is the "OpenCape lint runner", >>> as seen here: >>> https://grafana.gitlab.haskell.org/d/167r9v6nk/ci-spurious-failures?orgId=2&from=now-30d&to=now&var-types=$__all&var-runners=OpenCape%20lint%20runner&var-jobs=$__all&refresh=15m&timezone=browser >>> >>> On Wed, 30 Jul 2025 at 11:15, Simon Peyton Jones < >>> simon.peytonjones at gmail.com> wrote: >>> >>>> Dear GHC devs >>>> >>>> This mr !10479 seems Utterly Broken. Looks as if something is wrong >>>> with CI. Can anyone help? >>>> >>>> https://gitlab.haskell.org/ghc/ghc/-/merge_requests/10479 >>>> >>>> Thanks >>>> >>>> Simon >>>> >>>> On Wed, 30 Jul 2025 at 09:05, GitLab wrote: >>>> >>>>> [image: GitLab] >>>>> [image: ✖] Pipeline #113076 has failed! >>>>> >>>>> Project Glasgow Haskell Compiler / GHC >>>>> >>>>> Branch >>>>> wip/T23109 >>>>> Commit >>>>> a90f83d4 >>>>> >>>>> in !10479 >>>>> Fix mergo bugs >>>>> Commit Author >>>>> Simon Peyton Jones >>>>> >>>>> Pipeline #113076 >>>>> triggered by Simon >>>>> Peyton Jones >>>>> had 7 failed jobs >>>>> Failed jobs >>>>> [image: ✖] tool-lint lint-submods >>>>> >>>>> [image: ✖] tool-lint lint-linters >>>>> >>>>> [image: ✖] tool-lint typecheck-testsuite >>>>> >>>>> [image: ✖] packaging project-version >>>>> >>>>> [image: ✖] tool-lint lint-testsuite >>>>> >>>>> [image: ✖] tool-lint ghc-linters >>>>> >>>>> [image: ✖] tool-lint lint-author >>>>> >>>>> [image: GitLab] >>>>> You're receiving this email because of your account on >>>>> gitlab.haskell.org. Manage all notifications >>>>> · Help >>>>> >>>>> >>>> _______________________________________________ >>>> ghc-devs mailing list >>>> ghc-devs at haskell.org >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: