Questions regarding the RISCV native codegen & performance

Sven Tennie sven.tennie at gmail.com
Fri Apr 18 12:41:28 UTC 2025


Hey Daniel 👋

Thanks a lot for your kind words.

The AArch64 ISA might also be a source of inspiration. AArch64 has some
combined instructions that RISC-V lacks, e.g. an ADD of two registers with
an included shift. However, we don't seem to use many of them, and I haven't
found any usage that wouldn't be well covered in the RISC-V NCG. That's
probably because MachOp (
https://hackage.haskell.org/package/ghc-9.12.1/docs/GHC-Cmm-MachOp.html#t:MachOp)
is pretty fine-grained.
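
To make the "ADD with an included shift" case concrete, here is a minimal
sketch in Python. It is not GHC code; the function names are made up for
this illustration, and registers are modelled as 64-bit integers with
wrap-around. AArch64 can express `add x0, x1, x2, lsl #3` as one
instruction, whereas base RV64I needs an `slli` followed by an `add`:

```python
MASK64 = (1 << 64) - 1  # model 64-bit registers with wrap-around


def aarch64_add_shifted(x1, x2, shift):
    """Models `add x0, x1, x2, lsl #shift` (a single instruction)."""
    return (x1 + ((x2 << shift) & MASK64)) & MASK64


def rv64_slli(rs1, shamt):
    """Models `slli rd, rs1, shamt`."""
    return (rs1 << shamt) & MASK64


def rv64_add(rs1, rs2):
    """Models `add rd, rs1, rs2`."""
    return (rs1 + rs2) & MASK64


def rv64_add_shifted(a1, a2, shift):
    """Two instructions: `slli t0, a2, shift` then `add a0, a1, t0`."""
    t0 = rv64_slli(a2, shift)
    return rv64_add(a1, t0)


# Typical use: base address plus index scaled by 8 (an array of 64-bit words).
print(hex(rv64_add_shifted(0x1000, 5, 3)))
```

Both variants compute the same value; the difference is only the
instruction count.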

A good candidate for investigation could be the CSET pseudo-instruction. I
stumbled over it while looking for pseudo-ops that lead to combined
instructions in the AArch64 NCG. The CSET pseudo-op leads to two instructions
in the RISC-V NCG and to one in the AArch64 NCG:
-
https://gitlab.haskell.org/ghc/ghc/-/blob/386f18548e3c66d04f648a9d34f167a086c1328b/compiler/GHC/CmmToAsm/AArch64/Ppr.hs#L443
-
https://gitlab.haskell.org/ghc/ghc/-/blob/386f18548e3c66d04f648a9d34f167a086c1328b/compiler/GHC/CmmToAsm/RV64/Ppr.hs#L554

Though, this might be a sub-optimal implementation (in this case we'd be
happy to get a ticket ;) ). As CSET is used for comparisons, it should
appear pretty frequently.
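
As a rough illustration of such a two-instruction pattern, the following
Python sketch models how an equality test can be materialised into a
register with base RV64I instructions: a `sub` followed by the standard
pseudo-instruction `seqz` (which is `sltiu rd, rs, 1`). The helper names
are hypothetical, registers are modelled as 64-bit integers, and this is
not meant to reproduce verbatim what the NCG emits for every condition:

```python
MASK64 = (1 << 64) - 1  # model 64-bit registers with wrap-around


def sub(rs1, rs2):
    """Models `sub rd, rs1, rs2`."""
    return (rs1 - rs2) & MASK64


def sltiu(rs1, imm):
    """Models `sltiu rd, rs1, imm`: the 12-bit immediate is sign-extended
    and then compared as an unsigned value."""
    assert -2048 <= imm <= 2047, "immediate must fit in 12 bits"
    return 1 if rs1 < (imm & MASK64) else 0


def seqz(rs):
    """Pseudo-instruction `seqz rd, rs`, i.e. `sltiu rd, rs, 1`."""
    return sltiu(rs, 1)


def cset_eq(l, r):
    """Two instructions: `sub d, l, r` then `seqz d, d`."""
    d = sub(l, r)
    return seqz(d)


print(cset_eq(7, 7), cset_eq(7, 8))
```

The result register ends up holding 1 if both operands are equal and 0
otherwise.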

A bit off-topic, but for the sake of completeness: The Compiler Explorer
seems to use DWARF symbols to map assembly instructions to Haskell code
lines. At least, it compiles with -g.

VELDT's profile is stated as "RV32I (no FENCE, ECALL, EBREAK)" on their
GitHub page. However, we target RV64G with both the NCG and the LLVM
backends. (The main reason not to support simpler profiles is that all
hardware on the market that is powerful enough to reasonably run Haskell
supports at least RV64G.)

Thanks for the hint about the J-extension. I will take a look at it.

Enjoy your weekend & best regards,

Sven

On Fri, 18 Apr 2025 at 12:13, Daniel Trujillo Viedma <
danihacker.viedma at gmail.com> wrote:

> Thank you so much for all the information and the help.
>
>
> Seriously, this is much more than I was hoping to get, even the suggestion
> for generating commented assembly code (which is, I assume, the method that
> Compiler Explorer uses to relate the high-level Haskell code with the
> assembly output of the compiler, which is really nice). And for your
> RISC-V NCG, which I found easy to understand.
>
>
> I guess this is the kind of professionals that Haskell attracts, which is
> a big part of why I love it.
>
>
> I will send here an executive summary of my findings, including statistics
> about a couple of programs that I try. I don't know if I'll be able to do a
> very statistically significant analysis, because I'll still have a lot of
> things to do (extend QEMU, maybe also gem5, and implement it in a Clash
> microprocessor design, probably VELDT), but maybe in the future I can
> automate more of it and run a more comprehensive analysis. FYI, I have
> found that the RISC-V specs mention Haskell among other languages in a
> still empty J extension section, which will be aimed at helping dynamically
> translated as well as garbage-collected languages, but I guess the RISC-V
> people are still more focused on other things and it will take some time to
> start work on that extension.
>
>
> I also find your suggestion for far-jumping very interesting, but I'm
> afraid that it will be very unpopular among hardware designers because it
> messes with their highly appreciated and scarce L1 cache. But funnily
> enough, I had the impression before starting this project that some kind of
> simple mechanism for complex jumping would be a good idea. I will keep this
> in mind when looking for patterns in the assembly code.
>
>
> Once more, thank you so much for your work and the help, and I hope I can
> soon deliver information that you all find interesting.
>
>
> Have a very nice weekend!
>
> Cheers,
>
> Dani.
>
>
> On 17/4/25 9:19, Sven Tennie wrote:
>
> Hey Daniel 👋
>
> That's really an interesting topic, because we never analyzed the emitted
> RISC-V assembly with statistical measures.
>
> So, if I may ask for a favour: If you spot anything that could be better
> expressed with the current ISA, please open a ticket and label it as
> RISC-V: https://gitlab.haskell.org/ghc/ghc/-/issues
> (We haven't decided which RISC-V profile to require. I.e. requiring the
> very latest extensions would frustrate people with older hardware...
> However, it's in any case good to have possible improvements documented in
> tickets.)
>
> I'm wondering if you really have to go through QEMU. Or, if feeding
> assembly code to a parser and then doing the math on that wouldn't be
> sufficient? (Of course, tracing the execution is more accurate. However,
> it's much more complicated as well.)
>
> To attribute assembly instructions to Cmm statements, you may use the GHC
> parameters -ddump-cmm and -dppr-debug (and, to write the output to files
> instead of stdout, -ddump-to-file). This will add comments for most Cmm
> statements to the dumped assembly code.
>
> At first, I thought that sign-extension / truncation might be a good
> candidate. However, it turned out that this is already covered by the
> RISC-V B-extension. Which led to this new ticket:
> https://gitlab.haskell.org/ghc/ghc/-/issues/25966
>
> Skimming over the NCG code and watching out for longer or repeating
> instruction lists might be a good strategy to make educated guesses.
>
> From a developer's perspective, I found the immediate sizes (usually
> 12 bits) rather limiting. E.g. the Note [RISCV64 far jumps] (
> https://gitlab.haskell.org/ghc/ghc/-/blob/395e0ad17c0d309637f079a05dbdc23e0d4188f6/compiler/GHC/CmmToAsm/RV64/CodeGen.hs?page=2#L1996)
> tells the story of how we had to work around this limit for addresses in
> conditional jumps.
>
> So, you could raise the question whether - analogous to compressed
> instructions - it wouldn't make sense to have extended instructions that
> cover two words, such that the first word is the instruction and the second
> its immediate(s). (Hardware designers would probably hate that, because it
> means a bigger change to the instruction decoding unit. However, I got
> asked as a software developer ;) )
>
> Other than that, I've unfortunately got no great ideas.
>
> Please feel free to keep us in the loop (especially regarding the results
> of your analyses.) And, if you've got any questions regarding the RISC-V
> NCG, please feel free to reach out either here or directly to me. There's
> also a #GHC "room" on Matrix where you can quickly drop smaller scoped
> questions.
>
> I hope that was of some help. Best regards,
>
> Sven
>
> On Wed, 16 Apr 2025 at 10:34, Matthew Pickering <
> matthewtpickering at gmail.com> wrote:
>
>> Hi Daniel. I think Sven Tennie and the LoongArch contributors are the
>> experts in NCG for these kinds of instruction sets. I have cced them.
>>
>> Cheers,
>>
>> Matt
>>
>> On Tue, Apr 15, 2025 at 5:40 PM Daniel Trujillo Viedma <
>> danihacker.viedma at gmail.com> wrote:
>>
>>> Hello, ghc-devs! My name is Daniel Trujillo, I'm a Haskell enthusiast
>>> from Spain and I'm trying to write my Master's thesis on accelerating
>>> Haskell programs with a custom ISA extension.
>>>
>>>
>>> Right now, my focus is on executing software written in Haskell within
>>> QEMU in order to get traces that tell me, basically, how many times
>>> each block (not exactly basic blocks, but sort of) of assembly code has
>>> been executed, with the hope of finding some patterns of RISCV
>>> instructions that I could implement together as one instruction.
>>>
>>>
>>> As you can see, my method is a bit crude, and I was wondering if the
>>> people involved with any of the different internal representations (STG,
>>> Cmm...) and/or native code generators (particularly RISCV) could provide
>>> me with hints about assembly instructions that would have made the work
>>> easier, by removing the need for "massaging" the Cmm code to make CodeGen
>>> easier, or the need for particular optimizations, or in general, dirty
>>> tricks due to a lack of proper support in the standard RISCV ISA.
>>>
>>>
>>> And of course, I would also very much appreciate other hints from people
>>> involved in general performance (as opposed to, for example, libraries
>>> for SIMD and parallel execution, or Haskell wrappers to lower-level code
>>> for performance reasons).
>>>
>>>
>>> P.S. I'm sorry if I broke any netiquette rule, but I'm very new to the
>>> email list, and haven't yet received any email from it.
>>>
>>>
>>> Looking forward to hearing from you!
>>>
>>> Cheers,
>>>
>>> Dani.