Questions regarding the RISCV native codegen & performance
Daniel Trujillo Viedma
danihacker.viedma at gmail.com
Sat Apr 19 19:23:52 UTC 2025
Thank you so much, again!
I can't overstate how useful all these recommendations are, and I'm
looking forward to seeing how they play out in simulations. So I guess
it's time for me to get to work!
Sincerely yours,
Dani.
On 18/4/25 14:41, Sven Tennie wrote:
> Hey Daniel 👋
>
> Thanks a lot for your kind words.
>
> The AArch64 ISA might also be a source of inspiration. AArch64 has
> some combined instructions that RISC-V lacks, e.g. ADD of two
> registers with an included shift. Though, we don't seem to use many of
> them and I haven't found any usage that wouldn't be well covered in
> RISC-V NCG. Probably, that's because MachOp
> (https://hackage.haskell.org/package/ghc-9.12.1/docs/GHC-Cmm-MachOp.html#t:MachOp)
> is pretty fine-grained.
>
> A good candidate for investigations could be the CSET
> pseudo-instruction. I stumbled over it while looking for pseudo-ops
> which lead to combined instructions in AArch64 NCG. The CSET pseudo-op
> leads to two instructions in RISC-V NCG and to one in AArch64 NCG:
> - https://gitlab.haskell.org/ghc/ghc/-/blob/386f18548e3c66d04f648a9d34f167a086c1328b/compiler/GHC/CmmToAsm/AArch64/Ppr.hs#L443
> - https://gitlab.haskell.org/ghc/ghc/-/blob/386f18548e3c66d04f648a9d34f167a086c1328b/compiler/GHC/CmmToAsm/RV64/Ppr.hs#L554
>
> Though, this might be a sub-optimal implementation (in this case we'd
> be happy to get a ticket ;) ). As CSET is used for comparisons, it
> should appear pretty frequently.
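To illustrate why a "set register from a comparison" operation can cost one or two instructions on RISC-V: the base ISA offers set-less-than (slt, sltu) but no direct set-equal or set-greater-or-equal, so those conditions need a second instruction. The following is a hypothetical Python model, not the code in Ppr.hs; the function name lower_cset and the condition names are assumptions made for this sketch.

```python
# Hypothetical model of lowering "rd = (rs1 <cond> rs2) ? 1 : 0" to RV64I.
# This is NOT GHC's implementation; names and condition codes are illustrative.

def lower_cset(cond, rd, rs1, rs2):
    """Return one possible RV64I sequence for a signed comparison."""
    if cond == "lt":   # rs1 < rs2 is directly supported by slt
        return [f"slt {rd}, {rs1}, {rs2}"]
    if cond == "gt":   # rs1 > rs2 is rs2 < rs1: swap the operands
        return [f"slt {rd}, {rs2}, {rs1}"]
    if cond == "ge":   # rs1 >= rs2 is not (rs1 < rs2): flip the low bit
        return [f"slt {rd}, {rs1}, {rs2}", f"xori {rd}, {rd}, 1"]
    if cond == "le":   # rs1 <= rs2 is not (rs2 < rs1)
        return [f"slt {rd}, {rs2}, {rs1}", f"xori {rd}, {rd}, 1"]
    if cond == "eq":   # xor is zero iff equal; seqz is a pseudo-op for sltiu rd, rs, 1
        return [f"xor {rd}, {rs1}, {rs2}", f"seqz {rd}, {rd}"]
    if cond == "ne":   # snez is a pseudo-op for sltu rd, x0, rs
        return [f"xor {rd}, {rs1}, {rs2}", f"snez {rd}, {rd}"]
    raise ValueError(f"unknown condition: {cond}")


if __name__ == "__main__":
    for c in ("lt", "gt", "ge", "le", "eq", "ne"):
        print(c, lower_cset(c, "a0", "a1", "a2"))
```

In this model only the strict less-than and greater-than cases fit in one instruction, which is one way to see where a fused compare-and-set could matter.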
>
> A bit off-topic, but for the sake of completeness: The Compiler
> Explorer seems to use DWARF symbols to map assembly instructions to
> Haskell code lines. At least, it compiles with -g.
>
> VELDT's profile is stated as "RV32I (no FENCE, ECALL, EBREAK)" on
> their GitHub page. But we target RV64G with both the NCG and LLVM
> backends. (The main reason for not supporting simpler profiles is that
> all hardware on the market that is powerful enough to reasonably run
> Haskell supports at least RV64G.)
>
> Thanks for the hint about the J-extension. I will take a look at it.
>
> Enjoy your weekend & best regards,
>
> Sven
>
> On Fri, 18 Apr 2025 at 12:13, Daniel Trujillo Viedma
> <danihacker.viedma at gmail.com> wrote:
>
> Thank you so much for all the information and the help.
>
>
> Seriously, this is much more than I was hoping to get, even the
> suggestion for generating commented assembly code (which is, I
> assume, the method that Compiler Explorer uses to relate the
> high-level Haskell code to the assembly output of the compiler,
> which is really nice). And for your RISC-V NCG, which I found
> easy to understand.
>
>
> I guess these are the kind of professionals that Haskell attracts,
> which is a big part of why I love it.
>
>
> I will send an executive summary of my findings here, including
> statistics about a couple of programs that I try out. I don't know if
> I'll be able to do a statistically significant analysis,
> because I still have a lot of things to do (extend QEMU,
> maybe also gem5, and implement it in a Clash microprocessor
> design, probably VELDT), but maybe in the future I can automate
> more of it and run a more comprehensive analysis. FYI, I have
> found that the RISC-V specs mention Haskell among other languages
> in a still-empty J extension section, which will be aimed at
> helping dynamically translated languages as well as
> garbage-collected ones, but I guess the RISC-V people are still more
> focused on other things and it will take some time for work to start
> on that extension.
>
>
> I also find your suggestion for far jumps very interesting, but
> I'm afraid it would be very unpopular among hardware designers
> because it messes with their highly appreciated and scarce L1
> cache. But funnily enough, I had the impression before starting
> this project that some kind of simple mechanism for complex
> jumps would be a good idea. I will keep this in mind when
> looking for patterns in the assembly code.
>
>
> Once more, thank you so much for your work and the help, and I
> hope I can soon deliver information that you all find
> interesting.
>
>
> Have a very nice weekend!
>
> Cheers,
>
> Dani.
>
>
> On 17/4/25 9:19, Sven Tennie wrote:
>> Hey Daniel 👋
>>
>> That's really an interesting topic, because we never analyzed the
>> emitted RISC-V assembly with statistical measures.
>>
>> So, if I may ask for a favour: If you spot anything that could be
>> better expressed with the current ISA, please open a ticket and
>> label it as RISC-V: https://gitlab.haskell.org/ghc/ghc/-/issues
>> (We haven't decided which RISC-V profile to require. I.e.
>> requiring the very latest extensions would frustrate people with
>> older hardware... However, it's good in any case to have possible
>> improvements documented in tickets.)
>>
>> I'm wondering if you really have to go through QEMU. Or, if
>> feeding assembly code to a parser and then doing the math on that
>> wouldn't be sufficient? (Of course, tracing the execution is more
>> accurate. However, it's much more complicated as well.)
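A minimal sketch of the "parse the assembly and do the math" approach, assuming GNU-style RISC-V assembly text with '#' comments, labels ending in ':' and directives starting with '.'. The function names are hypothetical, and the n-gram count deliberately ignores label and block boundaries, so it is only a first approximation.

```python
import re
from collections import Counter


def mnemonics(asm_text):
    """Extract instruction mnemonics, skipping comments, labels and directives."""
    result = []
    for line in asm_text.splitlines():
        line = line.split("#", 1)[0].strip()       # drop '#' comments
        line = re.sub(r"^[\w.$]+:\s*", "", line)   # drop a leading label
        if not line or line.startswith("."):       # blank line or directive
            continue
        result.append(line.split()[0].lower())
    return result


def ngram_counts(ops, n=2):
    """Count how often each run of n consecutive mnemonics occurs."""
    return Counter(tuple(ops[i:i + n]) for i in range(len(ops) - n + 1))


if __name__ == "__main__":
    sample = """
    .text
foo:
    slli a0, a0, 3   # scale index
    add  a0, a0, a1
    ld   a2, 0(a0)
.Lloop: slli a3, a3, 3
    add  a3, a3, a1
"""
    for pair, count in ngram_counts(mnemonics(sample)).most_common():
        print(count, " ".join(pair))
```

Such static counts say nothing about how often a sequence actually runs, which is the part that tracing would add.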
>>
>> To attribute assembly instructions to Cmm statements, you may use
>> the GHC flags -ddump-cmm and -dppr-debug (and -ddump-to-file to
>> write the dumps to files instead of stdout). This will add
>> comments for most Cmm statements to the dumped assembly code.
>>
>> At first, I thought that sign-extension / truncation might be a
>> good candidate. However, it turned out that this is already
>> covered by the RISC-V B-extension. Which led to this new ticket:
>> https://gitlab.haskell.org/ghc/ghc/-/issues/25966
>>
>> Skimming over the NCG code and watching out for longer or
>> repeating instruction lists might be a good strategy to make
>> educated guesses.
>>
>> From a developer's perspective, I found the immediate sizes
>> (usually 12-bit) rather limiting. E.g. the Note [RISCV64 far
>> jumps]
>> (https://gitlab.haskell.org/ghc/ghc/-/blob/395e0ad17c0d309637f079a05dbdc23e0d4188f6/compiler/GHC/CmmToAsm/RV64/CodeGen.hs?page=2#L1996)
>> tells the story of how we had to work around this limit for addresses
>> in conditional jumps.
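The 12-bit limit can be made concrete with a little arithmetic. The sketch below checks whether a constant fits a signed 12-bit immediate, splits a larger constant into the parts used by a lui/addi pair, and checks the range of a conditional branch offset (a 13-bit signed, even byte offset in the base ISA). All function names are hypothetical, and nothing here is taken from GHC's NCG.

```python
def fits_simm12(value):
    """True if value fits a signed 12-bit immediate (as used by addi, ld, ...)."""
    return -2048 <= value <= 2047


def split_hi20_lo12(value):
    """Split value so that (hi20 << 12) + lo12 == value with lo12 in simm12 range.

    Adding 0x800 before shifting compensates for addi sign-extending lo12.
    """
    hi20 = (value + 0x800) >> 12
    lo12 = value - (hi20 << 12)
    return hi20, lo12


def materialize(rd, value):
    """Possible RV64 sequence to load a constant (assumed range: see check)."""
    if fits_simm12(value):
        return [f"addi {rd}, x0, {value}"]
    if not -2**31 <= value < 2**31 - 2**11:
        raise ValueError("outside the range this lui/addi sketch handles")
    hi20, lo12 = split_hi20_lo12(value)
    seq = [f"lui {rd}, {hi20 & 0xFFFFF:#x}"]    # lui takes a 20-bit field
    if lo12 != 0:
        seq.append(f"addi {rd}, {rd}, {lo12}")
    return seq


def branch_in_range(offset):
    """True if a byte offset is encodable in a conditional branch (beq, bne, ...)."""
    return -4096 <= offset <= 4094 and offset % 2 == 0


if __name__ == "__main__":
    print(materialize("t0", 5))
    print(materialize("t0", 0x12FFF))
    print(branch_in_range(4094), branch_in_range(4096))
```

Any target further away than roughly 4 KiB therefore cannot be reached by a single conditional branch, which is the kind of limit the Note describes working around.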
>>
>> So, you could raise the question of whether - analogous to compressed
>> instructions - it would make sense to have extended instructions
>> that cover two words, such that the first word is the instruction
>> and the second its immediate(s). (Hardware designers would
>> probably hate that, because it means a bigger change to the
>> instruction decoding unit. However, I got asked as a software
>> developer ;) )
>>
>> Other than that, I've unfortunately got no great ideas.
>>
>> Please feel free to keep us in the loop (especially regarding the
>> results of your analyses.) And, if you've got any questions
>> regarding the RISC-V NCG, please feel free to reach out either
>> here or directly to me. There's also a #GHC "room" on Matrix
>> where you can quickly drop smaller-scoped questions.
>>
>> I hope that was of some help. Best regards,
>>
>> Sven
>>
>> On Wed, 16 Apr 2025 at 10:34, Matthew Pickering
>> <matthewtpickering at gmail.com> wrote:
>>
>> Hi Daniel. I think Sven Tennie and the LoongArch contributors
>> are the experts in NCG for these kinds of instruction sets. I
>> have CCed them.
>>
>> Cheers,
>>
>> Matt
>>
>> On Tue, Apr 15, 2025 at 5:40 PM Daniel Trujillo Viedma
>> <danihacker.viedma at gmail.com> wrote:
>>
>> Hello, ghc-devs! My name is Daniel Trujillo, I'm a Haskell enthusiast
>> from Spain and I'm trying to write my Master's thesis on accelerating
>> Haskell programs with a custom ISA extension.
>>
>>
>> Right now, my focus is on executing software written in Haskell within
>> QEMU in order to get traces that tell me, basically, how many times
>> each block (not exactly basic blocks, but sort of) of assembly code has
>> been executed, with the hope of finding some patterns of RISC-V
>> instructions that I could implement together as a single instruction.
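One way to turn per-block execution counts into candidate patterns is to weight each adjacent instruction pair by how often its block ran. The sketch below assumes the trace has already been reduced to two dictionaries (block name to mnemonic list, block name to execution count); that data layout and the function name are assumptions for illustration, not a QEMU output format.

```python
from collections import Counter


def weighted_pairs(blocks, exec_counts):
    """Sum, for every adjacent mnemonic pair, the execution counts of its blocks.

    blocks:      dict mapping block name -> list of mnemonics in program order
    exec_counts: dict mapping block name -> number of times the block executed
    """
    totals = Counter()
    for name, ops in blocks.items():
        weight = exec_counts.get(name, 0)   # blocks never seen count as 0
        for pair in zip(ops, ops[1:]):      # pairs never cross block boundaries
            totals[pair] += weight
    return totals


if __name__ == "__main__":
    blocks = {
        "b0": ["slli", "add", "ld"],
        "b1": ["ld", "addi", "sd"],
    }
    exec_counts = {"b0": 1000, "b1": 3}
    for pair, count in weighted_pairs(blocks, exec_counts).most_common():
        print(count, " ".join(pair))
```

A pair that is rare in the static code but sits in a hot block rises to the top here, which is the difference from counting the assembly text alone.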
>>
>>
>> As you can see, my method is a bit crude, and I was wondering if the
>> people involved with any of the different internal representations (STG,
>> Cmm...) and/or native code generators (particularly RISC-V) could give
>> me hints about assembly instructions that would have made the work
>> easier, by removing the need for "massaging" the Cmm code to make CodeGen
>> easier, or the need for particular optimizations or, in general, dirty
>> tricks because of a lack of proper support in the standard RISC-V ISA.
>>
>>
>> And of course, I would also very much appreciate other hints from people
>> involved in general performance (as opposed to, for example, libraries
>> for SIMD and parallel execution, or Haskell wrappers around lower-level
>> code for performance reasons).
>>
>>
>> P.S. I'm sorry if I broke any netiquette rule, but I'm very new to the
>> mailing list, and haven't yet received any email from it.
>>
>>
>> Looking forward to hearing from you!
>>
>> Cheers,
>>
>> Dani.
>>