<div dir="ltr"><div>Hey Daniel 👋</div><div><br></div><div>That's a really interesting topic, because we have never analyzed the emitted RISC-V assembly with statistical measures.</div><div><br></div><div>So, if I may ask for a favour: If you spot anything that could be better expressed with the current ISA, please open a ticket and label it as RISC-V: <a href="https://gitlab.haskell.org/ghc/ghc/-/issues">https://gitlab.haskell.org/ghc/ghc/-/issues</a> <br></div><div>(We haven't decided which RISC-V profile to require; requiring the very latest extensions would frustrate people with older hardware. In any case, it's good to have possible improvements documented in tickets.)<br></div><div><br></div><div>I'm wondering if you really have to go through QEMU, or whether feeding the assembly code to a parser and then doing the math on that would be sufficient. (Of course, tracing the execution is more accurate. However, it's much more complicated as well.)</div><div><br></div><div>To attribute assembly instructions to Cmm statements, you may use the GHC flags -ddump-cmm and -dppr-debug (and, to write this into files instead of stdout, -ddump-to-file). This will add comments for most Cmm statements into the dumped assembly code.</div><div><br></div><div>At first, I thought that sign-extension / truncation might be a good candidate. However, it turned out that this is already covered by the RISC-V B-extension, which led to this new ticket: <a href="https://gitlab.haskell.org/ghc/ghc/-/issues/25966">https://gitlab.haskell.org/ghc/ghc/-/issues/25966</a></div><div><br></div><div>Skimming over the NCG code and watching out for longer or repeating instruction lists might be a good strategy to make educated guesses.</div><div><br></div><div><span style="font-family:arial,sans-serif">From a developer's perspective, I found the immediate sizes (usually 12 bits) rather limiting. E.g. 
the </span>Note [RISCV64 far jumps] (<a href="https://gitlab.haskell.org/ghc/ghc/-/blob/395e0ad17c0d309637f079a05dbdc23e0d4188f6/compiler/GHC/CmmToAsm/RV64/CodeGen.hs?page=2#L1996">https://gitlab.haskell.org/ghc/ghc/-/blob/395e0ad17c0d309637f079a05dbdc23e0d4188f6/compiler/GHC/CmmToAsm/RV64/CodeGen.hs?page=2#L1996</a>) tells the story of how we had to work around this limit for addresses in conditional jumps.</div><div><br></div><div>So, you could raise the question of whether, analogous to compressed instructions, it would make sense to have extended instructions that cover two words, such that the first word is the instruction and the second its immediate(s). (Hardware designers would probably hate that, because it means a bigger change to the instruction decoding unit. However, I was asked as a software developer ;) )</div><div><br></div><div>Other than that, I've unfortunately got no great ideas.</div><div><br></div><div>Please feel free to keep us in the loop (especially regarding the results of your analyses). And, if you've got any questions regarding the RISC-V NCG, please feel free to reach out either here or directly to me. There's also a #GHC "room" on Matrix where you can quickly drop smaller-scoped questions.</div><div><br></div><div>I hope that was of some help. Best regards,</div><div><br></div><div>Sven<br></div></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Wed, Apr 16, 2025 at 10:34 AM Matthew Pickering <<a href="mailto:matthewtpickering@gmail.com">matthewtpickering@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Hi Daniel. I think Sven Tennie and the LoongArch contributors are the experts in NCG for these kinds of instruction sets. 
I have cced them.</div><div><br></div><div>Cheers,</div><div><br></div><div>Matt</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Apr 15, 2025 at 5:40 PM Daniel Trujillo Viedma <<a href="mailto:danihacker.viedma@gmail.com" target="_blank">danihacker.viedma@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello, ghc-devs! My name is Daniel Trujillo, I'm a Haskell enthusiast <br>
from Spain and I'm writing my Master's thesis about accelerating <br>
Haskell programs with a custom ISA extension.<br>
<br>
<br>
Right now, my focus is on executing software written in Haskell within <br>
QEMU in order to get traces that tell me, basically, how many times <br>
each block (not exactly basic blocks, but sort of) of assembly code has <br>
been executed, with the hope of finding some patterns of RISCV <br>
instructions that I could implement together as one instruction.<br>
<br>
<br>
As you can see, my method is a bit crude, and I was wondering if the <br>
people involved with any of the different internal representations (STG, <br>
Cmm...) and/or native code generators (particularly RISCV) could provide <br>
me hints about assembly instructions that would have made the work <br>
easier, by removing the need for "massaging" the Cmm code to make CodeGen <br>
easier, or the need for particular optimizations, or in general, dirty <br>
tricks caused by the lack of proper support in the standard RISCV ISA.<br>
<br>
<br>
And of course, I would also very much appreciate other hints from people <br>
involved in general performance (as opposed to, for example, libraries <br>
for SIMD and parallel execution, or Haskell wrappers to lower-level code <br>
for performance reasons).<br>
<br>
<br>
P.S. I'm sorry if I broke any netiquette rule, but I'm very new to the <br>
email list, and haven't yet received any email from it.<br>
<br>
<br>
Looking forward to hearing from you!<br>
<br>
Cheers,<br>
<br>
Dani.<br>
<br>
_______________________________________________<br>
ghc-devs mailing list<br>
<a href="mailto:ghc-devs@haskell.org" target="_blank">ghc-devs@haskell.org</a><br>
<a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs" rel="noreferrer" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a><br>
</blockquote></div>
</blockquote></div>
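<div><br></div><div>A minimal sketch of the "feed the assembly to a parser and do the math on that" idea from the reply above. Everything in it is an assumption made for illustration: the function names are hypothetical, the sample input is made up (it is not real GHC output), comments are assumed to start with '#', and blocks are assumed to be delimited by lines that consist only of a label. It counts static occurrences of instruction n-grams within such blocks; it does not weight them by how often a block is executed.</div>

```python
from collections import Counter

# Made-up sample input for illustration only; not real GHC output.
SAMPLE = """
.Lblock1:
    slli t0, a0, 48   # shift left
    srai t0, t0, 48
    add  a1, a1, t0
.Lblock2:
    slli t1, a2, 48
    srai t1, t1, 48
    .align 3
    ret
"""


def mnemonics_per_block(asm_text):
    """Split assembly text into label-delimited lists of instruction mnemonics."""
    blocks, current = [], []
    for raw in asm_text.splitlines():
        # Assumption: '#' starts a comment that runs to the end of the line.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line.endswith(":"):
            # Assumption: a label sits on its own line and starts a new block.
            if current:
                blocks.append(current)
            current = []
            continue
        if line.startswith("."):
            # Assembler directive (e.g. .align), not an instruction.
            continue
        current.append(line.split()[0])
    if current:
        blocks.append(current)
    return blocks


def ngram_counts(blocks, n):
    """Count every run of n consecutive mnemonics that stays inside one block."""
    counts = Counter()
    for block in blocks:
        for i in range(len(block) - n + 1):
            counts[tuple(block[i:i + n])] += 1
    return counts


if __name__ == "__main__":
    pairs = ngram_counts(mnemonics_per_block(SAMPLE), 2)
    for gram, count in pairs.most_common(3):
        print(count, " ".join(gram))
```

<div>Multiplying each block's n-grams by a per-block execution count (for example, taken from a QEMU trace) would turn these static counts into dynamic ones; that step is left out here.</div>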