<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On 20 Jan 2016, at 9:12 am, Henning Thielemann &lt;<a href="mailto:lemming@henning-thielemann.de" class="">lemming@henning-thielemann.de</a>&gt; wrote:</div></blockquote><br class=""><blockquote type="cite" class=""><div class="">I got to know that in todays x86 processors you can alter the instruction set, which is mainly used for bugfixes. Wouldn't it be interesting to add some instructions for Haskell support? However, I suspect that such a patch might be rendered invalid by new processor generations with changed internal details. Fortunately, there are processors that are designed for custom instruction set extensions:<br class=""> <a href="https://en.wikipedia.org/wiki/Xtensa" class="">https://en.wikipedia.org/wiki/Xtensa</a><br class=""></div></blockquote></div><div class=""><br class=""></div><div class="">Your post assumes that the time to fetch/decode the instruction stream is a bottleneck, and that reducing the number of instructions will somehow make the program faster.</div><div class=""><br class=""></div><div class="">Your typical lazy GHC-compiled program spends much of its time building thunks and otherwise copying data between the stack and the heap. 
If it’s blocked waiting on data memory (a data cache miss), then reducing the number of instructions won’t help, at least not if the fancy new instructions just tell the processor to do something that would lead to a cache miss anyway.</div><div class=""><br class=""></div><div class="">See: Cache Performance of Lazy Functional Programs on Current Hardware (from 2009)</div><div class="">Arbob Ahmad and Henry DeYoung</div><div class=""><a href="http://www.cs.cmu.edu/~hdeyoung/15740/report.pdf" class="">http://www.cs.cmu.edu/~hdeyoung/15740/report.pdf</a></div><div class=""><br class=""></div><div class="">Indirect branches (load an address from data memory, then jump to it) are also a problem, as branch predictors often cannot predict them well. Slowdowns due to mispredicted branches could perhaps be mitigated by improving the branch predictor in a Haskell-specific way, but you might not need new instructions to do so.</div><div class=""><br class=""></div><div class="">Or, to put it another way: “If you tell a long story with fewer words, then it’s still a long story.”</div><div class=""><br class=""></div><div class="">Ben.</div><div class=""><br class=""></div></body></html>