<div dir="ltr">You are overly pessimistic, and unnecessarily so. Let me show you some things you have probably not thought or heard about.<br><div><div class="gmail_extra"><br><div class="gmail_quote">2016-01-20 10:51 GMT+03:00 Joachim Durchholz <span dir="ltr">&lt;<a href="mailto:jo@durchholz.org" target="_blank">jo@durchholz.org</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="">On 19.01.2016 at 23:12, Henning Thielemann wrote:<br>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

Fortunately, there are<br>

processors that are designed for custom instruction set extensions:<br>

    <a href="https://en.wikipedia.org/wiki/Xtensa" rel="noreferrer" target="_blank">https://en.wikipedia.org/wiki/Xtensa</a><br>

</blockquote>

<br></span>

Unfortunately, the WP article does not say anything that couldn't be said about, say, an ARM core, other than that the Xtensa core is some VLIW design.<span class=""><br>

<br>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

Would it be sensible to create a processor based on such a design?<br>

</blockquote>

<br></span>

Very, very unlikely, for multiple reasons.<br>

<br>

Special-purpose CPUs have been built, most notably for LISP, less notably for Java, and probably for other purposes that I haven't heard of.<br>

Invariably, their architectural advantages were obsoleted by economies of scale: mainstream CPUs are produced in such huge numbers that Intel etc. could afford more engineers to optimize every nook and cranny, more engineers to optimize the structure downscaling, and larger fabs that could make more chips on equipment that is expensive up front but cheap per piece. In the end, the special-purpose chips were slower and more expensive. It's extremely strong competition you are facing if you try this.<br>

<br>

Also, it is very easy to misidentify the actual bottlenecks and make instructions for the wrong ones.<br>

If caching is the main bottleneck (which it usually is), no amount of CPU improvement will help you and you'll simply need a larger cache. Or, probably, a compiler that knows enough about the program and its data flow to arrange the data in a cache-line-friendly fashion.<br></blockquote><div><br></div><div>A demonstration from industry, albeit not quite the hardware industry:<br></div><div><br><a href="http://www.disneyanimation.com/technology/innovations/hyperion">http://www.disneyanimation.com/technology/innovations/hyperion</a> - "Hyperion handles several million light rays at a time by sorting and

bundling them together according to their directions. When the rays are 

grouped in this way, many of the rays in a bundle hit the same object in

 the same region of space. This similarity of ray hits then allows us – 

and the computer – to optimize the calculations for the objects hit."<br><br></div><div>Then, let me bring up an old idea of mine: <a href="https://mail.haskell.org/pipermail/haskell-cafe/2009-August/065327.html">https://mail.haskell.org/pipermail/haskell-cafe/2009-August/065327.html</a><br><br></div><div>Basically, we can group identical closures into vectors, ready for SIMD instructions to operate over them. The "vectors" should work just like Data.Vector.Unboxed: instead of a vector of tuples of arguments, there should be a tuple of vectors, one per argument (plus the results to update for lazy evaluation).<br><br></div><div>Combine this with sorting by address when references are involved, and you can get a lot of speedup by doing... not much.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

I do not think this is going to be a long-term problem though. Pure languages have huge advantages for fine-grained parallel processing, and CPU technology is pushing towards multiple cores, so that's a natural match. As pure languages come into more widespread use, the engineers at Intel, AMD etc. will look at what the pure languages need, and add optimizations for these.<br>

<br>

Just my 2 cents.<br>

Jo<div class=""><div class="h5"><br>

_______________________________________________<br>

Haskell-Cafe mailing list<br>

<a href="mailto:Haskell-Cafe@haskell.org" target="_blank">Haskell-Cafe@haskell.org</a><br>

<a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe" rel="noreferrer" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe</a><br>

</div></div></blockquote></div><br></div></div></div>
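<div dir="ltr"><br>To make the closure-grouping idea above a bit more concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the names Thunk, Op, Batch, batches and runBatch are hypothetical, an Op tag stands in for a closure's code pointer, and plain lists stand in for unboxed vectors. It only shows the tuple-of-vectors layout, not how a real runtime would do it.<br>
<pre>
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- Which code a suspended call would run. Hypothetical stand-in for a
-- closure's code pointer.
data Op = Add | Mul deriving (Eq, Ord, Show)

-- A suspended two-argument call: the code to run plus its arguments.
data Thunk = Thunk { code :: Op, argA :: Int, argB :: Int }

-- All pending thunks that share one Op, stored as a tuple of argument
-- columns instead of a list of argument tuples.
data Batch = Batch Op [Int] [Int]

-- Sort thunks by code, group equal codes, and transpose each group
-- into columns. sortOn is stable, so order within a group is kept.
batches :: [Thunk] -> [Batch]
batches ts = concatMap toBatch (groupBy ((==) `on` code) (sortOn code ts))
  where
    toBatch []        = []
    toBatch g@(t : _) = [Batch (code t) (map argA g) (map argB g)]

-- Evaluate a whole batch in one pass. With unboxed vectors this
-- zipWith is where a SIMD loop could go.
runBatch :: Batch -> [Int]
runBatch (Batch op xs ys) = zipWith (apply op) xs ys

apply :: Op -> Int -> Int -> Int
apply Add = (+)
apply Mul = (*)

main :: IO ()
main = mapM_ (print . runBatch) (batches pending)
  where
    pending = [Thunk Add 1 2, Thunk Mul 3 4, Thunk Add 5 6, Thunk Mul 7 8]
</pre>
Running main prints [3,11] and then [12,56]: the two Add thunks and the two Mul thunks each land in one batch, and each batch is evaluated in a single pass over its argument columns. The list each batch returns plays the role of the results that would be written back for lazy evaluation.<br></div>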