<p dir="ltr">Hrm, now that I've thought about it a wee bit more, perhaps the rounding mode info needs to be attached to GHC threads; otherwise there will be some fun bugs in multithreaded code that uses multiple rounding modes. I'll do some investigation. </p>
<div class="gmail_quote">On May 5, 2015 8:16 AM, "Carter Schonwald" <<a href="mailto:carter.schonwald@gmail.com">carter.schonwald@gmail.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><p dir="ltr">To clarify: I think there's a bit of an open design question about how the explicitly moded API would look. I suspect it'll look somewhat like Ed's AD lib, and I think it should live in a userland library. </p>
<div class="gmail_quote">On May 5, 2015 7:40 AM, "Carter Schonwald" <<a href="mailto:carter.schonwald@gmail.com" target="_blank">carter.schonwald@gmail.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><p dir="ltr">Hey Levent,<br>
I actually looked into how to do rounding mode setting a while ago, and the conclusion I came to is that those can simply be FFI calls at the top level that do a sort of "with mode" bracketing. Or at least I'm not sure if setting the mode in an inner loop is a good idea. </p>
<p dir="ltr"> That said, you are making a valid point, and I will investigate to what extent compiler support is useful for the latter. If bracketed mode setting and unsetting has a small enough performance overhead, adding support in GHC primops would be worthwhile. Note that those primops would have to be modeled as doing something like IO or ST, so that the points where mode switches happen are predictable. Otherwise CSE and related optimizations could result in evaluating the same code in the wrong mode. I'll think through how that can be avoided, as I do have some ideas. </p>
<p dir="ltr">I suspect mode-switching code will wind up using newtype-wrapped Floats and Doubles that have a phantom index for the mode, and something like "runWithModeFoo :: Num a => Mode m -> (forall s. Moded s a) -> a" to make sure mode choices happen predictably. That said, there might be a better approach that we'll come to after some experimenting.</p>
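<p dir="ltr">A minimal userland sketch of how such a phantom-indexed API might look, in Haskell. Everything here is an assumption made for illustration: the names `Mode`, `Moded`, `lit`, `addM`, `roundTo`, `stepUp` and `runWithMode` are hypothetical, the mode is a plain value rather than a type-indexed `Mode m`, and directed rounding is emulated in pure code with exact `Rational` arithmetic (finite inputs only) instead of switching the hardware mode.</p>

```haskell
{-# LANGUAGE RankNTypes #-}

import GHC.Float (castDoubleToWord64, castWord64ToDouble)

-- Hypothetical rounding modes; a real API would likely cover every IEEE 754 mode.
data Mode = ToNearest | Upward | Downward
  deriving (Eq, Show)

-- A computation that reads the ambient rounding mode.
-- The phantom `s` ties values to one `runWithMode` region, in the style of `runST`.
newtype Moded s a = Moded { unModed :: Mode -> a }

-- Fix the mode once, at the boundary, so every operation inside sees the same one.
runWithMode :: Mode -> (forall s. Moded s a) -> a
runWithMode m c = unModed c m

-- Next representable Double above a finite x (bit-level step).
stepUp :: Double -> Double
stepUp x
  | x == 0    = castWord64ToDouble 1
  | x > 0     = castWord64ToDouble (castDoubleToWord64 x + 1)
  | otherwise = castWord64ToDouble (castDoubleToWord64 x - 1)

-- Round an exact value to a Double in the given mode.
-- Assumes `fromRational` rounds to nearest and that q lies within Double range.
roundTo :: Mode -> Rational -> Double
roundTo m q =
  case (m, compare (toRational nearest) q) of
    (Upward, LT)   -> stepUp nearest
    (Downward, GT) -> negate (stepUp (negate nearest))
    _              -> nearest
  where
    nearest = fromRational q

-- Lift a constant into a moded computation.
lit :: Double -> Moded s Double
lit x = Moded (const x)

-- Addition whose single rounding step follows the ambient mode.
addM :: Moded s Double -> Moded s Double -> Moded s Double
addM x y = Moded $ \m ->
  roundTo m (toRational (unModed x m) + toRational (unModed y m))

main :: IO ()
main = do
  print (runWithMode Downward  (addM (lit 0.1) (lit 0.2)))
  print (runWithMode ToNearest (addM (lit 0.1) (lit 0.2)))
  print (runWithMode Upward    (addM (lit 0.1) (lit 0.2)))
```

<p dir="ltr">The rank-2 type means a `Moded` value cannot leak out of `runWithMode` and later be evaluated under a different mode. A hardware-backed version would presumably replace `roundTo` with real mode switching, which is where the IO/ST-like sequencing concern comes in.</p>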
<div class="gmail_quote">On May 5, 2015 12:54 AM, "Levent Erkok" <<a href="mailto:erkokl@gmail.com" target="_blank">erkokl@gmail.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Carter: Wall of text is just fine!<div><br></div><div>I'm personally happy to see the results of your experiment. In particular, the better "code-generation" facilities you add around floats/doubles that map to the underlying hardware's native instructions, the better. When we do have proper IEEE floats, we shall surely need all that functionality.</div><div><br></div><div>While you're working on this, if you can also watch out for how rounding modes can be integrated into the operations, that would be useful as well. I can see at least two designs:</div><div><br></div><div> * One where the rounding mode goes with the operation: `fpAdd RoundNearestTiesToEven 2.5 6.4`. This is the "cleanest" and the functional solution, but could get quite verbose; and might be costly if the implementation changes the rounding-mode at every issue.</div><div> </div><div> * The other is where the operations simply assume the RoundNearestTiesToEven, but we have lifted IO versions that can be modified with a "with" like construct: `withRoundingMode RoundTowardsPositive $ fpAddRM 2.5 6.4`. Note that `fpAddRM` (*not* `fpAdd` as before) will have to return some sort of a monadic value (probably in the IO monad) since it'll need to access the rounding mode currently active.</div><div><br></div><div>Neither choice jumps out at me as the best one; and a hybrid might also be possible. 
I'd love to hear any insight you gain regarding rounding-modes during your experiment.</div><div><br></div><div>-Levent.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, May 4, 2015 at 7:54 PM, Carter Schonwald <span dir="ltr"><<a href="mailto:carter.schonwald@gmail.com" target="_blank">carter.schonwald@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Pardon the wall of text everyone, but I really want some FMA tooling :) </div><div><br></div>I am going to spend some time later this week and next adding FMA primops to GHC and playing around with different ways to add it to Num (which seems pretty straightforward, though I think we'd all agree it shouldn't be exported by Prelude). And then, depending on exactly how Yitzchak's reproposal of that goes (or some iteration thereof), we can get something useful/usable into 7.12.<div><br></div><div>I have codes (i.e. <b>dot products</b>!!!!!) that would benefit from a faster direct FMA for <b>exact numbers</b> and a higher-precision FMA for <b>approximate numbers </b>(<b>i.e. floating point</b>), and where I can't sanely use FMA if it lives anywhere but Num unless I rub Typeable everywhere and do runtime type checks for applicable floating point types, which kinda destroys parametricity in engineering nice things. </div><div><br></div><div>@levent: GHC doesn't do any optimization for floating point arithmetic (aside from 1-2 very simple things that are possibly questionable), and until GHC has support for precisely emulating high-precision floating point computation in a portable way, it probably won't do any interesting floating point optimization. Mandating that fma a b c === a*b+c for inexact number datatypes doesn't quite make sense to me. Relatedly, it's a GOOD thing GHC is conservative about optimizing floating point, because it makes doing correct stability analyses tractable! 
I look forward to the day that GHC gets a bit more sophisticated about optimizing floating point computation, but that day is still a ways off.</div><div><br></div><div>Relatedly: FMA for Float and Double is not generally going to be faster than the individual primitive operations, merely more accurate when used carefully. </div><div><br></div><div>Point being<b>, I'm +1 on adding some manner of FMA operations to Num</b> (the only sane place to put it where I can actually use it for a general-use library) and I don't really care if we name it fusedMultiplyAdd, multiplyAndAdd, accursedFusionOfSemiRingOperations, or fma. I'd favor "fusedMultiplyAdd" if we want a descriptive name that will be familiar to experts yet easy to google for the curious. </div><div><br></div><div>To repeat: I'm going to do some legwork so that the Double and Float prims are portably exposed by ghc-prim (I've spoken with several GHC devs about that, and they agree on its value, and that's a decision outside the scope of the libraries list's purview), and I do hope we can come to a consensus about putting it in Num so that expert library authors can upgrade the guarantees that they can provide end users without imposing any breaking changes on them.</div><div><br></div><div>A number of folks have brought up "but Num is broken" as a counterargument to adding FMA support to Num. I emphatically agree Num is borken :), BUT! I do also believe that fixing up the Num prelude carries the burden of providing a whole-cloth alternative design that we can get broad consensus/adoption with. That will happen by dint of actual experimentation and usage. </div><div><br></div><div>Point being, adding FMA doesn't further entrench current Num any more than it already is; it just provides expert library authors with a transparent way of improving the experience of their users with a free upgrade in answer accuracy if used carefully. 
Additionally, when Num's "semiring-ish equational laws" are framed with respect to approximate forwards/backwards stability, there is a perfectly reasonable law for FMA. I am happy to spend some time trying to write that up more precisely IFF that will tilt those in opposition to being in favor. </div><div><br></div><div>I don't need FMA to be exposed by <b>prelude/base</b>, merely by <b>GHC.Num</b> as a method therein for Num. If that constitutes a different and <b>more palatable proposal</b> than what people have articulated so far (by discouraging casual use by dint of hiding) then I am happy to kick off a new thread with that concrete design choice. </div><div><br></div><div>If there's a counterargument that's a bit more substantive than "Num is for exact arithmetic" or "Num is wrong" that will sway me to the other side, I'm all ears, but I'm skeptical of that. </div><div><br></div><div>I emphatically support those who are displeased with Num prototyping some alternative designs in userland. I do think it'd be great to figure out a new Num prelude we can migrate Haskell / GHC to over the next 2-5 years, but again any such proposal really needs to be realized whole cloth before it makes its way to being a libraries list proposal.</div><div><br></div><div><br></div><div>Again, pardon the wall of text, I just really want to have nice things :) </div><span><font color="#888888"><div>-Carter</div><div><br></div></font></span></div><div class="gmail_extra"><br><div class="gmail_quote"><div><div>On Mon, May 4, 2015 at 2:22 PM, Levent Erkok <span dir="ltr"><<a href="mailto:erkokl@gmail.com" target="_blank">erkokl@gmail.com</a>></span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><div dir="ltr">I think `mulAdd a b c` should be implemented as `a*b+c` even for Double/Float. It should only be an "optimization" (as in modular arithmetic), not a semantics-changing operation. 
Thus justifying the optimization.<div><br></div><div>"fma" should be the "more-precise" version available for Float/Double. I don't think it makes sense to have "fma" for other types. That's why I'm advocating "mulAdd" to be part of "Num" for optimization purposes; and "fma" reserved for true IEEE754 types and semantics.</div><div><br></div><div>I understand that Edward doesn't like this as this requires a different class; but really, that's the price to pay if we claim Haskell has proper support for IEEE754 semantics. (Which I think it should.) The operation is just different. It also should account for the rounding-modes properly.</div><div><br></div><div>I think we can pull this off just fine; and Haskell can really lead the pack here. The situation with floats is even worse in other languages. This is our chance to make a proper implementation, and we have the right tools to do so.</div><span><font color="#888888"><div><br></div><div>-Levent.</div></font></span></div><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, May 4, 2015 at 10:58 AM, Artyom <span dir="ltr"><<a href="mailto:yom@artyom.me" target="_blank">yom@artyom.me</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span>
On 05/04/2015 08:49 PM, Levent Erkok wrote:<br>
<blockquote type="cite">
<div dir="ltr">Artyom: That's precisely the point. The true
IEEE754 variants where precision does matter should be part of a
different class. What Edward and Yitz want is an "optimized"
multiply-add where the semantics is the same but one that goes
faster.</div>
</blockquote></span>
No, it looks to me that Edward wants to have a more precise
operation in Num:<span><br>
<blockquote type="cite">I'd have to make a second copy of the
function to even try to see the precision win.</blockquote></span>
Unless I'm wrong, you can't have the following things
simultaneously:<br>
<ol>
<li>the compiler is free to substitute <i>a+b*c</i> with <i>mulAdd
a b c</i></li>
<li><i>mulAdd a b c</i> is implemented as <i>fma</i> for Doubles
(and is more precise)</li>
<li>Num operations for Double (addition and multiplication) always
conform to IEEE754</li>
</ol>
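<p>To make the tension between points 2 and 3 concrete, here is a small Haskell sketch of one input where the separately rounded <i>a*b+c</i> and a fused result differ. It does not use a hardware FMA: `fusedViaRational` is a hypothetical helper that emulates the single rounding with exact `Rational` arithmetic, assuming finite inputs and that `fromRational` rounds to nearest.</p>

```haskell
-- Hypothetical helper: compute a*b+c exactly, then round once to Double.
-- This emulates the single rounding of a fused multiply-add for finite inputs.
fusedViaRational :: Double -> Double -> Double -> Double
fusedViaRational a b c =
  fromRational (toRational a * toRational b + toRational c)

main :: IO ()
main = do
  let a = 1 + 2 ^^ (-30) :: Double   -- exactly representable
      b = 1 - 2 ^^ (-30) :: Double   -- exactly representable
      c = -1 :: Double
  -- a*b is exactly 1 - 2^-60, which rounds to 1.0 before c is added.
  print (a * b + c)
  -- The fused version keeps the 2^-60 term because it rounds only once.
  print (fusedViaRational a b c)
  -- Check that the fused result is exactly -(2^-60).
  print (fusedViaRational a b c == negate (2 ^^ (-60)))
```

<p>If the compiler were free to rewrite the first expression into the second, the two results would agree, and Double arithmetic in Num would no longer round after every operation.</p>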
<p><span>
</span></p><blockquote type="cite">The true IEEE754 variants where precision
does matter should be part of a different class.</blockquote>
So, does it mean that you're fine with not having point #3 because
people who need it would be able to use a separate class for
IEEE754 floats?<br>
<p></p>
</div>
</blockquote></div><br></div>
</div></div><br></div></div><span>_______________________________________________<br>
Libraries mailing list<br>
<a href="mailto:Libraries@haskell.org" target="_blank">Libraries@haskell.org</a><br>
<a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries" target="_blank">http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries</a><br>
<br></span></blockquote></div><br></div>
</blockquote></div><br></div>
</blockquote></div>
</blockquote></div>
</blockquote></div>