summaryrefslogtreecommitdiff
path: root/src/math/fma.c
AgeCommit message (Collapse)AuthorLines
2013-05-19math: add fma TODO comments about the underflow issueSzabolcs Nagy-0/+2
The underflow exception is not raised correctly in some cornercases (see previous fma commit), added comments with examples for fmaf, fmal and non-x86 fma. In fmaf store the result before returning so it has the correct precision when FLT_EVAL_METHOD!=0
2013-05-19math: fix two fma issues (only affects non-nearest rounding mode, x86)Szabolcs Nagy-4/+38
1) in downward rounding fma(1,1,-1) should be -0 but it was 0 with gcc, the code was correct but gcc does not support FENV_ACCESS ON so it used common subexpression elimination where it shouldn't have. now volatile memory access is used as a barrier after fesetround. 2) in directed rounding modes there is no double rounding issue so the complicated adjustments done for nearest rounding mode are not needed. the only exception to this rule is raising the underflow flag: assume "small" is an exactly representible subnormal value in double precision and "verysmall" is a much smaller value so that (long double)(small plus verysmall) == small then (double)(small plus verysmall) raises underflow because the result is an inexact subnormal, but (double)(long double)(small plus verysmall) does not because small is not a subnormal in long double precision and it is exact in double precision. now this problem is fixed by checking inexact using fenv when the result is subnormal
2012-11-13math: use '#pragma STDC FENV_ACCESS ON' when fenv is accessedSzabolcs Nagy-0/+2
2012-06-20math: fix fma bug on x86 (found by Bruno Haible with gnulib)nsz-2/+10
The long double adjustment was wrong: The usual check is mant_bits & 0x7ff == 0x400 before doing a mant_bits++ or mant_bits-- adjustment since this is the only case when rounding an inexact ld80 into double can go wrong. (only in nearest rounding mode) After such a check the ++ and -- is ok (the mantissa will end in 0x401 or 0x3ff). fma is a bit different (we need to add 3 numbers with correct rounding: hi_xy + lo_xy + z so we should survive two roundings at different places without precision loss) The adjustment in fma only checks for zero low bits mant_bits & 0x3ff == 0 this way the adjusted value is correct when rounded to double or *less* precision. (this is an important piece in the fma puzzle) Unfortunately in this case the -- is not a correct adjustment because mant_bits might underflow so further checks are needed and this was the source of the bug.
2012-03-19use scalbn or *2.0 instead of ldexp, fix fmalnsz-6/+6
Some code assumed ldexp(x, 1) is faster than 2.0*x, but ldexp is a wrapper around scalbn which uses multiplications inside, so this optimization is wrong. This commit also fixes fmal which accidentally used ldexp instead of ldexpl loosing precision. There are various additional changes from the work-in-progress const cleanups.
2012-03-19remove unnecessary TODO comments from fma.cnsz-5/+1
2012-03-19add fma implementation for x86nsz-7/+143
correctly rounded double precision fma using extended precision arithmetics for ld80 systems (x87)
2012-03-16make fma and lrint functions build without full fenv supportRich Felker-2/+12
this is necessary to support archs where fenv is incomplete or unavailable (presently arm). fma, fmal, and the lrint family should work perfectly fine with this change; fmaf is slightly broken with respect to rounding as it depends on non-default rounding modes to do its work.
2012-03-13first commit of the new libm!Rich Felker-0/+270
thanks to the hard work of Szabolcs Nagy (nsz), identifying the best (from correctness and license standpoint) implementations from freebsd and openbsd and cleaning them up! musl should now fully support c99 float and long double math functions, and has near-complete complex math support. tgmath should also work (fully on gcc-compatible compilers, and mostly on any c99 compiler). based largely on commit 0376d44a890fea261506f1fc63833e7a686dca19 from nsz's libm git repo, with some additions (dummy versions of a few missing long double complex functions, etc.) by me. various cleanups still need to be made, including re-adding (if they're correct) some asm functions that were dropped.