optimize floatscan downscaler to skip results that won't be needed

when upscaling, even the very last digit is needed in cases where the input is exact; no digits can be discarded. but when downscaling, any digits less significant than the mantissa bits are destined for the great bitbucket; the only influence they can have is their presence (being nonzero). thus, we simply throw them away early. the result is nearly a 4x performance improvement for processing huge values.

the particular threshold LD_B1B_DIG+3 is not chosen sharply; it's simply a "safe" distance past the significant bits. it would be nice to replace it with a sharp bound, but i suspect performance will be comparable (within a few percent) anyway.
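
a minimal sketch of why stopping early is safe when downscaling. this is not the floatscan code itself: it assumes a flat array of base-1000000000 limbs stored most significant first, rather than the circular buffer indexed through a, z and MASK, and it does not drop leading zero limbs. the names shift_right_limbs, LIMB_BASE and limit are made up for illustration. the point it shows is that when dividing by 2^sh the carry only flows from more significant limbs to less significant ones, so the leading limbs come out the same wherever the loop stops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical name: each limb holds 9 decimal digits */
#define LIMB_BASE 1000000000u

/*
 * divide the number held in x[0..n-1] (most significant limb first)
 * by 2^sh, visiting at most `limit` leading limbs. sh must be in 1..9
 * because 10^9 = 2^9 * 5^9, so LIMB_BASE>>sh is exact only up to 9.
 * returns the carry out of the last limb that was processed.
 */
static uint32_t shift_right_limbs(uint32_t *x, size_t n, int sh, size_t limit)
{
	uint32_t carry = 0;

	assert(sh >= 1 && sh <= 9);
	for (size_t i = 0; i < n && i < limit; i++) {
		/* bits shifted out of this limb, owed to the next one */
		uint32_t low = x[i] & ((1u << sh) - 1);
		x[i] = (x[i] >> sh) + carry;
		/* rescale the lost bits into units of the next limb */
		carry = (LIMB_BASE >> sh) * low;
	}
	return carry;
}
```

with limbs {1, 0} and sh=1, the call rewrites the array to {0, 500000000}, i.e. 10^9/2. calling it with limit smaller than n leaves the first limit limbs identical to what a full pass produces and the remaining limbs untouched, which is the property the patch relies on.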
author: Rich Felker <dalias@aerifal.cx> 2012-04-11 14:51:08 -0400
committer: Rich Felker <dalias@aerifal.cx> 2012-04-11 14:51:08 -0400
commit: 4054da9ba062c694dc4fde5c577fcb6da7743bc9 (patch)
tree: 9f0e3246c2ac8a76d0e0e568b543de385a138c15 /src/internal
parent: 5837a0bb6b5cf516f79527e837368af0b494d51a (diff)
download: musl-4054da9ba062c694dc4fde5c577fcb6da7743bc9.tar.gz
1 files changed, 3 insertions, 2 deletions
diff --git a/src/internal/floatscan.c b/src/internal/floatscan.c
index 6390d46a..b2313293 100644
--- a/src/internal/floatscan.c
+++ b/src/internal/floatscan.c
@@ -200,16 +200,17 @@ static long double decfloat(FILE *f, int c, int bits, int emin, int sign, int po
 		/* FIXME: find a way to compute optimal sh */
 		if (rp > 9+9*LD_B1B_DIG) sh = 9;
 		e2 += sh;
-		for (k=a; k!=z; k=(k+1 & MASK)) {
+		for (i=0; (k=(a+i & MASK))!=z && i<LD_B1B_DIG+3; i++) {
 			uint32_t tmp = x[k] & (1<<sh)-1;
 			x[k] = (x[k]>>sh) + carry;
 			carry = (1000000000>>sh) * tmp;
 			if (k==a && !x[k]) {
 				a = (a+1 & MASK);
+				i--;
 				rp -= 9;
 			}
 		}
-		if (carry) {
+		if (carry && k==z) {
 			if ((z+1 & MASK) != a) {
 				x[z] = carry;
 				z = (z+1 & MASK);