Data inspector: round floats to precision/significant digits

Wishlists for new functionality and features.
Maël
Site Admin
Posts: 881
Joined: 12 Mar 2005 14:15

Data inspector: round floats to precision/significant digits

Post by Maël » 21 Jul 2018 02:29

The basis of this feature request was a mail sent to me:
One comment I had was that the single-precision floating point values are displayed with too many significant digits. For example, a value might be shown as "-0.0149999996647239", but really it only has 6 significant digits and should be shown as -0.015. (Of course, floats are not exact in many circumstances, so it would not be surprising to see "-0.0149998" or something for a different value).

Maybe you could implement a feature to shorten the representation of the single-precision float type, so that unnecessary digits are not shown?

Maël
Site Admin
Posts: 881
Joined: 12 Mar 2005 14:15

Re: Data inspector: round floats to precision/significant digits

Post by Maël » 21 Jul 2018 02:54

Floating point numbers are tricky. Interestingly, I have recently been writing toy programs to convert floats to binary floating-point strings (not decimal strings) and back to their IEEE single- or double-precision formats.
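
For illustration, a minimal sketch of one direction of such a conversion (not the actual toy program): it prints the sign, exponent and significand bit fields of a single-precision float in the same layout as the worked example further below.

Code: Select all

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Print the IEEE 754 bit fields of a single-precision float as
   "sign exponent significand", e.g. "0 01111111 00000000000000000000001". */
static void print_float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);       /* reinterpret the float's bytes as an integer */

    putchar('0' + ((bits >> 31) & 1));    /* 1 sign bit */
    putchar(' ');
    for (int i = 30; i >= 23; i--)        /* 8 exponent bits */
        putchar('0' + ((bits >> i) & 1));
    putchar(' ');
    for (int i = 22; i >= 0; i--)         /* 23 significand bits */
        putchar('0' + ((bits >> i) & 1));
    putchar('\n');
}

int main(void) {
    print_float_bits(1.0f);               /* 0 01111111 00000000000000000000000 */
    print_float_bits(-0.015f);            /* the value from the feature request */
    return 0;
}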

Doing this correctly requires more research; for now I'll rely on Delphi's built-in StrToFloat/FloatToStr functions.

Some notes follow.


Several factors have to be weighed when deciding how many digits to display.

One major question is whether round-trip conversion should merely not introduce errors (i.e., x==StrToFloat(FloatToStr(x)) should be true), with digits that do not affect this result truncated or rounded off.
Or should the decimal representation that most closely matches the binary float be shown, even if the extra digits do not affect round-trip conversion?

Regarding the first option, Wikipedia (or the referenced paper) claims that:
"If an IEEE 754 single-precision number is converted to a decimal string with at least 9 significant digits, and then converted back to single-precision representation, the final result must match the original number.[5]"
This means: x==StrToFloat(FloatToStr(x)) is true if FloatToStr(x) has at least 9 significant digits.

This site states other values:
https://www.exploringbinary.com/decimal ... t-numbers/

But it also states that for the other round-trip direction only 6 significant digits need to be considered:
If a decimal string with at most 6 significant digits is converted to IEEE 754 single-precision representation, and then converted back to a decimal string with the same number of digits, the final result should match the original string.
This means: x==FloatToStr(StrToFloat(x)) is true if x has at most 6 significant digits and the right-hand side (the result of the conversion) is also rendered with at most 6 significant digits.
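
To make this concrete, here is a small sketch in C that checks both directions (printf/strtof stand in for Delphi's FloatToStr/StrToFloat; this is only an illustration, not the planned implementation):

Code: Select all

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

int main(void) {
    /* Direction 1: float -> string with 9 significant digits -> float
       must give back the identical bit pattern. */
    float x = -0.0149999996647239f;            /* value from the feature request */
    char buf[64];
    snprintf(buf, sizeof buf, "%.9g", x);      /* 9 significant digits */
    float y = strtof(buf, NULL);
    uint32_t bx, by;
    memcpy(&bx, &x, 4);
    memcpy(&by, &y, 4);
    printf("%s round-trips: %s\n", buf, bx == by ? "yes" : "no");

    /* Direction 2: a decimal string with at most 6 significant digits
       -> float -> string with 6 significant digits must match the original. */
    const char *s = "-0.015";
    snprintf(buf, sizeof buf, "%.6g", strtof(s, NULL));
    printf("%s -> %s: %s\n", s, buf, strcmp(s, buf) == 0 ? "match" : "differ");

    return 0;
}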


An example to show the difference between the accurate representation and round-trip data retention:

Consider the value 0x3F800001 which is an IEEE 754 encoded single precision float.

It corresponds to this binary number:
0 01111111 00000000000000000000001

The sign bit is 0, and the exponent field is 01111111 = 127. Since the exponent is biased by 127 in the single-precision float format, the actual exponent is 127-127 = 0.

Now on to the third field, the significand.
Only the rightmost bit (bit 0) of the significand is set:
bit 0 = 2^-23 * 1 = 0.00000011920928955078125 (exactly)

bit 23 = 2^0 * 1 (implicitly set to 1 for normalized representation)

So the accurate number would be
(1 + 0.00000011920928955078125) * 2^exponent =
1.00000011920928955078125 * 2^0 =
1.00000011920928955078125

Rounding to just 6 digits would make it indistinguishable from 1.0.

As quoted above, Wikipedia (or the referenced paper) claims that:
"If an IEEE 754 single-precision number is converted to a decimal string with at least 9 significant digits, and then converted back to single-precision representation, the final result must match the original number.[5]"

Indeed, when you round "1.00000011920928955078125" to 9 significant digits (1.00000012), it converts to the single-precision float format and back to a decimal string correctly.
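
This can be checked quickly in C (again just an illustration): the bit pattern is reinterpreted as a float and printed with 6 and 9 significant digits, then parsed back.

Code: Select all

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

int main(void) {
    uint32_t rep = 0x3f800001u;                /* 1.00000011920928955078125 */
    float f;
    memcpy(&f, &rep, 4);

    char six[32], nine[32];
    snprintf(six,  sizeof six,  "%.6g", f);    /* 6 significant digits */
    snprintf(nine, sizeof nine, "%.9g", f);    /* 9 significant digits */

    float f6 = strtof(six, NULL), f9 = strtof(nine, NULL);
    uint32_t back6, back9;
    memcpy(&back6, &f6, 4);
    memcpy(&back9, &f9, 4);

    /* Expected: "1" parses back to 0x3f800000, "1.00000012" back to 0x3f800001. */
    printf("%-12s -> 0x%08x\n", six,  back6);
    printf("%-12s -> 0x%08x\n", nine, back9);
    return 0;
}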

Some references:
https://www.exploringbinary.com/decimal ... t-numbers/
https://www.exploringbinary.com/maximum ... nt-numbers
https://en.wikipedia.org/wiki/Single-pr ... int_format
Lecture Notes on the Status of IEEE Standard 754 for Binary Floating-Point Arithmetic (page 4)
https://stackoverflow.com/questions/509 ... -to-string
https://github.com/JackTrapper/Exact-Fl ... g-Routines
https://github.com/rkennedy/exact-float
John Herbster's ExactFloatToStr(x: Extended)
Other useful contributions by John Herbster: https://cc.embarcadero.com/Author/358

https://stackoverflow.com/questions/302 ... r-the-hood

Best explanation and summary:
A good explanation with solid research and a literature review (4 papers) on printing floating-point numbers, including the reference functions written by David Gay: http://www.ryanjuckett.com/programming/ ... t-numbers/

Two other relevant papers (apparently discussed in the link above):
https://www.cs.indiana.edu/~dyb/pubs/FP ... PLDI96.pdf
Most recent (2010):
https://www.cs.tufts.edu/~nr/cs257/arch ... printf.pdf

Another more recent option used in Swift:
https://github.com/google/double-conversion/issues/27

nneonneo
Posts: 3
Joined: 21 Jul 2018 03:37

Re: Data inspector: round floats to precision/significant digits

Post by nneonneo » 21 Jul 2018 03:46

I'm the one who posted the original message.

I think 6 digits was too low in my initial message; with the implicit leading 1 the significand has 24 bits of precision, which is roughly 24*log10(2) ≈ 7.2 decimal digits, so about 8 digits are needed to display a value accurately, and one more to guarantee an exact round trip. But this is only an estimate.
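
For reference, a quick sanity check of that estimate using the standard digit-count formulas (⌈p*log10(2)⌉ + 1 decimal digits guarantee an exact binary->decimal->binary round trip for a p-bit significand; ⌊(p-1)*log10(2)⌋ digits survive the opposite direction):

Code: Select all

#include <stdio.h>
#include <math.h>

int main(void) {
    int p = 24;  /* 23 stored significand bits + 1 implicit bit (IEEE 754 single) */
    printf("digits needed for binary->decimal->binary round trip: %d\n",
           (int)ceil(p * log10(2.0)) + 1);          /* 9 */
    printf("digits preserved by decimal->binary->decimal:         %d\n",
           (int)floor((p - 1) * log10(2.0)));       /* 6 */
    return 0;
}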

I found a library that implements exact round-trip float<->string conversions with proper rounding and minimal representation length: https://github.com/jwiegley/gdtoa (mirrored from the original gdtoa at http://www.netlib.org/fp/). It's written by the same guy (David M. Gay) who implemented the famous "dtoa" algorithm used by many systems for printing floats and doubles (for example, the Python programming language uses it to print its double-precision floats with the minimal possible representation).

For your sample input, g_ffmt prints 1.0000001, which does indeed return 0x3f800001 when parsed with the provided "strtof" function (and also when using C's `sscanf` with "%f").

I hacked up a quick test program that also serves to demonstrate how to use g_ffmt and strtof:

Code: Select all

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>  /* strtof */
#include <string.h>
#include "gdtoa.h"

float reinterpret_int(uint32_t val) {
    float res;
    memcpy(&res, &val, 4);
    return res;
}

uint32_t reinterpret_float(float val) {
    uint32_t res;
    memcpy(&res, &val, 4);
    return res;
}

/* Format f with gdtoa's g_ffmt; ndig = 0 requests the shortest
   decimal string that converts back to exactly the same float. */
int ffmt(char *buf, unsigned bufsize, float f) {
    if(g_ffmt(buf, &f, 0, bufsize) == NULL) {
        return -1;
    }
    return 0;
}

/* Round-trip one bit pattern: reinterpret as float, format with g_ffmt,
   then parse back with both strtof and sscanf and print the recovered bits. */
void test(uint32_t rep1) {
    float f1 = reinterpret_int(rep1);
    char buf[32];
    ffmt(buf, sizeof(buf), f1);
    float f2 = strtof(buf, NULL);
    float f3;
    sscanf(buf, "%f", &f3);
    uint32_t rep2 = reinterpret_float(f2);
    uint32_t rep3 = reinterpret_float(f3);
    printf("0x%08x %s 0x%08x 0x%08x\n", rep1, buf, rep2, rep3);
}

int main() {
    test(0x3f800000u);
    test(0x3f800001u);
    test(0x3f800002u);
    test(0x58aaaaaau);
}
`gdtoa` can also be used with doubles (use g_dfmt/strtod), and will produce "minimal" representations there as well.
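
For example, a double variant of the test above might look like this (just a sketch; it assumes g_dfmt takes the same arguments as g_ffmt, with a double* in place of the float*):

Code: Select all

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include "gdtoa.h"

int main(void) {
    /* 0x3FF0000000000001 is the double just above 1.0 (1 + 2^-52). */
    uint64_t bits = 0x3ff0000000000001ull;
    double d;
    memcpy(&d, &bits, 8);

    char buf[64];
    /* ndig = 0: shortest decimal string that converts back to the same double
       (assuming g_dfmt behaves like g_ffmt above). */
    if (g_dfmt(buf, &d, 0, sizeof(buf)) != NULL) {
        double back = strtod(buf, NULL);
        uint64_t rep;
        memcpy(&rep, &back, 8);
        printf("%s -> 0x%016llx\n", buf, (unsigned long long)rep);
    }
    return 0;
}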

Maël
Site Admin
Posts: 881
Joined: 12 Mar 2005 14:15

Re: Data inspector: round floats to precision/significant digits

Post by Maël » 21 Jul 2018 10:26

Thanks for your feedback, I added some additional references.
