Floating point comparisons don't work. Don't even attempt them

12. April 2012 22:23

 

This started because somebody discovered an issue with PHP, which turns out to also pop up in other languages like JavaScript and of course Python. These magic numbers happen to be an edge case for a double precision floating point number, so it actually happens in every language that uses them. Simply put, the number is large enough that the least significant digits start getting dropped.

 

One of these numbers happens to be 9223372036854775807.0

 

Here is an example of what the problem is.

 

>>> 9223372036854775807 == 9223372036854775808
False
>>> 9223372036854775807.0 == 9223372036854775808
True

 

Obviously you would think that the second comparison should also be False. However, as floating point numbers they are actually converted to the same value, so of course they compare as equal. We can show this by doing the following.

 

>>> a = 9223372036854775807.0
>>> b = 9223372036854775808.0
>>> a == b
True
>>> print a
9.22337203685e+18
>>> print b
9.22337203685e+18
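
The printed values above are rounded for display, so they don't quite prove the two floats are identical. As a rough sketch, the raw 64 bits can be dumped with the standard struct module. The helper name double_bits is just something made up for this example.

```python
import struct

def double_bits(x):
    """Return the 64-bit IEEE 754 pattern of x as 16 hex digits (hypothetical helper)."""
    # Pack as a big-endian double, then reinterpret the same 8 bytes as an unsigned integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return "%016x" % bits

print(double_bits(9223372036854775807.0))  # 43e0000000000000
print(double_bits(9223372036854775808.0))  # 43e0000000000000
```

Both literals end up as the same bit pattern, which is 2 to the power 63 with an all-zero mantissa.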

 

As you can see the numbers are actually the same. Other ways of comparing them break too, for example when forcing the type to an int like this.

 

 

>>> int(9223372036854775807) == int(9223372036854775808)
False
>>> int(9223372036854775807.0) == int(9223372036854775808)
True
>>> int(9223372036854775807.0) == 9223372036854775808
True
>>> 9223372036854775807.0 == int(9223372036854775808)
True

Printing the converted values shows what is going on:

 

 

>>> print int(9223372036854775807)
9223372036854775807
>>> print int(9223372036854775807.0)
9223372036854775808

 

 

However this particular problem does not apply to Python alone. It applies to anything that uses the standard IEEE 754 64-bit floating point format, since it is impossible to represent the number 9223372036854775807.0 exactly, so it gets rounded to the nearest floating point number, which happens to be 9223372036854775808.
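
To get a feel for how coarse the grid of representable numbers is at this size, here is a small sketch that steps the bit pattern down and up by one to find the neighbouring doubles. The function name neighbours is made up for this example, and the bit trick as written only holds for positive, finite values.

```python
import struct

def neighbours(x):
    """Return the doubles just below and just above a positive finite x (hypothetical helper)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    # For positive finite doubles, adjacent bit patterns are adjacent values.
    (below,) = struct.unpack(">d", struct.pack(">Q", bits - 1))
    (above,) = struct.unpack(">d", struct.pack(">Q", bits + 1))
    return below, above

x = 9223372036854775807.0
below, above = neighbours(x)
print(int(below))  # 9223372036854774784
print(int(x))      # 9223372036854775808
print(int(above))  # 9223372036854777856
```

The nearest double below is 1024 away and the nearest one above is 2048 away, so anything within a few hundred of 9223372036854775808 lands on that same value.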

 

We can prove this because it also behaves this way when compiled with a C compiler.

 

 

#include <stdio.h>

int main(int argc, char **argv) {
        double a = 9223372036854775807.0;
        double b = 9223372036854775808.0;

        if (a == b)
                printf("True\n");
        else
                printf("False\n");

        return 0;
}

 

And if you take it down to assembler you will see it happening there too. If you look at the raw data in the exe file you will also see that the compiler has already rounded the number 9223372036854775807.0 to the same value as the other number, 9223372036854775808.0.

 

Just to make it stick a little more the following is exactly the same issue!

 

>>> a = 9999999999999999.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001
>>> b = 9999999999999999.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002
>>> a == b
True
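
A quick sketch of where those two literals actually end up. The strings are built with 100 zeros purely for convenience, which is an arbitrary count and not necessarily the same number of zeros as above.

```python
# Build the two literals as strings; the run of 100 zeros is an arbitrary choice for this sketch.
a = float("9999999999999999." + "0" * 100 + "1")
b = float("9999999999999999." + "0" * 100 + "2")

print(a == b)  # True
print(int(a))  # 10000000000000000
```

Neither the tiny fraction nor the final 9 survives. At this size a double can only hold every second integer, so both values round to 10000000000000000.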

 

It's floating point. Don't ever attempt to compare them for equality! It doesn't work with large numbers because there is not enough precision to store the information.
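
If you really do need an exact comparison of numbers this big, one possible way round it (only a sketch, and not the only option) is to keep them out of floats altogether. Python integers are arbitrary precision, and the standard decimal module stores a value built from a string exactly.

```python
from decimal import Decimal

# Built from strings so the values never pass through a float on the way in.
a = Decimal("9223372036854775807.0")
b = Decimal("9223372036854775808.0")

print(a == b)                                       # False
print(9223372036854775807 == 9223372036854775808)   # False, plain integers are exact
print(float(a) == float(b))                         # True, back to the original problem
```

Passing a float to Decimal would not help, because the digits are already gone by the time it gets there.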
