A number of Python packages can, under certain conditions, return incorrect results for floating-point calculations. Brendan Dolan-Gavitt, an assistant professor at New York University's Tandon School of Engineering, discovered this. According to his Twitter profile, his work at the university focuses primarily on topics such as security and reverse engineering.
Warning message made the problem clear
As he describes in detail in a blog entry, Dolan-Gavitt was initially startled by some unusual warning messages from the NumPy library, which is used to work with vectors, matrices, and large multidimensional arrays. These warnings kept appearing when he imported certain Python packages. That may not seem surprising at first glance, as reports of attacks on packages and package managers in the Python ecosystem have increased recently.
However, Dolan-Gavitt was able to verify that the problem revolved around floating-point subnormals in his Python code. Anyone who has tried to explain the use, behavior, and arithmetic of floating-point numbers in the simplest possible terms will no doubt agree with Dolan-Gavitt's statement that "floating point math is notoriously tricky".
He goes on to say that if anything changes the behavior of the floating-point unit (FPU) of the CPU, all sorts of strange problems can arise. Among other things, some numerical algorithms depend on the FPU's default behavior and fail to converge when the FPU is set to treat subnormal (denormal) numbers as zero. On x86 systems, this mode is enabled by setting the FTZ (flush-to-zero) and DAZ (denormals-are-zero) flags in the MXCSR register.
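A quick way to see whether the current process still honors subnormals is to check whether the smallest positive subnormal double survives a computation. A minimal sketch (the helper name is illustrative, not from the blog post):

```python
def subnormals_work() -> bool:
    # 2.0 ** -1074 is the smallest positive subnormal double (IEEE 754
    # binary64); with FTZ/DAZ active it silently flushes to exactly 0.0.
    return 2.0 ** -1074 != 0.0

print(subnormals_work())  # True in a process with default FPU settings
```

Running this before and after importing a suspect package would reveal whether the import changed the FPU state.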
The subnormals and a compiler option
According to Dolan-Gavitt, the fundamental problem is that so-called subnormal numbers – also known as denormalized floating-point numbers – are treated as zero when these flags are set. Subnormals have a 0 before the binary point, whereas "normal" floating-point numbers always have a 1 there. Subnormals guarantee that subtracting two close but distinct floating-point numbers never underflows to zero: the difference always has a non-zero representation (gradual underflow).
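This gradual-underflow guarantee can be demonstrated in a few lines of plain Python, assuming standard IEEE 754 doubles:

```python
import sys

a = sys.float_info.min   # smallest positive *normal* double, about 2.2e-308
b = a * 1.5              # a nearby but distinct value

diff = b - a             # falls below sys.float_info.min, so it is subnormal
print(diff != 0.0)       # True: gradual underflow keeps a != b detectable
```

With FTZ/DAZ active, the same subtraction would instead produce exactly 0.0, making two distinct values indistinguishable – exactly the situation that breaks convergence checks in numerical algorithms.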
Through a search, the researcher found a related GitHub issue that identified the cause: a shared library that had been compiled with the gcc/clang option -ffast-math and then loaded into the process. It turned out that when this option is enabled, the compiler inserts a constructor into the shared library that sets the FTZ/DAZ flags as soon as the library is loaded.
This means that any application that loads such a library changes the floating-point behavior of the entire process. In addition, the option -Ofast, which at first glance sounds like a harmless "make my program fast" flag, automatically enables -ffast-math, so some projects turn it on without their developers being aware of the implications.
Network library as the cause of the problems
A check of the libraries loaded in Dolan-Gavitt's Python process revealed that the gevent networking library was the cause of the problems. Another search showed that a bug report with an attempted fix already existed, but the fix did not work. As a result, the current version of gevent on PyPI still alters the floating-point behavior of any process that imports it.
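One way to reproduce such a check is to probe each suspect package in a throwaway child interpreter, so a broken FPU state cannot leak between probes. A minimal sketch (the helper breaks_subnormals is hypothetical, not from the blog post):

```python
import subprocess
import sys

def breaks_subnormals(module: str) -> bool:
    """Import `module` in a fresh interpreter, then report whether the
    smallest subnormal double was flushed to zero as a side effect."""
    probe = f"import {module}; raise SystemExit(0 if 2.0 ** -1074 != 0.0 else 1)"
    return subprocess.run([sys.executable, "-c", probe]).returncode != 0

# An affected gevent build would return True here; a stdlib module
# such as math should leave the FPU state untouched.
print(breaks_subnormals("math"))  # False
```

Because each probe runs in its own process, packages can be tested one by one without one broken import contaminating the results for the next.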
The computer scientist then tried to determine which other packages and libraries were affected. In his detailed blog entry, he describes this search, which turned out to be extensive and difficult, and gives recommendations on how developers should deal with the problem, which he estimates could affect around 2,500 Python packages. Finally, he asks for the compiler's behavior to be changed in these cases. However, he also points out that a corresponding bug report against GCC is now almost ten years old and the issue has still not been fixed.