Understanding Fast-Math
While adding support for Xcode 13 to our iOS PDF SDK, we stumbled upon an interesting issue in our PDF renderer. Everything worked fine in debug builds, but when using the new developer tools to compile in the release configuration, some PDF elements were missing. After a lot of `printf` debugging, it became apparent that a floating-point `NaN` check had started failing, which eventually led us to the `-ffast-math` optimization we introduced a few years ago. Luckily, we managed to catch this issue during internal QA before it became a problem for our customers. As we clearly underestimated the risks associated with this optimization, it seemed prudent to take a closer look at it. It turns out that, like with almost anything else relating to IEEE floating-point math, it's a rabbit hole full of surprising behaviors.
What Is Fast-Math?
`-ffast-math` is a compiler flag that enables a set of aggressive floating-point optimizations. The flag is shorthand for a collection of different optimization techniques, each with its own compiler flag. The exact set of optimizations differs between compilers, but it generally includes optimizations that leverage algebraic rules that hold for real numbers, but not necessarily for IEEE floats.
Enabling `-ffast-math` will break strict IEEE compliance for your application and could result in changes in behavior. At best, it might affect the precision of computed numbers. At worst, it might significantly affect the program's branching and produce completely unexpected results.
In Xcode, the fast-math optimization can be enabled with the `GCC_FAST_MATH` build setting. You can find it listed as Relax IEEE Compliance in the Xcode Build Settings UI, under Apple Clang - Code Generation.
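If you manage build settings in configuration files rather than the UI, the same setting can be expressed in an `.xcconfig` file. A minimal sketch:

```
// Equivalent to setting Relax IEEE Compliance in the build settings UI.
GCC_FAST_MATH = YES
```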
Clang
To better understand the optimizations `-ffast-math` enables, we can look at the specific set of behaviors (compiler flags) the option implies when using the Clang compiler:
- `-ffinite-math-only` — Shorthand for `-fno-honor-infinities` and `-fno-honor-nans`.
  - `-fno-honor-infinities` — The compiler assumes there are no infinities (`+/-Inf`) in the program (neither in the arguments nor in the results of floating-point arithmetic).
  - `-fno-honor-nans` — The compiler assumes there are no `NaN`s in the program (neither in the arguments nor in the results of floating-point arithmetic).
- `-fno-math-errno` — Enables optimizations that might cause standard C math functions to not set `errno`. This avoids a write to a thread-local variable and enables inlining of certain math functions.
- `-funsafe-math-optimizations` — Shorthand for a set of unsafe floating-point optimizations.
  - `-fassociative-math` — Enables optimizations leveraging the associative property of real numbers, i.e. `(x + y) + z => x + (y + z)`. Due to rounding errors, this algebraic law typically doesn't apply to IEEE floating-point numbers (see the sketch after this list).
  - `-freciprocal-math` — Allows division operations to be transformed into multiplication by a reciprocal, e.g. `x = a / c; y = b / c;` => `tmp = 1.0 / c; x = a * tmp; y = b * tmp;`. This can be significantly faster than division, but it can also be less precise.
  - `-fno-signed-zeros` — Enables optimizations that ignore the sign of floating-point zeros. Without this option, IEEE arithmetic prescribes specific behaviors for `+0.0` and `-0.0` values, which prohibit the simplification of expressions like `x + 0.0` or `0.0 * x`.
  - `-fno-trapping-math` — The compiler assumes floating-point exceptions won't ever actually invoke a signal handler, which enables speculative execution of floating-point expressions and other simple optimizations.
- `-ffp-contract=fast` — Enables the use of floating-point contraction instructions, such as fused multiply-add (FMA). These instructions combine two separate floating-point operations into a single operation, which can affect floating-point precision: instead of rounding after each operation, the processor may round only once after both operations.
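To make the associativity point concrete, here's a minimal C++ sketch of why the law fails for IEEE floats, and thus why `-fassociative-math` can change results:

```cpp
#include <cstdio>

int main() {
    float x = 1.0f;
    float big = 1.0e20f;

    // (x + big) absorbs x, because float lacks the precision to
    // represent 1e20 + 1; subtracting big then yields 0.0f.
    float left = (x + big) - big;

    // big - big is exactly 0.0f, so the sum is exactly x.
    float right = x + (big - big);

    // Prints "0.000000 1.000000" under strict IEEE semantics. With
    // -fassociative-math, the compiler may rewrite one form into the
    // other, so both results could come out the same.
    printf("%f %f\n", left, right);
    return 0;
}
```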
With Clang (and GCC), you can assume that `-ffast-math` will also be used when specifying the `-Ofast` optimization level.
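For example, both of the following invocations relax IEEE semantics (the file name is a placeholder):

```sh
# -Ofast implies -O3 plus -ffast-math, so these two command lines
# enable the same set of fast-math optimizations.
clang++ -O3 -ffast-math renderer.cpp
clang++ -Ofast renderer.cpp
```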
Dealing with Finite Math
One of the more controversial optimizations from the list above is `-ffinite-math-only`, together with its two sub-options, `-fno-honor-infinities` and `-fno-honor-nans`. The official Clang documentation doesn't go into too much detail and just defines `-ffinite-math-only` as allowing "floating-point optimizations that assume arguments and results are not NaNs or +-Inf."

This option enables a set of optimizations for arithmetic expressions that seem intuitive for real numbers, but that aren't generally possible when we have to deal with `NaN` and `Inf` values in floating-point numbers. The option fits well with `-fno-signed-zeros` to enable an even greater set of optimizations. So far, so good.
The controversy starts when we take a look at the behavior of a function like `isnan`. How should this check behave when we're using `-ffinite-math-only`? Should it perform a real check, or should the compiler just optimize it to `false`? With the current definition of this option, we're essentially telling the compiler there will never be any `NaN`s in the program, so it's technically free to optimize the check to a constant `false`.
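Here's a minimal sketch of how that plays out in practice. Whether the check is actually folded depends on the compiler version and optimization level, so treat this as an assumption about recent Clang rather than a guarantee:

```cpp
#include <cmath>
#include <cstdio>

int main() {
    double value = std::nan("");

    // Under strict IEEE semantics, this prints "not a number." With
    // -ffast-math (or -ffinite-math-only), the compiler may fold
    // std::isnan(value) to a constant false, so the program can print
    // "a number" even though value holds a NaN bit pattern.
    if (std::isnan(value)) {
        printf("not a number\n");
    } else {
        printf("a number\n");
    }
    return 0;
}
```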
While this optimization might make sense intuitively, provided you first carefully read the compiler documentation for `-ffast-math` and its sub-options, it also causes quite a few problems. For one, it breaks some reasonable workflows where we'd want to validate input data, or where `NaN`s would be used as memory-efficient sentinels to mark invalid data in floating-point data structures. This is precisely the trap we fell into. Some of our C++ code uses `NaN`s to indicate invalid values for PDF primitives. Those values are checked with `isnan`, and branching is done accordingly. The code kept working fine for years after we first introduced the `-ffast-math` option. But it was always undefined behavior, and all it took was a compiler update to turn it into a regression.
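A hypothetical sketch of the sentinel pattern described above (the `PDFPoint` type is invented for illustration):

```cpp
#include <cmath>
#include <limits>

// NaN doubles as a compact "no value" marker, so no separate
// validity flag is needed.
struct PDFPoint {
    double x = std::numeric_limits<double>::quiet_NaN();
    double y = std::numeric_limits<double>::quiet_NaN();

    bool isValid() const {
        // Under -ffinite-math-only, the compiler may fold both isnan
        // calls to false, at which point every point, including the
        // NaN sentinels, suddenly reports as valid.
        return !std::isnan(x) && !std::isnan(y);
    }
};
```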
The `-ffinite-math-only` optimization also causes inconsistencies where `isnan` checks will behave differently depending on whether they're provided by the compiler or by a library built with different optimization settings (e.g. `libc`). There are also other standard APIs that might produce surprising behaviors — e.g. `std::numeric_limits<T>::has_quiet_NaN` might still claim that `NaN`s are supported even when the optimization is applied. You could even go so far as to say that `double` with `-ffinite-math-only` and `double` without it should be considered different types, due to the differences in behavior you'd see if your project uses `-ffinite-math-only` selectively.
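A minimal sketch of that mismatch, assuming a recent Clang with the flag enabled:

```cpp
#include <cmath>
#include <limits>

// This trait is a compile-time constant and still reports true under
// -ffinite-math-only...
static_assert(std::numeric_limits<double>::has_quiet_NaN,
              "double supports quiet NaNs");

// ...while an isnan check in the same translation unit may be folded
// to a constant false by the optimizer.
bool looksLikeNaN(double value) {
    return std::isnan(value);
}
```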
Another way to look at the logic of having `NaN` checks optimized out is from a pure performance point of view. It should be fairly safe to assume that code that extensively uses `isnan`, and could therefore benefit the most by having `NaN` checks removed, is also the code that most likely cares about the correct output of those checks — and therefore can't use `-ffinite-math-only`.
The `-ffinite-math-only` option could be made safer if its definition were changed to only apply to arithmetic expressions, while otherwise still allowing `NaN` values. In other words, the assumption of no `NaN`s would be applied to mathematical expressions and functions, but not to tests like `isnan`. This alternative has been proposed a few times already — most recently in a fairly lengthy llvm-dev mailing list thread. In it, you can see that there are certainly good arguments to be made for either behavior, and at least for now, it appears as though the discussion ended in a stalemate.
Performance Impact
We could have refactored our code to not use `NaN`s in this way, or employed a number of workarounds to fix the issue with our `isnan` checks, like using integer operations to check the bits corresponding to `NaN`, or selectively disabling `-ffinite-math-only` in certain files. However, we didn't do any of that. Instead, we opted to play it safe, and we globally disabled `-ffast-math`.
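For reference, the bit-level workaround mentioned above could look something like this sketch (assuming IEEE 754 binary64 doubles; the function name is invented):

```cpp
#include <cstdint>
#include <cstring>

// A NaN check built on integer operations, which -ffinite-math-only
// doesn't touch: a binary64 value is NaN iff all exponent bits are
// set and the mantissa is nonzero.
bool isNaNBits(double value) {
    uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return (bits & 0x7FF0000000000000ULL) == 0x7FF0000000000000ULL &&
           (bits & 0x000FFFFFFFFFFFFFULL) != 0;
}
```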
The option was introduced before we had reliable performance tests, so I was curious to see what impact this would have. To my surprise, there were no measurable differences outside of the standard deviation we already see when repeating tests multiple times. This isn’t to say that certain floating-point operations didn’t in fact become slower. They most likely did. However, in our case, they don’t seem to be causing any actual bottlenecks.
Conclusion
As you can see, `-ffast-math` is unfortunately not just a harmless optimization that makes your app run faster; it can also affect the correctness of your code. And even if it doesn't do that right now, it might do so in the future with new compiler revisions.
Unless you see actual performance bottlenecks with your floating-point calculations, it's best to avoid `-ffast-math`. And there's a good chance it won't have a significant impact on the performance characteristics of your average program. It didn't make much of a difference for our renderer, even though it has to deal with a lot of floating-point operations.
If your performance tests do indicate that `-ffast-math` makes a difference, then be sure to spend some time auditing your floating-point calculations and control flow to avoid the more obvious pitfalls, such as the use of `isnan` and `isinf`. In the end, most other issues will still be very hard to notice, so you'll have to accept a certain amount of risk. For us, the decision was easy — it's just not worth the trouble.
Matej is a software engineering leader from Slovenia. He began his career freelancing and contributing to open source software. Later, he joined Nutrient, where he played a key role in creating its initial products and teams, eventually taking over as the company’s Chief Technology Officer. Outside of work, Matej enjoys playing tennis, skiing, and traveling.