Beware of Fast-Math

304 blobcode 212 5/31/2025, 7:05:57 AM simonbyrne.github.io ↗

Comments (212)

orlp · 1d ago

I helped design an API for "algebraic operations" in Rust: <https://github.com/rust-lang/rust/issues/136469>, which are coming along nicely.

These operations are

1. Localized, not a function-wide or program-wide flag.

2. Completely safe, -ffast-math includes assumptions such that there are no NaNs, and violating that is undefined behavior.

So what do these algebraic operations do? Well, one by itself doesn't do much of anything compared to a regular operation. But a sequence of them is allowed to be transformed using optimizations which are algebraically justified, as-if all operations are done using real arithmetic.

pclmulqdq · 1d ago

-ffast-math is actually something like 15 separate flags, and you can use them individually if you want. 3 of them are "no NaNs," "no infinities," and "no subnormals." Several of the other flags allow you to treat math as associative or distributive if you want that.

The library has some merit, but the goal you've stated here is given to you with 5 compiler flags. The benefit of the library is choosing when these apply.

foota · 16h ago

There's probably a benefit to being able to choose which you want in different places though, right?

glkindlmann · 1d ago

That sounds neat. What would be really neat is if the language helped to expose the consequences of the ensuing rounding error by automating things that are otherwise clumsy for programmers to do manually, like running twice with opposite rounding directions, or running many many times with internally randomized directions (two of the options in Sec 4 of *). That is, it would be cool if Rust enabled people learn about the subtleties of floating point, instead of hiding them away.

* https://people.eecs.berkeley.edu/~wkahan/Mindless.pdf

eqvinox · 1d ago

Are these calls going to clear the FTZ and DAZ flags in the MXCSR on x86? And FZ & FIZ in the FPCR on ARM?

orlp · 1d ago

I don't believe so, no. Currently these operations only set the LLVM flags to allow reassociation, contraction, division replaced by reciprocal multiplication, and the assumption of no signed zeroes.

This can be expanded in the future as LLVM offers more flags that fall within the scope of algebraically motivated optimizations.

eqvinox · 1d ago

Ah sorry I misunderstood and thought this API was for the other way around, i.e. forbidding "unsafe" operations. (I guess the question reverses to setting those flags)

('Naming: "algebraic" is not very descriptive of what this does since the operations themselves are algebraic.' :D)

nextaccountic · 1d ago

> ('Naming: "algebraic" is not very descriptive of what this does since the operations themselves are algebraic.' :D)

Okay, the floating point operations are literally algebraic (they form an algebra) but they don't follow some common algebraic properties like associativity. The linked tracking issue itself acknowledges that:

> Naming: "algebraic" is not very descriptive of what this does since the operations themselves are algebraic.

Also this comment https://github.com/rust-lang/rust/issues/136469#issuecomment...

> > On that note I added an unresolved question for naming since algebraic isn't the most clear indicator of what is going on. > > I think it is fairly clear. The operations allow algebraically justified optimizations, as-if the arithmetic was real arithmetic. > > I don't think you're going to find a clearer name, but feel free to provide suggestions. One alternative one might consider is real_add, real_sub, etc.

Then retorted here https://github.com/rust-lang/rust/issues/136469#issuecomment...

> These names suggest that the operations are more accurate than normal, where really they are less accurate. One might misinterpret that these are infinite-precision operations (perhaps with rounding after a whole sequence of operations). > > The actual meaning isn't that these are real number operations, it's quite the opposite: they have best-effort precision with no strict guarantees. > > I find "algebraic" confusing for the same reason. > > How about approximate_add, approximate_sub?

And the next comment

> Saying "approximate" feels imperfect, as while these operations don't promise to produce the exact IEEE result on a per-operation basis, the overall result might well be more accurate algebraically. E.g.: > > (...)

So there's a discussion going on about the naming

eqvinox · 1d ago

It doesn't feel appropriate to comment there for me not knowing any Rust really, but "lax_" (or "relax_") would have the extra benefit of being very short.

(Is this going to overload operators or are people going to have to type this… a lot… ?)

Sharlin · 1d ago

Rust has some precedence for adding convenience newtypes with overloaded operators (eg. `Wrapping<I>´ for `I.wrapping_add(I)` etc). Such a wrapper isn't currently proposed AFAIK but there's no reason one couldn't be added in the future I believe.

Measter · 20h ago

For giggles, here's one I whipped up, along with an example use: https://godbolt.org/z/Eezj35dzc

eqvinox · 11h ago

Having done unspeakable things with the C preprocessor myself, this Rust macro soup truly warms my heart <3

Measter · 5h ago

Oh this is simple, this is just a bit of basic expansion to generate boilerplate. The real experts can do some gnarly and amazing stuff with macros.

Sharlin · 18h ago

Wow, that's some hardcore unrolling.

eqvinox · 1d ago

Right, as long as the LLVM intrinsics are exposed you could just put that in a crate somewhere.

CryZe · 1d ago

WebAssembly also ended up calling its set of similar instructions relaxed.

evrimoztamur · 1d ago

Does that mean that a physics engine written with these operations will always compile to yield the same deterministic outcomes across different platforms (assuming they correctly implement (or able to do so) algebraic operations)?

Sharlin · 1d ago

It's more like the opposite. These tell the compiler to assume for optimization purposes that floats are associative and so on (ie. algebraic), even when in reality they aren't. So the results may vary depending on what transformations the compiler performs – in particular, they may vary between optimized and non-optimized builds, which normally isn't allowed.

vanderZwan · 1d ago

> These tell the compiler to assume for optimization purposes that floats are associative and so on (ie. algebraic), even when in reality they aren't.

I wonder if it is possible to add an additional constraint that guarantees the transformation has equal or fewer numerical rounding errors. E.g. for floating point doubles (0.2 + 0.1) - 0.1 results in 0.20000000000000004, so I would expect that transforming some (A + B) - B to just A would always reduce numerical error. OTOH, it's floating point maths, there's probably some kind of weird gotcha here as well.

StefanKarpinski · 1d ago

Pretty sure that’s not possible. More accurate for some inputs will be less accurate for others. There’s a very tricky tension in float optimization that the most predictable operation structure is a fully skewed op tree, as in naive left-to-right summation, but this is the slowest and least accurate order of operations. Using a more balanced tree is faster and more accurate (great), but unfortunately which tree shape is fastest depends very much on hardware-specific factors like SIMD width (less great). And no tree shape is universally guaranteed to be fully accurate, although a full binary tree tends to have the best accuracy, but has bad base case performance, so the actual shape that tends to get used in high performance kernels is SIMD-width parallel in a loop up to some fixed size like 256 elements, then pairwise recursive reduction above that. The recursive reduction can also be threaded. Anyway, there’s no silver bullet here.

scythe · 1d ago

I think a restricted version might be possible to implement: only allow transformations if the transformed version has strictly fewer numerical rounding errors on some inputs. This will usually only mean canceling terms and collecting expressions like "x+x+x" into 3x.

In general, rules that allow fewer transformations are probably easier to understand and use. Trying to optimize everything is where you run into trouble.

legobmw99 · 1d ago

Kahan summation is an example (also described in the top level article) of one such “gotcha”. It involves adding a term that - if floats were algebraic in this sense - would always be zero, so ffast-math often deletes it, but this actually completely removes the accuracy improvement of the algorithm

anthk · 1d ago

Under EForth with FP done in software:

2 f 1 f 1 f f+ f- f. 0.000 ok

PFE, I think reusing the GLIBC math library:

2e0 1e0 1e0 f+ f- f. 0.000000 ok

Dylan16807 · 9h ago

Every implementation of floating point can handle small to medium integers without rounding. So that example doesn't show anything we can learn from.

anthk · 7h ago

Buth for Forth you can have a totally working software floating point made from the people whose later would set the IEEE standard. Think about very small microcontrollers without FP in hardware, you can be sure that a software FP implementation will perfectly work under the standard thresolds.

orlp · 1d ago

No, there is no guarantee which (if any) optimizations are applied, only that they may be applied. For example a fused multiply-add instruction may be emitted for a*b + c on platforms which support it, which is not cross-platform.

SkiFire13 · 1d ago

No, the result may depend on how the compiler reorders them, which could be different on different platforms.

smcameron · 1d ago

One thing I did not see mentioned in the article, or in these comments (according to ctrl-f anyway) is the use of feenableexcept()[1] to track down the source of NaNs in your code.

    feenableexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW);

will cause your code to get a SIGFPE whenever a NaN crawls out from under a rock. Of course it doesn't work with fast-math enabled, but if you're unknowingly getting NaNs without fast-math enabled, you obviously need to fix those before even trying fast-math, and they can be hard to find, and feenableexcept() makes finding them a lot easier.

[1] https://linux.die.net/man/3/feenableexcept

DavidVoid · 16h ago

Yeah it's pretty useful to enable every once in a while just to see if anything complains.

Be very careful with it in production code though [1]. If you're in a dll then changing the FPU exception flags is a big no-no (unless you're really really careful to restore them when your code goes out of scope).

[1]: https://randomascii.wordpress.com/2016/09/16/everything-old-...

jart · 14h ago

Trapping math is the enlightened way to do things. I wrote an example in the cosmo repo of how to use it. https://github.com/jart/cosmopolitan/blob/master/examples/tr...

emn13 · 1d ago

I get the feeling that the real problem here are the IEEE specs themselves. They include a huge bunch of restrictions that each individually aren't relevant to something like 99.9% of floating point code, and probably even in aggregate not a single one is relevant to a large majority of code segments out in the wild. That doesn't mean they're not important - but some of these features should have been locally opt-in, not opt out. And at the very least, standards need to evolve to support hardware realities of today.

Not being able to auto-vectorize seems like a pretty critical bug given hardware trends that have been going on for decades now; on the other hand sacrificing platform-independent determinism isn't a trivial cost to pay either.

I'm not familiar with the details of OpenCL and CUDA on this front - do they have some way to guarrantee a specific order-of-operations such that code always has a predictable result on all platforms and nevertheless parallelizes well on a GPU?

adrian_b · 1d ago

Not being able to auto-vectorize is not the fault of the IEEE standard, but the fault of those programming languages which do not have ways to express that the order of some operations is irrelevant, so they may be executed concurrently.

Most popular programming languages have the defect that they impose a sequential semantics even where it is not needed. There have been programming languages without this defect, e.g. Occam, but they have not become widespread.

Because nowadays only a relatively small number of users care about computational applications, this defect has not been corrected in any mainline programming language, though for some programming languages there are extensions that can achieve this effect, e.g. OpenMP for C/C++ and Fortran. CUDA is similar to OpenMP, even if it has a very different syntax.

The IEEE standard for floating-point arithmetic has been one of the most useful standards in all history. The reason is that both hardware designers and naive programmers have always had the incentive to cheat in order to obtain better results in speed benchmarks, i.e. to introduce errors in the results with the hope that this will not matter for users, which will be more impressed by the great benchmark results.

There are always users who need correct results more than anything else and it can be even a matter of life and death. For the very limited in scope uses where correctness does not matter, i.e. mainly graphics and ML/AI, it is better to use dedicated accelerators, GPUs and NPUs, which are designed by prioritizing speed over correctness. For general-purpose CPUs, being not fully-compliant with the IEEE standard is a serious mistake, because in most cases the consequences of such a choice are impossible to predict, especially not by the people without experience in floating-point computation who are the most likely to attempt to bypass the standard.

Regarding CUDA, OpenMP and the like, by definition if some operations are parallelizable, then the order of their execution does not matter. If the order matters, then it is impossible to provide guarantees about the results, on any platform. If the order matters, it is the responsibility of the programmer to enforce it, by synchronization of the parallel threads, wherever necessary.

Whoever wants vectorized code should never rely on programming languages like C/C++ and the like, but they should always use one of the programming language extensions that have been developed for this purpose, e.g. OpenMP, CUDA, OpenCL, where vectorization is not left to chance.

emn13 · 22h ago

If you care about absolute accuracy, I'm skeptical you want floats at all. I'm sure it depends on the use case.

Whether it's the standards fault or the languages fault for following the standard in terms of preventing auto-vectorization is splitting hairs; the whole point of the standard is to have predictable and usually fairly low-error ways of performing these operations, which only works when the order of operations is defined. That very aim is the problem; to the extent the stardard is harmless when ordering guarrantees don't exist you're essentially applying some of those tricky -ffast-math suboptimizations.

But to be clear in any case: there are obviously cases whereby order-of-operations is relevant enough and accuracy altering reorderings are not valid. It's just that those are rare enough that for many of these features I'd much prefer that to be the opt-in behavior, not opt-out. There's absolutely nothing wrong with having a classic IEEE 754 mode and I expect it's an essentialy feature in some niche corner cases.

However, given the obviously huge application of massively parallel processors and algorithms that accept rounding errors (or sometimes conversely overly precise results!), clearly most software is willing to generally accept rounding errors to be able to run efficiently on modern chips. It just so happens that none of the computer languages that rely on mapping floats to IEEE 754 floats in a straitforward fashion are any good at that, which is seems like its a bad trade off.

There could be multiple types of floats instead; or code-local flags that delineate special sections that need precise ordering; or perhaps even expressions that clarify how much error the user is willing to accept and then just let the compiler do some but not all transformations; and perhaps even other solutions.

alfiedotwtf · 15h ago

> Most popular programming languages have the defect that they impose a sequential semantics even where it is not needed. There have been programming languages without this defect, e.g. Occam, but they have not become widespread.

We have memory ordering functions to let compilers know the atomic operation preference of the programmer… couldn’t we do the same for maths and in general a set of expressions?

adrian_b · 9h ago

An example of programming language syntax that avoids to specify sequential execution where not needed is to specify that a sequence of expressions separated by semicolons must be executed sequentially, but a sequence of expressions separated by commas may be executed in any order or concurrently.

This is just a minor change from the syntax of the most popular programming languages, because they typically already specify that the order of evaluation of the expressions used for the arguments of a function, which are separated by commas, can be arbitrary.

Early in its history, the C language has been close to specifying this behavior for its comma operator, but unfortunately its designers have changed their mind and they have made the comma operator behave like a semicolon, in order to be able to use it inside for statement headers, where the semicolons have a different meaning. A much better solution for C, instead of making both comma and semicolon to have the same behavior, would have been to allow a block to appear in any place where an expression is expected, giving it the value of the last expression evaluated in the block.

dzaima · 21h ago

The precise requirements of IEEE-754 may not be important for any given program, but as long as you want your numbers to have any form of well-defined semantics beyond "numbers exist, and here's a list of functions that do Something™ that may or may not be related to their name", any number format that's capable of (approximately) storing both 10^20 and 10^-20 in 64 bits is gonna have those drawbacks.

AFAIK GPU code is basically always written as scalar code acting on each "thing" separately, that's, as a whole, semantically looped over by the hardware, same way as multithreading would (i.e. no order guaranteed at all), so you physically cannot write code that'd need operation reordering to vectorize. You just can't write an equivalent to "for (each element in list) accumulator += element;" (or, well, you can, by writing that and running just one thread of it, but that's gonna be slower than even the non-vectorized CPU equivalent (assuming the driver respects IEEE-754)).

adrian_b · 9h ago

A CUDA "kernel" is the same thing as what has been called "parallel DO" or "parallel FOR" since 1963, or perhaps even earlier.

This is slightly obfuscated by not using a keyword like "for" or "do", by the fact that the body of the loop (the "kernel") is written in one place and and the header of the loop (which gives the ranges for the loop indices) is written in another place, and by the fact that the loop indices have standard names.

A "parallel for" may have as well a syntax identical with a sequential "for". The difference is that for the "parallel for" the compiler knows that the iterations are independent, so they may be scheduled to be executed concurrently.

NVIDIA has been always greatly annoying by inventing a huge amount of new terms that are just new words for old terms that have been used for decades in the computing literature, with no apparent purpose except of obfuscating how their GPUs really work. Worse, AMD has imitated NVIDIA, by inventing their own terms that correspond to those used by NVIDIA, but they are once again different.

anthk · 4h ago

xargs does a parallel for too. And OFC Forth people might did that too in a breeze.

adrian_b · 1h ago

That's right and the same is done by the improved version of xargs, GNU "parallel".

Affric · 1d ago

How does IEEE 754 prevent auto-vectorisation?

dahart · 1d ago

The spec doesn’t prevent auto-vectorization, it only says the language should avoid it when it wants to opt in to producing “reproducible floating-point results” (section 11 of IEEE 754-2019). Vectorizing can be implemented in different ways, so whether a language avoids vectorizing in order to opt in to reproducible results is implementation dependent. It also depends on whether there is an option to not vectorize. If a language only had auto-vectorization, and the vectorization result was deterministic and reproducible, and if the language offered no serial mode, this could adhere to the IEEE spec. But since C++ (for example) offers serial reductions in debug & non-optimized code, and it wants to offer reproducible results, then it has to be careful about vectorizing without the user’s explicit consent.

kzrdude · 1d ago

If you write a loop `for x in array { sum += x }` Then your program is a specification that you want to add the elements in exactly that order, one by one. Vectorization would change the order.

dahart · 1d ago

The bigger problem there is the language not offering a way to signal the author’s intent. If an author doesn’t care about the order of operations in a sum, they will still write the exact same code as the author who does care. This is a failure of the language to be expressive enough, and doesn’t reflect on the IEEE spec. (The spec even does suggest that languages should offer and define these sorts of semantics.) Whether the program is specifying an order of operations is lost when the language offers no way for a coder to distinguish between caring about order and not caring. This is especially difficult since the vast majority of people don’t care and don’t consider their own code to be a specification on order of operations. Worse, most people would even be surprised and/or annoyed if the compiler didn’t do certain simplifications and constant folding, which change the results. The few cases where people do care about order can be extremely important, but they are rare nonetheless.

stingraycharles · 1d ago

Yup, because of the imprecision of floating points, cannot just assume that “(a + c) + (b + d)” is the same as “a + b + c + d”.

It would be pretty ironic if at some point fixed point / bignum implementations end up being faster because of this.

anthk · 1d ago

They are, just check anything fixed-point for the 486SX vs anything floating under a 486DX. It's faster scaling and sum and print the desired precision than operating on floats.

einpoklum · 18h ago

I wonder... couldn't there just be some library type for this, e.g. `associative::float` and `associative::doube` and such (in C++ terms), so that compilers can ignore non-associativity for actions on values of these types? Or attributes one can place on variables to force assumption of associativity?

Kubuxu · 1d ago

IIRC reordering additions can cause the result to change which makes auto-vectorisation tricky.

goalieca · 1d ago

Floating point arithmetic is neither commutative or associative so you shouldn’t.

lo0dot0 · 1d ago

While it technically correct to say this it also gets the wrong point across because it leaves out the fact that ordering changes create only a small difference. Other examples where arithmetic is not commutative, e.g. matrix multiplication , can create much larger differences.

kstrauser · 20h ago

> ordering changes create only a small difference.

That can’t be assumed.

You can easily fall into a situation like:

  total = large_float_value
  for _ in range(1_000_000_000):
    total += .01
  assert total == large_float_value

Without knowing the specific situation, it’s impossible to say whether that’s a tolerably small difference.

layer8 · 1d ago

IEEE-754 addition and multiplication is commutative. It isn't distributive, though.

eapriv · 1d ago

Why is it not commutative?

layer8 · 1d ago

It actually is commutative according to IEEE-754, except that in the case of a NaN result you might get a different NaN representation.

adgjlsfhk1 · 18h ago

having multiple NaNs and no spec for how they should behave feels like such an unforced error to me

layer8 · 15h ago

For mathematical use, NaN payloads shouldn’t matter, and behave identically (aside from quiet vs. signaling NaNs). It also doesn’t matter for equality comparison, because NaNs always compare unequal.

adgjlsfhk1 · 3h ago

from the user perspective it's not too bad, but from the compiler perspective it is. The result of this is that LLVM has decided that trying to figure out which nan you got (e.g. by casting to an Int and comparing) is UB, which means pretty much every floating point operation becomes non-deterministic.

This also adds extra complexity to the CPU. you need special hardware for == rather than just using the perfectly good integer unit, and every fpu operation needs to devote a bunch of transistors to handling this nonsense that buys the user absolutely nothing.

there are definitely things to criticize about the design of Posits, but the thing they 100% get right is having a single NaN and sane ordering semantics

ajross · 1d ago

> I get the feeling that the real problem here are the IEEE specs themselves.

Well, all standards are bad when you really get into them, sure.

But no, the problem here is that floating point code is often sensitive to precision errors. Relying on rigorous adherence to a specification doesn't fix precision errors, but it does guarantee that software behavior in the face of them is deterministic. Which 90%+ of the time is enough to let you ignore the problem as a "tuning" thing.

But no, precision errors are bugs. And the proper treatment for bugs is to fix the bugs and not ignore them via tricks with determinism. But that's hard, as it often involves design decisions and complicated math (consider gimbal lock: "fixing" that requires understanding quaternions or some other orthogonal orientation space, and that's hard!).

So we just deal with it. But IMHO --ffast-math is more good than bad, and projects should absolutely enable it, because the "problems" it discovers are bugs you want to fix anyway.

chuckadams · 1d ago

> (consider gimbal lock: "fixing" that requires understanding quaternions or some other orthogonal orientation space, and that's hard!)

Or just avoiding gimbal lock by other means. We went to the moon using Euler angles, but I don't suppose there's much of a choice when you're using real mechanical gimbals.

ajross · 1d ago

That is the "tuning" solution. And mostly it works by limiting scope of execution ("just don't do that") and if that doesn't work by having some kind of recovery method ("push this button to reset", probably along with "use this backup to recalibrate"). And it... works. But the bug is still a bug. In software we prefer more robust techniques.

FWIW, my memory is that this was exactly what happened with Apollo 13. It lost its gyro calibration after the accident (it did the thing that was the "just don't do that") and they had to do a bunch of iterative contortions to recover it from things like the sun position (because they couldn't see stars out the iced-over windows).

NASA would have strongly preferred IEEE doubles and quaternions, in hindsight.

Sharlin · 1d ago

> -funsafe-math-optimizations

What's wrong with fun, safe math optimizations?!

keybored · 1d ago

Hah! I was just about to comment that I immediately read it as fun-safe, everytime I see it.

I guess that happens when I don’t deal with compiler flags daily.

vardump · 1d ago

”This roller coaster is optimized to be Fun and Safe!”

Sharlin · 1d ago

Many funroll loops in that coaster.

storus · 1d ago

This problem is happening even on Apple MPS with PyTorch in deep learning, where fast math is used by default in many operations, leading to a garbage output. I hit it recently while training an autoregressive image generation model. Here is a discussion by folks that hit it as well:

https://github.com/pytorch/pytorch/issues/84936

Sophira · 1d ago

Previously discussed at https://news.ycombinator.com/item?id=29201473 (which the article itself links to at the end).

anthk · 1d ago

On Forth, there's the philosophy of the fixed point:

https://www.forth.com/starting-forth/5-fixed-point-arithmeti...

With 32 and 64 bit numbers, you can just scale decimals up. So, Torvalds was right. On dangerous contexts (uper-precise medical doses, FP has good reasons to exist, and I am not completely sure).

Also, both Forth and Lisp internally suggest to use represented rationals before floating point numbers. Even toy lisps from https://t3x.org have rationals too. In Scheme, you have both exact->inexact and inexact->exact which convert rationals to FP and viceversa.

If you have a Linux/BSD distro, you may already have Guile installed as a dependency.

Hence, run it and then:

      scheme@(guile-user)> (inexact->exact 2.5)
      $2 = 5/2

      scheme@(guile-user)> (exact->inexact (/ 5 2))
      $3 = 2.5

Thus, in Forth, I have a good set of q{+,-,*,/} operations for rational (custom coded, literal four lines) and they work great for a good 99% of the cases.

As for irrational numbers, NASA used up 16 decimals, and the old 113/355 can be precise enough for a 99,99 of the pieces built in Earth. Maybe not for astronomical distances, but hey...

In Scheme:

         scheme@(guile-user)> (exact->inexact (/ 355 113))
         $5 = 3.1415929203539825

In Forth, you would just use

         : pi* 355 133 m*/ ;

with a great precision for most of the objects being measured against.

AlotOfReading · 1d ago

Floats are fixed point, just done in log space. The main change is that the designers dedicated a few bits to variable exponents, which introduces alignment and normalization steps before/after the operation. If you don't mix exponents, you can essentially treat it as identical to a lower precision fixed point system.

anthk · 1d ago

No, not even close. Scaling integers to mimic decimals under 32 and 64 bit can be much faster. And with 32 bit double numbers you can cover Plank numbers, so with 64 bit double numbers you can do any field.

eqvinox · 1d ago

Those rational numbers fly out the window as soon as your math involves any kind of more complicated trigonometry, or even a square root…

stassats · 1d ago

You can turn them back into rationals, (rational (sqrt 2d0)) => 6369051672525773/4503599627370496

Or write your own operations that compute to the precision you want.

anthk · 1d ago

My post already covered inexact->exact:

     scheme@(guile-user)> (inexact->exact (sqrt 2.0))

$1 = 6369051672525773/4503599627370496

s9 Scheme fails on this as it's an irrational number, but the rest of Schemes such as STKlos, Guile, Mit Scheme, will do it right.

With Forth (and even EForth if the images it's compiled with FP support), you are on your own to check (or rewrite) an fsqrt function with an arbitrary precision.

Also, on trig, your parent commenter should check what CORDIC was.

https://en.wikipedia.org/wiki/CORDIC

dreamcompiler · 1d ago

If you want high precision trig functions on rationals, nothing's stopping you from writing a Taylor series library for them. Or some other polynomial appromation or a lookup table or CORDIC.

anthk · 1d ago

Check CORDIC, please.

https://en.wikipedia.org/wiki/CORDIC

Also, on sqrt functions, even a FP-enabled toy EForth under the Subleq VM (just as a toy, again, but it works) provides some sort of fsqrt functions:

    2 f fsqrt f.
    1.414 ok

Under PFE Forth, something 'bigger':

   40 set-precision ok  
   2e0 fsqrt f. 1.4142135623730951454746218587388284504414 ok

EForth's FP precision it's tiny but good enough for very small microcontrollers. But it wasn't so far from the exponents the 80's engineers worked to create properly usable machinery/hardware and even software.

teleforce · 1d ago

“Nothing brings fear to my heart more than a floating point number.” - Gerald Jay Sussman

Is there any IEEE standards committee working on FP alternative for examples Unum and Posit [1],[2].

[1] Unum & Posit:

https://posithub.org/about

[2] The End of Error:

https://www.oreilly.com/library/view/the-end-of/978148223986...

kvemkon · 1d ago

I'm wondering, why there are still no announcements for hardware support of such approaches in CPUs.

vanderZwan · 9h ago

Gustavson's last presentation starts with him holding an actual piece of hardware supporting posits, fwiw.

https://m.youtube.com/watch?v=vzVlQhaAZtQ

neepi · 18h ago

HP had proper deterministic decimal arithmetic since the 1970s.

Q6T46nT668w6i3m · 1d ago

Is this sarcasm? If not, the proposed posit standard, IEEE P3109.

pclmulqdq · 1d ago

The current P3109 draft has no posits in it.

teleforce · 1d ago

Great, didn't know that it exists.

chuckadams · 1d ago

I haven't worked with C in nearly 20 years and even I remember warnings against -ffast-math. It really ought not to exist: it's just a super-flag for things like -funsafe-math-optizations, and the latter makes it really clear that it's, well, unsafe (or maybe it's actually funsafe!)

cycomanic · 1d ago

I think this article overstates the importance of the problems even for scientific software. In the scientific code I've written, noise processes are often orders of magnitude larger than what what is discussed here and I believe this applies to many (most?) simulations modelling the real world (i.e. Physics chemistry,..). At the same time enabling fast-math has often yielded a very significant (>10%) performance boost.

I particularly find the discussion of - fassociative-math because I assume that most writers of some code that translates a mathetical formula to into simulations will not know which would be the most accurate order of operations and will simply codify their derivation of the equation to be simulated (which could have operations in any order). So if this switch changes your results it probably means that you should have a long hard look at the equations you're simulating and which ordering will give you the most correct results.

That said I appreciate that the considerations might be quite different for libraries and in particular simulations for mathematics.

londons_explore · 1d ago

It would be nice if there was some syntax for "math order matters, this is the order I want it done in".

Then all other math will be fast-math, except where annotated.

hansvm · 1d ago

The article mentioned that gcc and clang have such extensions. Having it in the language is nice though, and that's the approach Zig took.

sfn42 · 23h ago

I thought most languages have this? If you simply write a formula operations are ordered according to the language specifiction. If you want different ordering you use parentheses.

Not sure how that interacts with this fast math thing, I don't use C

kstrauser · 20h ago

That’s a different kind of ordering.

Imagine a function like Python’s `sum(list)`. In abstract, Python should be able to add those values in any order it wants. Maybe it could spawn a thread so that one process sums the first half in the list, another sums the second half at the same time, and then you return the sum of those intermediate values. You could imagine a clever `sum()` being many times faster, especially using SIMD instructions or a GPU or something.

But alas, you can’t optimize like that with common IEEE-754 floats and expect to get the same answer out as when using the simple one-at-a-time addition. The result depends on what order you add the numbers together. Order them differently and you very well may get a different answer.

That’s the kind of ordering we’re talking about here.

DavidVoid · 16h ago

It matters for reproducibility between software versions, right?

I work in audio software and we have some comparison tests that compare the audio output of a chain of audio effects with a previous result. If we make some small refactoring of the code and the compiler decides to re-organize the arithmetic operations then we might suddenly get a slightly different output. So of course we disable fast-math.

One thing we do enable though, is flushing denormals to zero. That is predictable behavior and it saves some execution time.

recursivecaveat · 12h ago

Yeah that is the killer for me. I'm not particularly attached to IEEE semantics. Unfortunately the replacement is that your results can change between any two compiles, for nearly any reason. Even if you think you don't care about tiny precision variances: consider that if you ever score and rank things with an algorithm that involves floats, the resulting order can change.

on_the_train · 1d ago

I worked in cad, robotics and now semiconductor optics. In every single field, floating precision down to the very last digits was a huge issue

AlotOfReading · 20h ago

"precision" is an ambiguous term here. There's reproducibility (getting the same results every time), accuracy (getting as close as possible to same results computed with infinite precision), and the native format precision.

ffast-math is sacrificing both the first and the second for performance. Compilers usually sacrifice the first for the second by default with things like automation fma contraction. This isn't a necessary trade-off, it's just easier.

There's very few cases where you actually need accuracy down to the ULP though. No robot can do anything meaningful with femtometer+ precision, for example. Instead you choose a development balance between reproducibility (relatively easy) and accuracy (extremely hard). In robotics, that will usually swing a bit towards reproducibility. CAD would swing more towards accuracy.

cycomanic · 1d ago

Interesting, I stand corrected. In most of the fields I'm aware off one could easily work in 32bit without any issues.

I find the robotics example quite surprising in particular. I think the precision of most input sensors is less than 16bit so. If your inputs have this much noise on them how come you need so much precision your calculations?

spookie · 1d ago

The precision isn't uniform across a range of possible inputs. This means you need a higher bit depth, even though "you aren't really using it", just so you can establish a good base precision you are sure you are hitting at every range. The part where you are saying "most sensors" is doing a lot of leverage here.

zinekeller · 1d ago

(2021)

Previous discussion: Beware of fast-math (Nov 12, 2021, https://news.ycombinator.com/item?id=29201473)

datameta · 1d ago

Luckily outside of mission critical systems, like in demoscene coding, I can happily use "44/7" as a 2pi approximation (my beloved)

Affric · 1d ago

For non-associativity what is the best way to order operations? Is there an optimal order for precision whereby more similar values are added/multiplied first?

EDIT: I am now reading Goldberg 1991

Double edit: Kahan Summation formula. Goldberg is always worth going back to.

zokier · 1d ago

Herbie can optimize arbitrary floating point expressions for accuracy

https://herbie.uwplse.org/

hyghjiyhu · 1d ago

One thing I wonder is what happens if you have an inline function in a header that is compiled with fast math by one translation unit and without in another.

jmb99 · 15h ago

I haven’t checked, but my assumption is that the output of each compilation unit will be different. The one definition rule doesn’t apply here (there’s still one definition), and there shouldn’t be a conflict if the functions are inlined in their respective compilation units so the linker shouldn’t complain.

Could be wrong but that’s my gut feeling.

cbarrick · 23h ago

This page consistently crashes on Vivaldi for Android.

Vivaldi 7.4.3691.52

Android 15; ASUS_AI2302 Build/AQ3A.240812.002

boulos · 22h ago

I've also come around to --ffast-math considered harmful. It's useful though to help find optimization opportunities, but in the modern (AVX2+) world, I think the risks outweigh the benefits.

I'm surprised by the take that FTZ is worse than reassociation. FTZ being environmental rather than per instruction is certainly unfortunate, but that's true of rounding modes generally in x86. And I would argue that most programs are unprepared to handle subnormals anyway.

By contrast, reassociation definitely allows more optimization, but it also prohibits you from specifying the order precisely:

> Allow re-association of operands in series of floating-point operations. This violates the ISO C and C++ language standard by possibly changing computation result.

I haven't followed standards work in forever, but I imagine that the introduction of std::fma, gets people most of the benefit. That combined with something akin to volatile (if it actually worked) would probably be good enough for most people. Known, numerically sensitive code paths would be carefully written, while the rest of the code base can effectively be "meh, don't care".

JKCalhoun · 1d ago

> Even compiler developers can't agree.

> This is perhaps the single most frequent cause of fast-math-related StackOverflow questions and GitHub bug reports

The second line above should settle the first.

layer8 · 1d ago

The first line points out that it doesn't, even if one thinks that it should. Also, note the "perhaps".

eqvinox · 1d ago

I wish the Twitter links in this article weren't broken.

genewitch · 1d ago

Change X to xcancel

Smaug123 · 1d ago

They aren't, at least for the spot-check I performed; probably you need to be logged in.

eqvinox · 1d ago

All it says is "Something went wrong. Try reloading." — no indication having an account logged in would help (…and I don't feel like creating an account just to check…)

SunlitCat · 1d ago

Maybe an unpopular opinion, but having to be logged in, is being broken. ;)

leephillips · 20h ago

This part was fascinating:

“The problem is how FTZ actually implemented on most hardware: it is not set per-instruction, but instead controlled by the floating point environment: more specifically, it is controlled by the floating point control register, which on most systems is set at the thread level: enabling FTZ will affect all other operations in the same thread.

“GCC with -funsafe-math-optimizations enables FTZ (and its close relation, denormals-are-zero, or DAZ), even when building shared libraries. That means simply loading a shared library can change the results in completely unrelated code, which is a fun debugging experience.”

quotemstr · 1d ago

All I want for Christmas is a programming language that uses dependant typing to make floating point precision part of the type system. Catastrophic cancellation should be a compiler error if you assign the output to a float with better ulps than you get with worst case operands.

thesuperbigfrog · 1d ago

Ada might have what you want:

https://www.jviotti.com/2017/12/05/an-introduction-to-adas-s...

http://www.ada-auth.org/standards/22rm/html/RM-3-5-7.html

http://www.ada-auth.org/standards/22rm/html/RM-A-5-3.html

Ada also has fixed point types:

http://www.ada-auth.org/standards/22rm/html/RM-3-5-9.html

dirtyhippiefree · 23h ago

I’m stunned by the following admission: “If fast-math was to give always the correct results, it wouldn’t be fast-math”

If it’s not always correct, whoever chooses to use it chooses to allow error…

Sounds worse than worthless to me.

razighter777 · 1d ago

The worst thing that strikes fear into me is seeing floating points used for real world currency. Dear god. So many things can go wrong. I always use unsigned integers counting number of cents. And if I gotta handle multiple currencies, then I'll use or make a wrapper class.

jksflkjl3jk3 · 1d ago

Floating point math shouldn't be that scary. The rules are well defined in standards, and for many domains are the only realistic option for performance reasons.

I've spent most of my career writing trading systems that have executed 100's of billions of dollars worth of trades, and have never had any floating point related bugs.

Using some kind of fixed point math would be entirely inappropriate for most HFT or scientific computing applications.

usefulcat · 1d ago

You can certainly make trading systems that work using floating point, but there are just so many fewer edge cases to consider when using fixed point.

With fixed point and at least 2 decimal places, 10.01 + 0.01 is always exactly equal to 10.02. But with FP you may end up with something like 10.0199999999, and then you have to be extra careful anywhere you convert that to a string that it doesn't get truncated to 10.01. That could be logging (not great but maybe not the end of the world if that goes wrong), or you could be generating an order message and then it is a real problem. And either way, you have to take care every time you do that, as opposed to solving the problem once at the source, in the way the value is represented.

> Using some kind of fixed point math would be entirely inappropriate for most HFT or scientific computing applications.

In the case of HFT, this would have to depend very greatly on the particulars. I know the systems I write are almost never limited by arithmetical operations, either FP or integer.

gamescr · 23h ago

I work on game engines and the problem with floats isn't on small values like 10.01 but on large ones like 400,010.01 that's when the precision wildly varies.

osigurdson · 22h ago

The issue with floats is the mental model. The best way to think about them is like a ruler with many points clustered around 0 and exponentially fewer as the magnitude grows. Don't think of it like a real value - assume that there are hardly any values represented with perfect precision. Even "normalish" numbers like 10.1 are not in the set actually. When values are converted to strings, even in debuggers sometimes, they are often rounded which throws people off further ("hey, the value is exactly 10.1 - it is right there in the debugger"). What you can count on however is that integers are represented with perfect precision up to a point (e.g. 2^53 -1 for f64).

The other "metal model" issue is that associative operations in math. Adding a + (b + c) != (a + b) + c due to rounding. This is where fp-precise vs fp-fast comes in. Let's not talk about 80 bit registers (though that used to be another thing to think about).

malfist · 23h ago

Not only that but the precision loss accumulates. Multiply too many numbers with small inaccuracies and you wind up with numbers with large inaccuracies

01HNNWZ0MV43FF · 20h ago

Lua is telling me 0.1 + 0.1 == 0.2, but 0.1 + 0.2 != 0.3. That's 64-bit precision. The issue is not with precision, but with 1/10th being a repeating decimal in binary.

anthk · 18h ago

Not an issue on Scheme and Common Lisp and even Forth operating directly with rationals with custom words.

kolbe · 1d ago

It depends on what you're doing. If your system is a linear regression on 30 features, you should probably use floating point. My recollection is fixed is prohibitively slower and with far less FOSS support.

phendrenad2 · 1d ago

I'm wondering if trading systems would run into the same issues as a bank or scientific calculation. You might not be making as many repeated calculations, and might not care if things are "off" by a tiny amount, because you're trading between money and securities, and the "loss" is part of your overhead. If a bank lost $0.01 after every 1 million transactions it would be a minor scandal.

usefulcat · 23h ago

Personally, I would be more concerned about something like determining whether the spread is more than a penny. Something like:

    if (ask - bid > 0.01) {
        // etc
    }

With floating point, I have to think about the following questions: * What if the constant 0.01 is actually slightly greater than mathematical 0.01? * What if the constant 0.01 is actually slightly less than mathematical 0.01? * What if ask - bid is actually slightly greater than the mathematical result? * What if ask - bid is actually slightly less than the mathematical result?

With floating point, that seemingly obvious code is anything but. With fixed point, you have none of those problems.

Granted, this only works for things that are priced in specific denominations (typically hundredths, thousandths, or ten thousandths), which is most securities.

CamperBob2 · 22h ago

So the spread is 0.0099999 instead of 0.01. When will that difference matter?

usefulcat · 22h ago

It matters if the strategy is designed to do very different things depending on whether or not the offers are locked (when bid == ask, or spread is less than 0.01).

In this example, I’m talking about securities that are priced in whole cents. If you represent prices as floats, then it’s possible that the spread appears to be less (or greater) than 0.01 when it’s actually not, due to the inability of floats to exactly represent most real numbers.

CamperBob2 · 22h ago

But I'm still not understanding the real-world consequences. What will those be, exactly? Any good examples or case studies to look at?

usefulcat · 16h ago

Many trading strategies operate on very thin margins. Most of the time it's less than one cent per share, often as little as a tenth of a cent per share or less.

A different example: let's say that you're trying to buy some security, and you've determined that the maximum price you can pay and still be profitable is 10.01. If you mistakenly use an order price of 10.00, you'll probably get fewer shares than you wanted, possibly none. If you mistakenly use a price of 10.02, you may end up paying too much and then that trade ends up not being profitable. If you use a price of 10.0199999 (assuming it's even possible to represent such a price via whatever protocol you're using), either your broker or the exchange will likely reject the order for having an invalid price.

ljosifov · 20h ago

I can imagine sth like: if (bid ask blah blah) { send order to buy 10 million of AAPL; }

T0Bi · 1d ago

> Using some kind of fixed point math would be entirely inappropriate for most HFT or scientific computing applications.

May I ask why? (generally curious)

jcranmer · 1d ago

For starters, it's giving up a lot of performance, since fixed-point isn't accelerated by hardware like floating-point is.

rendaw · 23h ago

Isn't fixed point just integer?

Athas · 22h ago

Yes, but you're not going to have efficient transcendental functions implemented in hardware.

rendaw · 22h ago

Ah okay, fair enough. But what sort of transcendental functions would you use for HFT?

I guess I understood GGGGP's comment about using fixed point for interacting with currency to be about accounting. I'd expect floating point to be used for trading algorithms, but that's mostly statistics and I presume you'd switch back to fixed point before making trades etc.

mitthrowaway2 · 23h ago

Yes, integer combined with bit-shifts.

Athas · 22h ago

The problem with fixed point is in its, well, fixed point. You assign a fixed number of bits to the fractional part of the number. This gives you the same absolute precision everywhere, but the relative precision (distance to the next highest or lowest number) is worse for small numbers - which is a problem, because those tend to be pretty important. It's just overall a less efficient use of the bit encoding space (not just performance-wise, but also in the accuracy of the results you get back). Remember that fixed point does not mean absence of rounding errors, and if you use binary fixed point, you still cannot represent many decimal fractions such as 0.1.

anthk · 22h ago

With fixed point you either scale it up or use rationals.

osigurdson · 22h ago

Fundamentally there is uncertainty associated with any physical measurement which is usually proportional to the magnitude being measured. As long as floating point is << this uncertainty results are equally predictive. Floating point numbers bake these assumptions in.

f33d5173 · 22h ago

It's the front of house/back of house distinction. Front of house should use fixed point, back of house should use floating point. Unless you're doing trading, you want really strict rules with regards to rounding and such, which are going to be easier to achieve with fixed point.

pasc1878 · 21h ago

I don't think it is that clear. The split I think is between calculating settlement amounts which lead to real transfers of money and so should be fixed point whilst risk, pricing (thus trading) and valuation use models which need many calculations so need to be floating point.

eddd-ddde · 1d ago

How do you handle the lack of commutativity? I've always wondered about the practical implications.

jakevoytko · 1d ago

I asked an ex-Bloomberg coder this question once after he told me he used floating points to represent currency all the time, and his response was along the lines of “unless you have blindingly-obvious problems like doing operations on near-zero numbers against very large numbers, these calculations are off by small amounts on their least-significant digits. Why would you waste the time or the electricity dealing with a discrepancy that’s not even worth the money to fix?”

jcranmer · 1d ago

Floating-point is completely commutative (ignoring NaN payloads).

It's the associativity law that it fails to uphold.

BeetleB · 1d ago

Nitpick: FP arithmetic is commutative. It's not associative.

kolbe · 1d ago

All your price field messages are sent to the exchange and back via fixed point, so you are using fixed point for at least some of the process (unless you're targeting those few crypto exchanges that use fp prices).

If you need to be extremely fast (like fpga fast), you don't waste compute transforming their fixed point representation into floating.

djrj477dhsnv · 23h ago

Sure, string encodings are used for most APIs and ultra HFT may pattern match on the raw bytes, but for regular HFT if you're doing much math, it's going to be floating point math.

kolbe · 59m ago

We might have different definitions of "HFT"

simonw · 1d ago

I've been having an interesting challenge relating to this recently. I'm trying to calculate costs for LLM usage, but the amounts of money involved are so tiny. Gemini 1.5 Flash 8B is $0.0375 per million tokens!

Should I be running my accounting system on units of 10 billionths of a dollar?

scott_w · 1d ago

Fixed point Decimal is your friend here. I’m guessing you buy tokens in increments of 1,000,000 so it isn’t too much of an issue to account for. You can then normalise in your accounting so 1,000,000 is just “1 unit,” or you can just account in increments of 1,000,000 but that does start looking weird (but might be necessary!)

Filligree · 23h ago

No, billing happens per-token. It’s entirely necessary to use billionths of a dollar here, if you don’t use floating point.

scott_w · 20h ago

In which case, I’d look at this thread https://news.ycombinator.com/item?id=44145263

marcosdumay · 1d ago

Accounting happens on the unities people pay, not the ones that generate expenses.

But you probably should run your billing in fixed point or floating decimals with a billionth of a dollar precision, yes. Either that or you should consolidate the expenses into larger bunches.

outurnate · 1d ago

You're better off representing values as rationals; a ratio between two different numbers. For example, 0.0375 would be represented as 375 over 10000, or 3 over 80

simonw · 1d ago

Sounds hard to model in SQLite.

teaearlgraycold · 1d ago

Two columns?

anthk · 22h ago

From Forth, here's how I'd set the rationals:

    : gcd begin dup while tuck mod repeat drop ;
    : lcm 2dup \* abs -rot gcd / ;
    : reduce 2dup gcd tuck / >r / r> ;
    : q+ rot 2dup \* >r rot \* -rot \* + r> reduce ;
    : q- swap negate swap q+ ;
    : q\* rot \* >r  \* r> reduce ;
    : q/ >r \* swap r> \* swap reduce ;

Example: to compute 70 * 0.25 = 35/2

70 1 1 4 q* reduce .s 35 2 ok

On stack managing words like 2dup, rot and such, these are easily grasped under either Google/DDG or any Forth with the words "see" and/or "help".

as a hint, q- swaps the top two numbers in the stack, (which compose a rational), makes the last one negative and then turns back its position. And then it calls q+.

So, 2/5 - 3/2 = 2/5 + -3/2.

klysm · 1d ago

Convert to money as late as possible

roryirvine · 22h ago

This is surely the right answer: simply count the number of tokens used, and do the billing reconciliation as a separate step.

As an added benefit, it makes it much easier to deal with price changes.

latchkey · 1d ago

Ethereum is 1e18 or 1 wei.

https://ethereum.stackexchange.com/questions/158517/does-sol...

kolbe · 1d ago

I've used Auroa Units to do this. You can define the dollars dimension, and then all the nano-micro-whatever scale comes with.

scott_w · 1d ago

For far too many years I had inherited a billing system that used floats for all calculations then rounded up or down. Also doing some calculations in JS and mirroring them on the Python backend, so “just switch to Decimal” wasn’t an easy change to make…

jcranmer · 1d ago

I've found fear of the use of floating-point in finance to be a good litmus test for how knowledgeable people are about floating-point. Because as far as I can tell, finance people almost exclusively uses (binary) floating-point [1], whereas a lot of floating-point FUD focuses on how disastrous it is for finance. And honestly, it's a bit baffling to me why so many people seem to think that floating-point is disastrous.

My best guess for the latter proposition is that people are reacting to the default float printing logic of languages like Java, which display a float as the shortest base-10 number that would correctly round to that value, which extremely exaggerates the effect of being off by a few ULPs. By contrast, C-style printf specifies the number of decimal digits to round to, so all the numbers that are off by a few ULPs are still correct.

[1] I'm not entirely sure about the COBOL mainframe applications, given that COBOL itself predates binary floating-point. I know that modern COBOL does have some support for IEEE 754, but that tells me very little about what the applications running around in COBOL do with it.

munch117 · 23h ago

The answer is accounting. In accounting you want predictability and reproducibility more than anything, and you are prepared to throw away precision on that alter.

If you're summing up the cost of items in a webshop, then you're in the domain of accounting. If the result appears to be off by a single cent because of a rounding subtlety, then you're in trouble, because even though no one should care about that single cent, it will give the appearance that you don't know what you're doing. Not to mention the trouble you could get in for computing taxes wrong.

If, on the other hand, you're doing financial forecasting or computing stock price targets, then you're not in the domain of accounting, and using floating point for money is just fine.

I'm guessing from your post that your finance people are more like the latter. I could be wrong though - accountants do tend to use Excel.

jcranmer · 22h ago

To get the right answers for accounting, all you have to do is pay attention to how you're doing rounding, which is no harder for floating-point than it is for fixed-point. Actually, it might be slightly easier for floating-point, since you're probably not as likely to skip over the part of the contract that tells you what the rounding rules you have to follow are.

munch117 · 21h ago

Agreed. To do accounting, you need to employ some kind of discipline to ensure that you get rounding right. So many people erroneously believe that such a discipline has to be based on fixed point or decimal floating point numbers. But binary floating point can work just fine.

pgwhalen · 23h ago

I agree overall but my take is that it shows more ignorance about the domain of finance (or a particular subdomain) than it does about floating-point ignorance.

It’s really more of a concern in accounting, when monetary amounts are concrete and represent real money movement between distinct parties. A ton of financial software systems (HFT, trading in general) deal with money in a more abstract way in most of their code, and the particular kinds of imprecision that FP introduces doesn’t result in bad business outcomes that outweigh its convenience and other benefits.

munch117 · 22h ago

FP does not introduce imprecision. Quite the contrary: The continuous rounding (or truncation) triggered by using scaled integers is what introduces imprecision. Whereas exponent scaling in floating point ensures that all the bits in the mantissa are put to good use.

It's a trade-off between precision and predictability. Floating point provides the former. Scaled integers provide the latter.

pgwhalen · 21h ago

I was using imprecision in a more general and less mathematical sense than the way you’re interpreting it, but yes this is a good point about why FP is useful in many financial contexts, when the monetary amount is derived from some model.

osigurdson · 1d ago

Wouldn't it be better to use a decimal type?

MobiusHorizons · 1d ago

This is what’s called a fixed point decimal type. If you need variable precision, then a decimal type might be a good idea, but fixed point removes a lot of potential foot guns if the constraints work for you.

osigurdson · 23h ago

I meant fixed point decimal type (like C#) 128 bit. I don't understand why the parent commenter (top voted comment?) used unsigned integers to track individual cents. Why roll your own decimal type?

Using arbitrary precision doesn't make sense if the data needs to be stored in a database (for most situations at least). Regardless, infinite precision is magical thinking anyway: try adding Pi to your bank account without loss of precision.

MobiusHorizons · 22h ago

the C# decimal type is not fixed point, its a floating point implementation, but just uses a base 10 exponent instead of a base 2 one like IEE754 floats.

Fixed point is a general technique that is commonly done with machine integers when the necessary precision is known at compile time. It is frequently used on embedded devices that don't have a floating point unit to avoid slow software based floating point implementations. Limiting the precision to $0.01 makes sense if you only do addition or subtraction. Precision of $0.001 (Tenths of a cent also called mils) may be necessary when calculating taxes or applying other percentages although this is typically called out in the relevant laws or regulations.

osigurdson · 22h ago

Good to know. In a scientific domain so haven't used it previously.

MobiusHorizons · 17h ago

Fun fact there is a decimal type on some hardware. I believe Power PC, and presumably mainframes. You can actually use it from C although it’s a software implementation on most hardware. IEEE754-2008 if you are curious.

jjmarr · 22h ago

IEEE754 defines a floating point decimal type. What are your opinions on that?

MobiusHorizons · 17h ago

It’s very cool, but not present on most hardware. Fixed point is a lot simpler though if you are dealing with something with inherent granularity like currency

layer8 · 1d ago

Wait until you learn that Excel calculates everything using floating-point, and doesn't even fully observe IEEE 754.

https://learn.microsoft.com/en-us/office/troubleshoot/excel/...

(It nevertheless happens to work just fine for most of what Excel is used for.)

nurettin · 1d ago

I inherited systems that trade real world money using f64. They work surprisingly well, and the errors and bugs are almost never due to rounding. Those that are also have easy fixes. So I'm always baffled by this "expert opinion" of using integers for cents. It is pretty much up there with "never use python pickle it is unsafe" and "never use http, even if the program will never leave the subnet".

dataangel · 1d ago

you can't accurately represent 10 cents with floats, 0.1 is not directly representable. same with 1 cent, 0.01. Seems like if you do and significant math on prices you should run into rounding issues pretty quickly?

adgjlsfhk1 · 1d ago

no. Float64 has 16 digits of precision. Therefore even if you're dealing with trillions of dollars, you have accuracy down to the thousandth of a cent.

cstrahan · 19h ago

You might want to re-study this topic.

The decimal number 0.1 has an infinitely repeating binary fraction.

Consider how 1/3 in decimal is 0.33333… If you truncate that to some finite prefix, you no longer have 1/3. Now let’s suppose we know, in some context, that we’ll only ever have a finite number of digits — let’s say 5 digits after the decimal point. Then, if someone asks “what fraction is equivalent to 0.33333?”, then it is reasonable to reply with “1/3”. That might sound like we’re lying, but remember that we agreed that, in this context of discussion, we have a finite number of digits — so the value 1/3 outside of this context has no way of being represented faithfully inside this context, so we can only assume that the person is asking about the nearest approximation of “1/3 as it means outside this context”. If the person asking feels lied to, that’s on them for not keeping the base assumptions straight.

So back to floating point, and the case of 0.1 represented as 64 bit floating point number. In base 2, the decimal number 0.1 looks like 0.0001100110011… (the 0011 being repeated infinitely). But we don’t have an infinite number of digits. The finite truncation of that is the closest we can get to the decimal number 0.1, and by the same rationale as earlier (where I said that equating 1/3 with 0.33333 is reasonable), your programming language will likely parse “0.1” as a f64 and print it back out as such. However, if you try something like (a=0.1; a+a+a) you’ll likely be surprised at what you find.

adgjlsfhk1 · 18h ago

> you’ll likely be surprised at what you find.

I very much doubt it. My day job is writing symbolic-numeric code. The result of 0.1+0.1+0.1 != 0.3, but for rounding to bring it up to 0.31 (i.e. rounding causing an error of 1 cent), you would need to accumulate at least .005 error, which will not happen unless you lose 13 out of your 16 digits of precision, which will not happen unless you do something incredibly stupid.

nulld3v · 1d ago

I'm curious where you got this idea from because it is trivially disprovable by typing 0.1 or 0.01 into any python or JS REPL?

chowells · 23h ago

Do you believe that the way the REPL prints a number is the way it's stored internally? If so, explaining this will be a fun exercise:

    $ python3
    Python 3.11.2 (main, Apr 28 2025, 14:11:48) [GCC 12.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> a = 0.1
    >>> a + a + a
    0.30000000000000004

By way of explanation, the algorithm used to render a floating point number to text used in most languages these days is to find the shortest string representation that will parse back to an identical bit pattern. This has the direct effect of causing a REPL to print what you typed in. (Well, within certain ranges of "reasonable" inputs.) But this doesn't mean that the language stores what you typed in - just an approximation of it.

nulld3v · 22h ago

:facepalm: my bad, I completely missed the more rational intepretation of OP's comment...

I interpreted "directly representable" as "uniquely representable", all < 15 digit decimals are uniquely represented in fp64 so it is always safe to roundtrip between those decimals <-> f64, though indeed this guarantee is lost once you perform any math.

anthk · 22h ago

Oddly, tcl prints 0.30000000000000004 while jimtcl prints 0.3, while with 1/7 both crap out and round it to a simple 0.

Edit: Now it does it fine after inputting floats:

puts [ expr { 1.0/7.0 } ]

Eforth on top of Subleq, a very small and dumb virtual machine:

     1 f 7 f f/ f.
     0.143 ok

Still, using rationals where possible (and mod operations otherwise) gives a great 'precision', except for irrationals.

krapht · 1d ago

https://docs.python.org/3/tutorial/floatingpoint.html

Stop at any finite number of bits, and you get an approximation. On most machines today, floats are approximated using a binary fraction with the numerator using the first 53 bits starting with the most significant bit and with the denominator as a power of two. In the case of 1/10, the binary fraction is 3602879701896397 / 2 * 55 which is close to but not exactly equal to the true value of 1/10.

Many users are not aware of the approximation because of the way values are displayed. Python only prints a decimal approximation to the true decimal value of the binary approximation stored by the machine. On most machines, if Python were to print the true decimal value of the binary approximation stored for 0.1, it would have to display:

  0.1
  0.1000000000000000055511151231257827021181583404541015625

That is more digits than most people find useful, so Python keeps the number of digits manageable by displaying a rounded value instead:

  1 / 10
  0.1

That being said, double should be fine unless you're aggregating trillions of low cost transactions. (API calls?)

Izkata · 23h ago

For anyone curious about testing it themselves and/or wanting to try other numbers:

  >>> from decimal import Decimal
  >>> Decimal(0.1)
  Decimal('0.1000000000000000055511151231257827021181583404541015625')

CamperBob2 · 22h ago

At the end of a long chain of calculations you're going to round to the nearest 0.01. It will be a LONG time before errors caused by double-precision floats cause you to gain or lose a penny.

SonOfLilit · 1d ago

You can make money modeling buy/sell decisions in floats and then having the bank execute them, but if the bank models your account as a float and loses a cent here and there, it will be sued into bankruptcy.

jcranmer · 1d ago

A double-precision float has ~16 decimal digits of precision. Which means as long as your bank account is less than a quadrillion dollars, it can accurately store the balance to the nearest cent.

kccqzy · 1d ago

You will not lose a cent here and there just by using float64, for the range of values that banks deal with. For added assurance, just round to the nearest cent after each operation.

kbolino · 12h ago

You can round to the nearest .0078125 (1/128) but not to the nearest .01 (1/100) because that number cannot be represented in float64.

To round to the nearest cent, you would need to make cents your units (i.e. the quantity "1 dollar" would be represented as 100 instead of 1.0).

nurettin · 11h ago

Within the general context of this discussion, "cannot be represented" is a red herring.

You don't need to have a representation of the exact number 0.1 if you can tolerate errors after the 7th decimal (and it turns out you can). And 0.1+0.1+0.1 does not have to be comparable with 0.3 using operator==. You have an is_close function for that. And accumulation is not an issue because you have rounding and fma for that.

kbolino · 3h ago

Ok, but in the context of this thread, it is important. Remember that I'm replying to the statement "For added assurance, just round to the nearest cent after each operation." That is misleading advice and the behavior of floats is just context for why.

First of all, a lot of languages don't include arbitrary rounding in their math libraries at all, only having rounding to integers. Second, in the docs of Python, which does have arbitrary rounding, it specifically says:

    Note: The behavior of round() for floats can be surprising: for example, round(2.675, 2) gives 2.67 instead of the expected 2.68. [...]

Thus I think what I said stands: you cannot round to nearest cent reliably all the time, assuming cent means 0.01. The only rounding you can sort of trust is display rounding because it actually happens after converting to base 10. It's why 2.675 will print as 2.675 in Python even though it won't round as you'd expect. But you'd only do that once at the end of a chain of operations.

In a lot of cases, errors like these don't matter, but the key point is that if the errors don't matter, then they don't need to be "assured" away by dubious rounding either.

nurettin · 1h ago

floats are also a red herring. Maybe you can continue with someone else.

rcleveng · 1d ago

Wrappers are good even when non dealing with multiple currencies since in many places some transactions are in fractions of cents, so depending on the usecase may need to push that decimal a few places out.

I always have a wrapper class to put the logic of converting to whole currency units when and if needed, as well as when requirements change and now you need 4 digits past the decimal instead of 2, etc.

pie_flavor · 1d ago

One of the things I always appreciate about the crypto community is that you do not have to ask what numeric type is being used for money, it is always 8-digit fixed-point. No floating-point rounding errors to be found anywhere.

Athas · 22h ago

How does this avoid rounding error? Division and multiplication and still result in nonrepresentable numbers, right?

immibis · 1d ago

Correction: Bitcoin is 8-digit fixed-point. But Lightning is 10, IIRC. Other currencies have different conventions. Still, it's fixed within a given system and always fixed-point. As far as I'm aware, there are no floating-point cryptocurrencies at all, because it would be an obvious exploit vector - keep withdrawing 0.000000001 units from your account that has 1.0 units.

knert · 1d ago

How do you store negative numbers?

psychoslave · 1d ago

Maybe as in accounting, one column for benefits, one for debts?

MobiusHorizons · 1d ago

You use a signed integer type, so you just store a negative number.

You can think of fixed point as equivalent to ieee754 floats with a fixed exponent and a two’s complement mantissa instead of a sign bit.

sholladay · 1d ago

Correctness > performance, almost always. It’s easier to notice that you need more performance than to notice that you need more correctness. Though performance outliers can definitely be a hidden problem that will bite you.

Make it work. Make it right. Make it fast.

mg794613 · 22h ago

Haha, the neverending cycle.

Stop trying. Let their story unfold. Let the pain commence.

Wait 30 years and see them being frustrated trying to tell the next generation.

rlpb · 1d ago

> I mean, the whole point of fast-math is trading off speed with correctness. If fast-math was to give always the correct results, it wouldn’t be fast-math, it would be the standard way of doing math.

A similar warning applies to -O3. If an optimization in -O3 were to reliably always give better results, it wouldn't be in -O3; it'd be in -O2. So blindly compiling with -O3 also doesn't seem like a great idea.

CamouflagedKiwi · 1d ago

The optimisations in -O3 aren't supposed to give incorrect results. They're not in -O2 because they make a more aggressive space/speed tradeoff or increase compile times more significantly. In the same way, the optimisations in -O2 are not meant to be less correct than -O1, but they aren't in that group for similar reasons.

-Ofast is the 'dangerous' one. (It includes -ffast-math).

rlpb · 1d ago

> The optimisations in -O3 aren't supposed to give incorrect results.

I didn't mean to imply that they result in incorrect results.

> they make a more aggressive space/speed tradeoff...

Right...so "better" becomes subjective, depends on the use case, so it doesn't make sense to choose -O3 blindly unless you understand the trade-offs and want that side of them for the particular builds you're doing. Things that everyone wants would be in -O2. That's all I'm saying.

eqvinox · 1d ago

It doesn't become subjective; things in -O3 can objectively be understood to produce equal or faster code for a higher build cost in the vast majority of cases, roughly averaged across platforms. (Without loss in correctness.)

If you know your exact target and details about your input expectations, of course you can optimize further, which might involve turning off some things in -O3 (or even -O2). On a whole bunch of systems, -Os can be faster than -O3 due to I-cache size limits. But at-large, you can expect -O3 to be faster.

Similar considerations apply for LTO and PGO. LTO is commonly default for release builds these days, it just costs a whole lot of compile time. PGO is done when possible (i.e. known majority inputs).

CamouflagedKiwi · 18h ago

If they're things that everyone wants, why aren't they in -O1?

wffurr · 1d ago

If the answer can be wrong, you can make it as fast as you want.

zzo38computer · 17h ago

It depend what kind of wrong answers are acceptable (and in what circumstances).

Understanding Inequality, Part I – Paul Krugman (paulkrugman.substack.com)

Show HN: 7% more compression, 34% faster than ZSTD with CXcompress+ZSTD (github.com)

Type of Fiber Could Have Weight Loss Benefits Similar to Ozempic (sciencealert.com)

The Coming Age of Wisdom Work (every.to)

Logical Fallacies (owl.purdue.edu)

The AI Question: How Do We Build Systems That Require Less Code? (alonso.network)

Solar Storms Are Pushing Elon Musk's Satellites Back to Earth (gizmodo.com)

Real Estate AI App looking for feedback

Print CHR$(205.5+RND(1));: Goto 10 (10print.org)

Show HN: Mijick Pockets – All project resources in one place (mijick.com)

Builder.ai Faked Business with Indian Firm VerSe to Inflate Sales (bloomberg.com)

How Dot Files Became Hidden Files (glenda.0x46.net)

Electronics from Peanut Shells (physics.aps.org)

Boxtype – A Puzzle Game (inconvergent.net)

Arcade Rhythm Games (cadence.moe)

Hacked (sheep.horse)

Darwin Gödel Machine (gonzoml.substack.com)

The Year Without Summer (branchcollective.org)

Moving Forth (bradrodriguez.com)

What a CEO Does (2010) (avc.com)

EIDAS 2: the countdown to a single European Digital ID Wallet has begun (thalesgroup.com)

The hottest new vibe coding startup may be a sitting duck for hackers (semafor.com)

AnyCrawl.dev – High-Performance Web Scraping Platform Built for the AI Era (anycrawl.dev)

Immediate Mode Option Parser: Small, Simple, Elegant (nrk.neocities.org)

The U.S. Deported This Chinese Scientist, in a Decision That Changed Worl (Cont) (nytimes.com)

Taylor Swift has regained control of her music, buys back first 6 albums (apnews.com)

Wind energy more expensive, less stable and less secure (brusselssignal.eu)

What We Lost with PHP and jQuery (idiallo.com)

The Theory of Everything That Nobody Talks About [video] (youtube.com)

Duolingo grapples with its 'AI-first' promise before an angry social mob (thenewstack.io)

Show HN: LLM in Godot 4 (youtube.com)

How does GEO (SEO for AI) work? (apimagic.ai)

European Lisp Symposium (zenodo.org)

Why 90% of great products fail at customer discovery (smarketly.lema-lema.com)

Can We Afford Large-Scale Solar PV? – By Brian Potter (construction-physics.com)

Petabyte-Class E2 SSDs Poised to Disrupt Warm Data Storage – Storagereview.com (storagereview.com)

Security Through Observability. Lightweight Agent, Powered by AI (sentrilite.com)

Intel: Stumbling in the Spotlight (abortretry.fail)

Scapple (literatureandlatte.com)

People Spend Too Much Time on Decisions with Equally Satisfying Outcomes (robkhenderson.com)

LICEcap (cockos.com)

Ukraine destroys more than 40 military aircraft in drone attack deep in Russia (npr.org)

The White House Vision for Dismantling Science (joshuasweitz.substack.com)

Equivariance is dead, long live equivariance? (chaitjo.substack.com)

Show HN: TestPanel, AI studying app for adult learners (testpanel.ai)

LLM Visualization (bbycroft.net)

Show HN: Reactylon – React framework to build 3D/XR experiences (github.com)

How can you find unused functions in Python code? (stackoverflow.com)

DoorDash CEO Xu is taking on the role of industry consolidator in food delivery (cnbc.com)

The Big Ugly Old Thing (rsx11.blogspot.com)

Beware of Fast-Math

Comments (212)