We're only basically right.

Posted by J. Robert Michael PhD on Fri 15 November 2019 Updated on Fri 15 November 2019

Sometimes basically is ok. Sometimes basically needs more explanation.

"pi is 3.1415926535 ... basically"
"Gravity is -9.8 m/s^2 ... basically"
"I love you ... basically"
"The earth is round ... basically"
"You're going to live ... basically"

When we write scientific software we are always getting the right answers ... basically. The machine representation of an arbitrary real number is only approximate. Even worse, the answers we get when we apply simple arithmetic are also approximate.

1. Incorrect comparisons

The following Matlab code is a scary example of how computers don't work the way we think they do when comparing two numbers. 1.1 + 1.01 is obviously 2.11, but in a computer 1.1 + 1.01 is only basically 2.11:

if (1.1 + 1.01 > 2.11), disp('Help! 1.1 + 1.01 is greater than 2.11'), end

If you don't have Matlab, try Octave. If you don't have Octave, try C. If you don't have C, try Fortran. I haven't tested this in any other language, but I'm pretty sure the same thing will happen: your computer will cry out for help because it's getting the wrong answers!
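If you don't have any of those, here is a minimal Python sketch of the same comparison (Python is my substitution here, not one of the languages listed above). It also shows one common workaround, comparing with a tolerance; the tolerance value of 1e-9 is an assumption you would tune for your own problem.

```python
import math

total = 1.1 + 1.01

# The naive comparison: in double precision the sum lands just above 2.11.
naive_greater = total > 2.11
print("1.1 + 1.01 > 2.11 ?", naive_greater)

# Show the stored value with enough digits to see the discrepancy.
print("stored sum:", repr(total))

# A tolerance-based comparison treats the two values as equal.
# rel_tol=1e-9 is an illustrative choice, not a universal constant.
basically_equal = math.isclose(total, 2.11, rel_tol=1e-9)
print("basically equal?", basically_equal)
```

The point is not that `math.isclose` is always the right tool, only that "equal" has to mean "equal within some tolerance" once floating point is involved.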

2. A change of parentheses

Let's stay on the Matlab / Octave train here for a minute. Consider the following input:

>> (-1 + 1) + 1e-17
ans = 1.0000e-17

Brilliant - computers understand the basics of addition! This is a good thing since we use them on a daily basis, not to mention that computers are often used for things like managing the stock market and launching nukes. Can you imagine what might happen if computers didn't know how to perform simple arithmetic? Let's try one more example just so we feel completely secure that our computer knows how to add:

>> -1 + (1 + 1e-17)
ans = 0

Huh ... that was weird. Isn't addition associative?
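A short Python sketch of the same two sums (again, Python is my choice of language, not the original's) makes the culprit visible: 1e-17 is smaller than the gap between 1 and the next representable double, so 1 + 1e-17 is stored as exactly 1.

```python
import sys

left = (-1 + 1) + 1e-17    # the cancellation happens first, so 1e-17 survives
right = -1 + (1 + 1e-17)   # 1e-17 is absorbed into 1 before the cancellation

print(left, right)         # prints: 1e-17 0.0

# The inner sum is indistinguishable from 1 in double precision.
absorbed = (1 + 1e-17 == 1)
print("1 + 1e-17 == 1 ?", absorbed)

# Gap between 1.0 and the next larger double, for comparison with 1e-17.
print("spacing of doubles near 1.0:", sys.float_info.epsilon)
```

So addition of real numbers is associative, but addition of floating point numbers is only basically associative.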

3. A change of direction

If our computers don't know how to add 3 numbers correctly, what happens when we try to add more? The code for this example is a bit longer (14 lines to be exact), but it's worth looking at:

program compare
     implicit none
     real :: fsum = 0.0, bsum = 0.0   ! default real (typically single precision)
     integer :: i, n = 100000000
     do i = 1,n                       ! forward: largest terms first
         fsum = fsum + 1.0/real(i)
     end do
     do i = n,1,-1                    ! backward: smallest terms first
         bsum = bsum + 1.0/real(i)
     end do
     print*, 'fsum = ', fsum
     print*, 'bsum = ', bsum
     print*, 'diff = ', bsum - fsum
end program compare

If you aren't fluent in Fortran then you likely don't have a compiler handy and might not know what the code is doing. It's pretty simple, actually. fsum is simply 1/1 + 1/2 + 1/3 + ... + 1/n, and bsum is just 1/n + 1/(n-1) + 1/(n-2) + ... + 1/2 + 1/1. If you're only halfway reading this, you should note that those two sums are analytically identical.

On my local desktop, however, the output is as follows:

fsum = 15.4036827
bsum = 18.8079185
diff = 3.40423584

That's not a small difference. The error here is almost on the same order of magnitude as the answer! Where else could errors like this be hiding? What if this kind of thing snuck into some kind of missile defense system?
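If you don't have a Fortran compiler, the experiment can be approximated in Python. This is a sketch under a few assumptions: Python floats are double precision, so the helper `to_f32` (my own name, not from any library) rounds every intermediate result to single precision through the `struct` module, and n is cut down to 1,000,000 so the loop finishes in a few seconds. The numbers will therefore not match the output above, but the forward and backward sums should still disagree.

```python
import math
import struct

def to_f32(x):
    """Round a double-precision Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def harmonic_f32(indices):
    """Sum 1/i over the given indices, rounding to single precision after every step."""
    total = 0.0
    for i in indices:
        total = to_f32(total + to_f32(1.0 / i))
    return total

n = 1_000_000  # assumption: far smaller than the post's 100,000,000, for runtime

fsum = harmonic_f32(range(1, n + 1))   # 1/1 + 1/2 + ... + 1/n
bsum = harmonic_f32(range(n, 0, -1))   # 1/n + ... + 1/2 + 1/1

# Double-precision reference using Python's error-compensated summation.
reference = math.fsum(1.0 / i for i in range(1, n + 1))

print("fsum      =", fsum)
print("bsum      =", bsum)
print("reference =", reference)
print("diff      =", bsum - fsum)
```

Summing backward adds the tiny terms together while the running total is still small, so fewer of them get rounded away. That is why bsum ends up closer to the reference than fsum.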

What's happening?

Computers always compute correct floating point operations ... basically. There is a thing called machine epsilon, which basically says that every single arithmetic operation (+, -, *, /) can have a small relative rounding error (roughly 1e-16 in double precision, roughly 1e-7 in single precision) and still be considered "correct". The problem comes when we work with things that are only basically correct and treat them as if they were exactly correct (example 1), when the order of operations decides which small terms get rounded away (example 2), or when we do millions of things which are only basically correct and the large number of small compromises adds up (example 3).
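To make "a small amount of error" concrete, here is a hedged Python sketch that measures the rounding error of the single addition from example 1. It uses `fractions.Fraction` to get the exact rational value of each stored double, which is my choice of tool for the illustration; the bound of half of machine epsilon assumes IEEE 754 double precision with round-to-nearest, which is what mainstream hardware provides.

```python
import sys
from fractions import Fraction

eps = sys.float_info.epsilon   # gap between 1.0 and the next double (2**-52)
print("machine epsilon:", eps)

# Exact sum of the two stored inputs versus the stored result of the addition.
exact = Fraction(1.1) + Fraction(1.01)
computed = Fraction(1.1 + 1.01)

# Relative error introduced by this one addition.
rel_err = abs(computed - exact) / exact
print("relative error of one addition:", float(rel_err))

# With round-to-nearest, one operation is off by at most eps / 2 in relative terms.
within_bound = rel_err <= Fraction(eps) / 2
print("within eps/2 ?", within_bound)
```

The addition is wrong, but only by about one part in 10^16. One such error is harmless; a hundred million of them in a row, or one of them fed into a strict comparison, is where the trouble starts.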

What's next?

Data is growing exponentially. I always get annoyed by people using words like "exponentially" and "literally" incorrectly. I once heard a reporter on CNN say that the room was filling with people exponentially. No, it's not - if it were, those people would run out of oxygen in a matter of minutes.

With that being said, I believe that data is literally growing exponentially. With the massive amounts of incoming data (which will only get worse over time), these kinds of small errors and little compromises can really start to add up in unpredictable ways. This raises the question: Is your computational analysis correct or is it basically correct?

NOTE: I originally posted this Jan. 21st, 2016 on LinkedIn