Python: Floating point approximations

A common programming fallacy taught to beginning students is to use floating point variables when they are not appropriate.

An exact mathematical solution involving real numbers (which are not really real, they are assumed to be real) is only an approximation.

A floating point number (that is, one used to represent a real number) generally carries an inherent rounding error.

More: Exact math approximates reality

Note: We are here ignoring discrete mathematics that involve only integers.

A slide rule (not slide ruler) was once used to do manual computations.

Whenever a slide rule is used, it is very evident that any computation involving real numbers is an approximation.

What are the values of the following (using a computer program)?

1/10 1/10 + 1/10 1/10 + 1/10 + 1/10 1/10 + 1/10 + 1/10 + 1/10 1/10 + 1/10 + 1/10 + 1/10 + 1/10 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10 + 1/10

Note: These are rational numbers, not irrational like √2 or transcendental numbers like π or e.

Here is the mathematical result expressed as real (rational) numbers.

 1/10 = 0.1
 2/10 = 0.2
 3/10 = 0.3
 4/10 = 0.4
 5/10 = 0.5
 6/10 = 0.6
 7/10 = 0.7
 8/10 = 0.8
 9/10 = 0.9
10/10 = 1.0

Here is a simple Lua program to add values of 1/10 as 0.1.

sum1 = 0.0
for count1=1,10 do
	sum1 = sum1 + 0.1
	print(string.format("%2d/10 = %1.17f",count1,sum1))
	end

The 17 decimal places in the output are important: with fewer digits, rounding during output hides some or all of the error and the values can look exact.

Here is the result from a computation point of view as the output of the above program.

 1/10 = 0.10000000000000001
 2/10 = 0.20000000000000001
 3/10 = 0.30000000000000004
 4/10 = 0.40000000000000002
 5/10 = 0.50000000000000000
 6/10 = 0.59999999999999998
 7/10 = 0.69999999999999996
 8/10 = 0.79999999999999993
 9/10 = 0.89999999999999991
10/10 = 0.99999999999999989

Even when symbolic math can solve a problem, any attempt to compute real answers involves the same approximations.

The difference is small at each step but the difference can add up to a significant amount over many iterations.
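The accumulation can be made visible with a longer loop. The following is a minimal Python sketch, not part of the original examples: the iteration count of one million is an arbitrary choice, and math.fsum from the standard library is used only as a point of comparison because it compensates for the lost low-order bits.

```python
import math

# Add 0.1 one million times; the exact mathematical answer is 100000.
n = 1_000_000
total = 0.0
for _ in range(n):
    total += 0.1

# math.fsum tracks the rounding lost at each step and returns
# a correctly rounded sum of the same stored values.
careful = math.fsum([0.1] * n)

print("naive sum:", repr(total))
print("fsum     :", repr(careful))
print("drift    :", abs(total - 100000.0))
```

The naive sum drifts away from 100000 while the compensated sum does not, even though both add exactly the same stored values.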

Note: One can fix this specific instance by using a fixed decimal (base 10) notation, but that only works for fractions whose denominators have no prime factors other than 2 and 5, the prime factors of 10. A value such as 1/3 still has to be rounded.
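The note above can be checked with a decimal number type. This is a small sketch using Python's standard decimal module with its default precision; the choice of 1/3 as the counterexample is an assumption made for illustration.

```python
from decimal import Decimal

# Ten additions of decimal 0.1 are exact in base 10.
tenth_sum = sum([Decimal("0.1")] * 10, Decimal("0"))
print(tenth_sum == Decimal("1.0"))   # True

# One third has a factor of 3 in its denominator, so it is rounded
# even in base 10, and multiplying back does not give exactly 1.
third = Decimal(1) / Decimal(3)
print(third)
print(third * 3 == Decimal(1))       # False
```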

Rounding errors need to be addressed in fields of computer science such as numerical analysis.

James Gleick (American author and historian of science) has written a very interesting book on the field known as "chaos theory", the study of systems with a sensitive dependence on initial conditions. The field was accidentally discovered by the young French mathematician Henri Poincaré while he was attempting to find an exact mathematical solution for the three body problem.

Gleick, J. (1988). Chaos: Making a new science. New York: Penguin Books.

More: James Gleick

Exact mathematics can solve the (idealized) two body problem such as the sun and earth or the earth and moon.

An exact solution for the (idealized) three body problem such as the sun, earth and moon has not been found.

Accurate weather prediction requires solving the (idealized) almost infinite particle system.

More: Two and three body problem

Quantum computing:

Dunbar number

much faster than conventional computers
best for problems that allow probabilistic solutions
cannot solve all problems - despite the hype

Note: exponential speedup is not clearly defined, like entanglement.

Quantum computing analogies: pick the best way

walk on foot: pencil and paper
drive by car : conventional computer (go most anywhere)
fly by jet : quantum computers (do not go everywhere, sometimes impractical)
no way to get to Mars, nearest star, etc.

More: Quantum computing

Fallacy: Dollars and cents should be represented using floating point variables.
The general rule, sometimes taught, that if the number has a decimal point, it should be represented as a floating point variable does not hold for dollars and cents.

The amount $123.56 is not a floating point value. It is an integer value, namely 12356 cents (12356¢).

Rule: Convert dollars and cents to cents, do the arithmetic, then convert back into dollars and cents (for display purposes). This distinction needs to be carefully handled in languages such as JavaScript (and Lua before version 5.3), whose ordinary numeric type is floating point rather than integer.

The approximation issue involves floating point division so every such division needs to result in an integer value.

Fallacy: Social security numbers and phone numbers should be represented using integers.

How should social security numbers and phone numbers be represented? They appear to be numbers. That is, integers.

SSN: 999-99-9999 (1 billion possible values; the leading digits were originally assigned by geographic area)

Ask yourself the following question.

Are you ever going to add, subtract, multiply or divide these values?

If not, represent them using text. Note:

Leading zeros are lost using integers.
If an ordering is present, care must be taken when sorting lists of such values.
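Both points in the list above can be seen in a few lines of Python. The identifier values below are made up for illustration and are not real records.

```python
# Hypothetical five-digit identifiers stored as text.
ids_as_text = ["02134", "10001", "00501"]

# Converting to integers drops the leading zeros.
ids_as_int = [int(s) for s in ids_as_text]
print(ids_as_int)                          # [2134, 10001, 501]

# Restoring them requires knowing the intended width.
print([f"{n:05d}" for n in ids_as_int])    # ['02134', '10001', '00501']

# Text sorts character by character, so values of different
# lengths do not come out in numeric order unless a key is given.
print(sorted(["9", "10", "2"]))            # ['10', '2', '9']
print(sorted(["9", "10", "2"], key=int))   # ['2', '9', '10']
```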

A floating point representation approximates the mathematical real numbers.

In Lua, a floating point approximation is called a float number. There are two primary ways to represent real number literal approximations.

A decimal notation, with an integer and fractional part separated by a decimal point, such as 1234.567
An exponential notation, also with an integer and fractional part, but followed by a scale factor (representing the power of 10). For the scientific (mathematical) notation 1.234567 x 10², the exponential notation (real approximation) is 1.234567E+02.
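The two literal notations look the same in most languages. As a short Python sketch (the variable names are chosen here for illustration only):

```python
decimal_form = 1234.567
exponential_form = 1.234567E+02    # 1.234567 x 10^2

print(decimal_form)                # 1234.567
print(exponential_form)            # 123.4567

# The E format specifier writes a value back out in exponential notation.
print(f"{decimal_form:E}")         # 1.234567E+03
```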

It is good practice to always include a decimal point when expressing a real number approximation and to write nonempty integer and fractional parts. So, write 1.0 instead of 1. or 1, and write 0.1 instead of .1.

Floating point numbers are only an approximation to the mathematical real numbers. Consider the real (rational) number 2/3. In base ten, this is represented as

0.666666...

Note: The mathematical repetend notation is a finite representation of an infinite object. At some point, a computer (unless it is directly representing rational numbers as numerator and denominator) must round the stored value. This introduces a small roundoff, or truncation, error.

Someone once said that real numbers are a lot like sand piles. Every time you move one, you lose a little sand and you pick up a little dirt.

So on most computers, when you write 0.1, you will not get an exact representation of the mathematical 1/10, but something like

0.1000000000000000055511151231...

This is because computers, for efficiency, usually represent numbers in base two, and 1/10 cannot be exactly represented without roundoff error.
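The value that is actually stored for 0.1 can be inspected directly. This sketch assumes the usual 64-bit binary double, which is what Python's float uses on common platforms, and relies on the standard decimal and fractions modules to convert the stored value without further rounding.

```python
from decimal import Decimal
from fractions import Fraction

stored = 0.1

# Decimal(float) converts the stored binary value exactly, digit for digit.
print(Decimal(stored))

# Fraction(float) shows the same value as a ratio of integers;
# the denominator is a power of two.
exact = Fraction(stored)
print(exact)
print(exact == Fraction(1, 10))   # False: the stored value is not 1/10
print(exact > Fraction(1, 10))    # True: it is slightly larger
```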

It is good programming practice not to mix integer and real arithmetic when writing arithmetic expressions. In all such expressions, make explicit conversions.

When should you use integers and when should you use reals? When something can be counted, you should represent that value with an integer. People can be counted; we do not speak of 0.5 of a person.

Whenever something cannot be counted, but can be measured, you should represent that value with a real number approximation. It is not reasonable to count grains of sand or molecules of water, so sand piles and water should be measured (approximated as a real number). As soon as the water is put into gallon containers to be sold, the containers containing the measured water can be counted.

Statistical results are measures that are used for approximation and making decisions, and should be approximated by real numbers.

How would you represent the following - count or measure?

amount of dirt in a dump-truck
number of loads of dirt removed each day
dollars and cents of the federal budget
size of the average family

The following are standard arithmetic operators for real number approximations.

Addition using binary infix arithmetic operator +
Subtraction using binary infix arithmetic operator -
Multiplication using binary infix arithmetic operator *
Division using binary infix arithmetic operator /

These operations work in the same manner as integer arithmetic, except that real division is not the same as integer quotient and remainder (real division includes the decimal point and fractional part; the remainder is not defined).
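The difference between real division and integer quotient and remainder can be sketched in Python, which uses separate operators for each. The operand values are arbitrary.

```python
a, b = 7, 2

print(a / b)          # 3.5     real (floating point) division
print(a // b)         # 3       integer quotient
print(a % b)          # 1       integer remainder
print(divmod(a, b))   # (3, 1)  quotient and remainder together

print(7.0 / 2.0)      # 3.5     the same real division on float operands
```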

When many languages added a floating point approximation data type, it was, as in C, called float, for floating point approximation.

Later, additional precision was added. Since it was a double precision floating point approximation, the data type was, as in C, called double.

Never use a float data type unless you have a compelling reason.

Whenever working with dollars and cents, never use a double. When forced to use a double (as in JavaScript) be very careful.

In general, the following approach can be used.

Input: Get dollars and cents and convert everything to cents.
Process: Do all processing in terms of cents, not dollars and cents.
Output: Convert cents to dollars and cents.
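The three steps above can be sketched as follows. The helper names to_cents and to_dollars are hypothetical, and the sketch assumes non-negative amounts written with a decimal point and at most two digits of cents; a production version would also need to validate input and handle negative values.

```python
def to_cents(text):
    """Input: convert a string such as '123.56' to an integer number of cents."""
    dollars, _, cents = text.partition(".")
    cents = (cents + "00")[:2]          # pad or cut to exactly two digits
    return int(dollars) * 100 + int(cents)

def to_dollars(cents):
    """Output: format an integer number of cents as dollars and cents."""
    dollars, rest = divmod(cents, 100)
    return f"${dollars}.{rest:02d}"

# Process: all arithmetic is done on integers (cents).
prices = ["123.56", "0.10", "0.20"]
total = sum(to_cents(p) for p in prices)

print(total)              # 12386
print(to_dollars(total))  # $123.86
```

The input is parsed as text rather than through a floating point value, so no rounding error enters at any step.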

Never compare two double precision floating point approximations for equality or inequality.

If you are using a double as an integer, this can work. But in general, such comparisons can cause undesired effects.
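A short Python sketch of both cases, assuming 64-bit binary doubles: equality fails for fractions that are not exact in base two, and works for integer-valued doubles only up to 2 to the power 53.

```python
# Fractions that are not exact in base two: equality fails.
print(0.1 + 0.2 == 0.3)    # False

# Small integer-valued doubles are exact, so equality works.
print(3.0 + 4.0 == 7.0)    # True

# Beyond 2**53, consecutive integers can no longer be told apart.
big = 2.0 ** 53
print(big + 1.0 == big)    # True
```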

Example: Using assert for equality of double values in, say, a CS 101 programming class using C.

Mathematics can solve the 2-body problem exactly.
Mathematics cannot solve the 3-body problem (source of chaos theory)

Exact math solutions are only an approximation of reality - due to floating point approximations. That is, finite approximations of (potentially) infinite objects.

Here is the Python code [#1]

i1 = 0
n1 = 10
d1 = 0.0
while i1 != n1:
	d1 += 0.1
	i1 += 1
	print("i1={0:d} d1=[{1:30.28f}]".format(i1,d1))

Here is the output of the Python code.

i1=1 d1=[0.1000000000000000055511151231]
i1=2 d1=[0.2000000000000000111022302463]
i1=3 d1=[0.3000000000000000444089209850]
i1=4 d1=[0.4000000000000000222044604925]
i1=5 d1=[0.5000000000000000000000000000]
i1=6 d1=[0.5999999999999999777955395075]
i1=7 d1=[0.6999999999999999555910790150]
i1=8 d1=[0.7999999999999999333866185225]
i1=9 d1=[0.8999999999999999111821580300]
i1=10 d1=[0.9999999999999998889776975375]

Here is the Java code [#1]

public class _01 {
	public _01() {
		int i1 = 0;
		int n1 = 10;
		double d1 = 0.0;
		while (i1 != n1) {
			d1 += 0.1;
			i1++;
			System.out.print(String.format("i1=%d d1=[%30.28f]\n",i1,d1));
			}
		}
	public static void main(String [] args) {
		new _01();
		}
	}

Here is the output of the Java code.

i1=1 d1=[0.1000000000000000000000000000]
i1=2 d1=[0.2000000000000000000000000000]
i1=3 d1=[0.3000000000000000400000000000]
i1=4 d1=[0.4000000000000000000000000000]
i1=5 d1=[0.5000000000000000000000000000]
i1=6 d1=[0.6000000000000000000000000000]
i1=7 d1=[0.7000000000000000000000000000]
i1=8 d1=[0.7999999999999999000000000000]
i1=9 d1=[0.8999999999999999000000000000]
i1=10 d1=[0.9999999999999999000000000000]

Here is the C code [#1]

#include <stdio.h>
int main() {
	int i1 = 0;
	int n1 = 10;
	double d1 = 0.0;
	while (i1 != n1) {
		d1 += 0.1;
		i1++;
		printf("i1=%d d1=[%30.28lf]\n",i1,d1);
		}
	return 0;
	}

Here is the output of the C code.

i1=1 d1=[0.1000000000000000055511151231]
i1=2 d1=[0.2000000000000000111022302463]
i1=3 d1=[0.3000000000000000444089209850]
i1=4 d1=[0.4000000000000000222044604925]
i1=5 d1=[0.5000000000000000000000000000]
i1=6 d1=[0.5999999999999999777955395075]
i1=7 d1=[0.6999999999999999555910790150]
i1=8 d1=[0.7999999999999999333866185225]
i1=9 d1=[0.8999999999999999111821580300]
i1=10 d1=[0.9999999999999998889776975375]

Here is the C# code [#1]

using System;
using System.IO;
namespace _01 {
	class _01 {
		public _01() {
			int i1 = 0;
			int n1 = 10;
			double d1 = 0.0;
			while (i1 != n1) {
				d1 += 0.1;
				i1++;
				Console.WriteLine(String.Format("i1={0:d} d1=[{1:0.0000000000000000000000000000}]",i1,d1));
				}
			}
		static void Main(string[] args) {
			new _01();
			}
		}
	}

Here is the output of the C# code.

i1=1 d1=[0.1000000000000000000000000000]
i1=2 d1=[0.2000000000000000000000000000]
i1=3 d1=[0.3000000000000000000000000000]
i1=4 d1=[0.4000000000000000000000000000]
i1=5 d1=[0.5000000000000000000000000000]
i1=6 d1=[0.6000000000000000000000000000]
i1=7 d1=[0.7000000000000000000000000000]
i1=8 d1=[0.8000000000000000000000000000]
i1=9 d1=[0.9000000000000000000000000000]
i1=10 d1=[1.0000000000000000000000000000]

Here is the Go code [#1]

package main
import "fmt"
func main() {
	var i1 int = 0
	var n1 int = 10
	var d1 float64 = 0.0
	for i1 != n1 {
		d1 += 0.1
		i1 += 1
		fmt.Printf("i1=%d d1=[%30.28f]\n", i1, d1)
		}
	}

Here is the output of the Go code.

Here is the Lua code [#2]

i1 = 0
n1 = 10
d1 = 0.0
while i1 ~= n1 do
	d1 = d1 + 0.1
	i1 = i1 + 1
	print(string.format("i1=%d d1=[%30.28f]",i1,d1))
	end

Here is the output of the Lua code.

i1=1 d1=[0.1000000000000000100000000000]
i1=2 d1=[0.2000000000000000100000000000]
i1=3 d1=[0.3000000000000000400000000000]
i1=4 d1=[0.4000000000000000200000000000]
i1=5 d1=[0.5000000000000000000000000000]
i1=6 d1=[0.5999999999999999800000000000]
i1=7 d1=[0.6999999999999999600000000000]
i1=8 d1=[0.7999999999999999300000000000]
i1=9 d1=[0.8999999999999999100000000000]
i1=10 d1=[0.9999999999999998900000000000]

Here is the PHP code [#1]

<?php
$i1 = 0;
$n1 = 10;
$d1 = 0.0;
while ($i1 !== $n1) {
	$d1 += 0.1;
	$i1++;
	echo(sprintf("i1=%d d1=[%30.28f]\n",$i1,$d1));
	}
?>

Here is the output of the PHP code.

i1=1 d1=[0.1000000000000000055511151231]
i1=2 d1=[0.2000000000000000111022302463]
i1=3 d1=[0.3000000000000000444089209850]
i1=4 d1=[0.4000000000000000222044604925]
i1=5 d1=[0.5000000000000000000000000000]
i1=6 d1=[0.5999999999999999777955395075]
i1=7 d1=[0.6999999999999999555910790150]
i1=8 d1=[0.7999999999999999333866185225]
i1=9 d1=[0.8999999999999999111821580300]
i1=10 d1=[0.9999999999999998889776975375]

Here is the R code [#1]

i1 <- 0
n1 <- 10
d1 <- 0.0
while (i1 != n1) {
	d1 <- d1 + 0.1
	i1 <- i1 + 1
	cat(sprintf("i1=%d d1=[%30.28f]\n",i1,d1))
	}

Here is the output of the R code.

i1=1 d1=[0.1000000000000000055511151231]
i1=2 d1=[0.2000000000000000111022302463]
i1=3 d1=[0.3000000000000000444089209850]
i1=4 d1=[0.4000000000000000222044604925]
i1=5 d1=[0.5000000000000000000000000000]
i1=6 d1=[0.5999999999999999777955395075]
i1=7 d1=[0.6999999999999999555910790150]
i1=8 d1=[0.7999999999999999333866185225]
i1=9 d1=[0.8999999999999999111821580300]
i1=10 d1=[0.9999999999999998889776975375]

Macro code notation
Language specification (20+ languages)
Formatting process - takes input from document, compiles and runs code, gets output and errors, puts specified text into document

Be very careful about making any assertions about floating point numbers when division is involved. Do NOT use floating point for dollars and cents! Use integer arithmetic (in cents), converting dollars and cents as needed.

Care is needed in JavaScript, Lua, etc., where the ordinary numeric data type is floating point (Lua added an integer subtype in version 5.3). In practice, to compare double values for equality or inequality, one picks a very small tolerance: if the difference between the two values is within this tolerance, the values are considered equal (or unequal if it is outside).
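A tolerance comparison can be sketched in Python as follows. The tolerance of 1e-9 is an assumption chosen for illustration; an appropriate value depends on the magnitude of the numbers and on how much error the computation can accumulate. math.isclose is the standard library function for this kind of test.

```python
import math

# Repeat the running sum from the examples above: ten additions of 0.1.
d1 = 0.0
for _ in range(10):
    d1 += 0.1

print(d1 == 1.0)                              # False
print(abs(d1 - 1.0) < 1e-9)                   # True: absolute tolerance
print(math.isclose(d1, 1.0, rel_tol=1e-9))    # True: relative tolerance
```

A relative tolerance scales with the size of the values being compared, which usually makes it a safer default than a fixed absolute one, except when comparing against zero.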