C# Decimal and Java BigDecimal Solve Roundoff Problems


Roundoff problems have been the bane of programmers since computers started handling floating point numbers. Floating point numbers are represented by a finite number of bytes. This limits the precision of numbers that may be represented – you only get so many significant digits. More subtly, however, finite length floating point numbers limit which numbers may be represented accurately. Repeating decimals (which, in a sense, require an infinite number of significant digits) cannot be represented precisely. You may only store the nearest value representable by the computer. This is the origin of annoying roundoff problems found in floating point arithmetic.

The 4-byte float data type in C# and Java provides about 7 significant decimal digits of precision, while the 8-byte double type provides about 15. Using the higher-precision double type helps shrink the roundoff error, but it does not eliminate it.
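To make the limit concrete, here is a minimal sketch (a few statements dropped into a Main with the usual using System;). The integer 123,456,789 needs 9 significant digits, so a float cannot hold it exactly, while a double can:

float floatNine = 123456789f;    // nearest representable float is 123456792
double doubleNine = 123456789;   // a double holds this integer exactly
Console.WriteLine((double)floatNine);  // 123456792
Console.WriteLine(doubleNine);         // 123456789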

So, when I encountered the C# 16-byte decimal data type with its 28-29 significant digits, and Java’s arbitrary-precision BigDecimal data type, I figured these were simply ways of further minimizing the problem with a wider floating point type. Only after digging deeper did I realize that the implementation behind both of these data types is a stroke of genius.

The need for precision

In scientific applications, inexact representation is a non-issue. Whether you’re measuring the temperature on the Sun or the diameter of an atom, things like numeric magnitude matter more than accuracy out to the 17th decimal place. In graphical applications, you’re probably rounding off to the nearest pixel, anyway. In general, computers provide floating point precision that greatly exceeds any practical need.

You get into trouble, however, with financial applications. Even when you’re dealing with million dollar transactions, your debits and credits must match exactly. Off by a penny? You’ll drive the accountants nuts!

Consider this C# example. Below, pretend we’ve got a bank account with $100,000 and we withdraw $99,999.67. How much money is left? $0.33, right? Let’s see what C# calculates:

using System;

namespace FloatingPoint
{
    public class SubtractionRoundoff
    {
        static void Main()
        {
            float floatValue = 100000;
            double doubleValue = 100000;
            decimal decimalValue = 100000m;

            // Withdraw $99,999.67 from each balance.
            floatValue -= 99999.67f;
            doubleValue -= 99999.67;
            decimalValue -= 99999.67m;

            Console.WriteLine("float result = " + floatValue);
            Console.WriteLine("double result = " + doubleValue);
            Console.WriteLine("decimal result = " + decimalValue);
        }
    }
}

If you’ve ever written floating point code, you can probably predict the first two lines:

float result = 0.328125
double result = 0.330000000001746
decimal result = 0.33

Both the float and double calculations show roundoff error, even though both perform the arithmetic within the limits of their significant digits. The integer value 100,000 can be represented precisely, but neither float nor double can precisely represent the fractional part of 99,999.67. Storing “.67” would require an infinitely repeating series of binary digits, so C# and Java must cut that series off at the width of the data type. This representational error then surfaces after the subtraction.
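You can see the damage before the subtraction even runs. Printing the double with the round-trip "G17" format shows the value that is actually stored; the digits below are what a typical .NET runtime prints and may vary slightly (a minimal sketch, assuming the same Main as above):

double stored = 99999.67;
Console.WriteLine(stored.ToString("G17"));  // prints something like 99999.669999999998

The stored value is a hair under 99,999.67, and that missing sliver is exactly what surfaces as 0.330000000001746 after the subtraction.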

But how come the decimal type nailed it exactly? Shouldn’t decimal still have some error, albeit of far smaller magnitude than the double’s? Java’s BigDecimal will hit it on the nose, too. The reason is a thing of beauty.

How floating point numbers are represented

The gory details of floating point representation are a bit much to cover here, especially when they have already seen a great discussion in Bill Venners’s article Floating-point Arithmetic. Although the article is Java-specific, the basic details are completely valid for C#. It’s a must-read if words like “mantissa” and “radix” are unfamiliar to you.

Floating point numbers are essentially represented as a fixed-length integer (the “mantissa”) combined with a scale that shifts the decimal point left or right (can we still call it a “decimal” point when we’re dealing in base 2?). This shifting is determined by a radix (the “base” of the representation) raised to an exponent.

float and double both use a radix of 2. Hence, the number 3.125 could be represented with an integral mantissa of 25 and an exponent of -3:

25 × 2⁻³ = 3.125

Okay, remember this tidbit from high school? How can you tell if the decimal representation of a fraction will produce a repeating decimal? If the denominator (in lowest terms) has any factors other than 2 or 5, the decimal representation will repeat infinitely. So, fractions like 1/2, 2/5, and 3/10 may be represented exactly as 0.5, 0.4, and 0.3. But 1/3 is an infinite series of 3s (0.33333…).

In binary, a fraction’s representation repeats infinitely if its denominator has any factor other than 2. So 1/2 and 3/4 are fine, but not 1/10 (the number 10 includes 5 as a factor). That means the tidy, terminating decimal 0.1 cannot be represented precisely in binary floating point.
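Writing 1/10 out in base 2 makes the problem visible:

0.1 (base 10) = 0.000110011001100110011… (base 2)

The block “0011” repeats forever, so any fixed number of bits has to cut it off somewhere.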

In a typical application, we enter numbers in human usable form – base 10. But, the computer makes an inexact conversion to base 2, performs some arithmetic on the inexact numbers, and then converts the result back to base 10. No wonder roundoff problems occur!
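A tiny example (again assuming the usual using System; and a Main) shows the round trip going wrong for double:

double sum = 0.1 + 0.2;        // both operands are already inexact in base 2
Console.WriteLine(sum == 0.3); // False: the error survives the trip back to base 10
Console.WriteLine(sum);        // newer .NET runtimes print 0.30000000000000004; older ones round the display to 0.3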

How decimal and BigDecimal are different

Don’t fall asleep on me now, I’m working up to the grand finale!

Floating point numbers have traditionally been represented in base 2 because that’s a natural representation for computers. The arithmetic is fast and efficient.

The genius of decimal and BigDecimal is that they both use base 10 internally: each value is stored as an integer mantissa scaled by a power of ten, and the arithmetic is carried out on that base-10 representation. When you enter 0.1 in an application, you get precisely 0.1. These floating point types understand numbers the same way you do, so there’s no inexact internal representation. You can add and subtract financial values all day and not lose a thing, with no need to keep rounding results back to the nearest penny.
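In C# you can peek at that representation directly. This is a minimal sketch relying on decimal.GetBits, which returns the 96-bit integer mantissa in the first three ints and packs the power-of-ten scale into bits 16-23 of the fourth:

// 0.33m is stored as the integer 33 with a scale of 2, i.e. 33 * 10^-2
int[] bits = decimal.GetBits(0.33m);
Console.WriteLine(bits[0]);                // 33 (low 32 bits of the mantissa)
Console.WriteLine((bits[3] >> 16) & 0xFF); // 2  (the power-of-ten scale)

Java’s BigDecimal exposes the same idea through its unscaledValue() and scale() methods.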

What about performance? The answer here is predictable: base 10 arithmetic is slow – very slow. It has to be done in software, whereas base 2 floating point arithmetic can be offloaded to a specialized floating point processor. For hard performance comparison numbers, check out Wesner Moise’s Decimal Performance. But, for most financial apps, the performance hit is a non-issue – especially when compared against the cost of programmers trying to correct roundoff errors manually.
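If you want a rough feel for the gap on your own machine, a crude Stopwatch loop like the sketch below will do (the class name and loop count are arbitrary, and the timings vary widely by hardware and runtime, so treat Moise’s article as the careful measurement):

using System;
using System.Diagnostics;

public class DecimalTiming
{
    static void Main()
    {
        Stopwatch watch = Stopwatch.StartNew();
        double doubleTotal = 0;
        for (int i = 0; i < 10000000; i++) doubleTotal += 0.1;
        Console.WriteLine("double:  " + watch.ElapsedMilliseconds + " ms, total = " + doubleTotal);

        watch.Restart();
        decimal decimalTotal = 0;
        for (int i = 0; i < 10000000; i++) decimalTotal += 0.1m;
        Console.WriteLine("decimal: " + watch.ElapsedMilliseconds + " ms, total = " + decimalTotal);
    }
}

As a bonus, the printed totals show the two behaviors side by side: the double total drifts away from 1,000,000 while the decimal total lands on it exactly.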

Some calculations will still produce repeating decimals: 1/3, for example, repeats in base 10 just as it always has. But these issues have always needed a business solution. Banks have always had to decide whether fractional pennies in an interest calculation round to the benefit of the bank or the depositor. Those problems are easily handled; the esoteric problems introduced in the black magic of the CPU aren’t. Using decimal and BigDecimal for financial calculations lets programmers worry only about the easy problems.
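Here is a small C# sketch of that business decision, using decimal division and Math.Round with an explicit MidpointRounding mode:

decimal share = 1m / 3m;
Console.WriteLine(share);  // 0.3333333333333333333333333333 - 28 digits, then decimal stops

// Which way a half-penny rounds is a policy choice, not a CPU accident:
Console.WriteLine(Math.Round(0.125m, 2, MidpointRounding.ToEven));       // 0.12 (banker's rounding)
Console.WriteLine(Math.Round(0.125m, 2, MidpointRounding.AwayFromZero)); // 0.13 (favors the payee)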


3 Responses to “C# Decimal and Java BigDecimal Solve Roundoff Problems”

  1. Sean Says:

    I still say, use decimal everywhere unless optimizing for speed 😀

    Then again I write a lot of financial apps!

  2. BlueRaja Says:

    Seems to me using base-15 would have made more sense: we could still represent decimal numbers exactly, but now divisors with factors of 3 could also be represented exactly, such as 1/3 or 5/6. Plus, every base-15 digit conveniently fits almost exactly into a single nybble (4-bits).

  3. thomas Says:

    Thx for the article.

    .net, c# decimal is 16byte, not 12.
