July 21, 2013

Two's Complement And Absolute Values

If I ask you what the result of Math.abs(-10) is, would you guess it’s 10? How about Math.abs(-2147483648)? If you think the answer is 2147483648, you’re wrong.

How it all starts

A bit can hold only one of two values, say 0 and 1. An 8-bit value, for example, is a combination of 8 bits glued together:

001: [0] [0] [0] [0] [0] [0] [0] [0]
002: [0] [0] [0] [0] [0] [0] [0] [1]
003: [0] [0] [0] [0] [0] [0] [1] [0]
004: [0] [0] [0] [0] [0] [0] [1] [1]
...
256: [1] [1] [1] [1] [1] [1] [1] [1]

There are 256 possible values. Being presented with the table above and no additional information, a natural question arises: what do this combinations of bits represent?

Knowing computers operate with binary numbers, a good guess would be this are just binary representations of numbers. Lo and behold, you’ve discovered an unsigned 8-bit integer representation system. Converting to decimal, the values yield:

001: [0] [0] [0] [0] [0] [0] [0] [0] = 0
002: [0] [0] [0] [0] [0] [0] [0] [1] = 1
003: [0] [0] [0] [0] [0] [0] [1] [0] = 2
004: [0] [0] [0] [0] [0] [0] [1] [1] = 3
...
256: [1] [1] [1] [1] [1] [1] [1] [1] = 255

Notice what unsigned means - the values we’ve calculated are all missing a sign, so we naturally assume they’re non-negative. Also, we’re only able to represent integers from 0 to 255.

To add negative numbers and make our system signed, let’s make a wild guess and choose the first bit to represent the sign: 0 means the number will be positive, 1 negative:

[0] [0] [0] [0] [0] [0] [0] [0] = +0
[0] [0] [0] [0] [0] [0] [0] [1] = +1
[0] [0] [0] [0] [0] [0] [1] [0] = +2
...
[0] [1] [1] [1] [1] [1] [1] [1] = +127

[1] [0] [0] [0] [0] [0] [0] [0] = -0
[1] [0] [0] [0] [0] [0] [0] [1] = -1
[1] [0] [0] [0] [0] [0] [1] [0] = -2
...
[1] [1] [1] [1] [1] [1] [1] [1] = -127

Voila! Our very first 8-bit signed integer system. Being very intuitive, this is a well known system called signed magnitude, used by some ancient computers, such as IBM 7090.

Enter Two’s Complement

Through time, people had come up with clever ways of storing integers, so that common math problems are simple to implement, efficient, and have no duplicate values.

For an arbitrary n-bit binary number, to get its opposite representation, first invert the number, then add 1.

This simple algorithm forms Two’s Complement, the most commonly used signed number representation system in use today.

Let’s do an example: find the opposite value of the binary number 00001010:

inverted(00001010) = 11110101
    add1(11110101) = 11110110

Let’s check what the opposite number of 11110110 is - if the algorithm is well-defined, the result should be 00001010:

inverted(11110110) = 00001001
    add1(00001001) = 00001010

Now for a more juicy example. What’s the opposite binary number of 00000000?

inverted(00000000) =  11111111
    add1(11111111) = 100000000

The result is a 9-bit number in an 8-bit number system. This is called an overflow since the value is, well, overflowing the 8-bit size restriction. The overflow is usually discarded, leaving us with the first eight bits from the right, 00000000. We just showed that there is exactly one representation of 0.

Sign

In Two’s Complement, the most significant bit represents the sign. 0 means the number will be positive, 1 negative. This means that with n bits, you can only use (n-1) bits to represent the number, as one bit is reserved to denote the sign.

You may think of, e.g., 01001010 as a compound value consisting of 0 (sign) and 1001010 (actual binary number).

But I can’t think in binary

Converting a binary value 01001010 to decimal is simple. Ignoring the most significant bit, we get

\[0 \times 2^{0} + 1 \times 2^{1} + 0 \times 2^{2} + 1 \times 2^{3} + 1 \times 2^{6} = 138\]

What about 10001010?

This is a negative value, since the most significant bit is 1. To get the actual value, we can apply Two’s Complement algorithm, ignoring the first bit:

inverted(0001010) = 1110101
    add1(1110101) = 1110110

Then,

\[0 \times 2^{0} + 1 \times 2^{1} + 1 \times 2^{3} + 0 \times 2^{4} + 1 \times 2^{5} + 1 \times 2^{6} + 1 \times 2^{7} = 118\]

Because the most significant bit is 1, the decimal number we calculated is negative. Therefore, 10001010 == -118.

What about the bounds?

In 8-bit unsigned integer system, we could represent integers from 0 to 255. With 8-bit sign and magnitude, we could represent numbers from -127 to 127. What about Two’s Complement bounds?

Let’s first look at 3-bit two’s complement binary numbers and its decimal values:

Bits	Values
000	0
001	1
010	2
011	3
100	-4
101	-3
110	-2
111	-1

Since one bit represents the sign, there are only two bits available for binary numbers. The upper bound seems to be \(3 = (2^{2} - 1) \). The lower bound is \( -4 = -2^{2} \)

Let’s do the same thing for 4-bit numbers:

Bits	Values
0000	0
0001	1
0010	2
0011	3
0100	4
0101	5
0110	6
0111	7
1000	-8
1001	-7
1010	-6
1011	-5
1100	-4
1101	-3
1110	-2
1111	-1

Again, one bit represents the sign, so there are three bits available for binary numbers. The upper bound seems to be \( 7 = 2 ^ {3} - 1 \). The lower bound is \( -8 = -2^{3} \)

Notice the bounds pattern for number of bits:

Number of bits	Lower bound	Upper bound
3	-4	3
4	-8	7
5	-16	15
...	...	...
n	\( -2^{n - 1} \)	\( 2^{n - 1} - 1 \)

The grand finale

Let’s look at one more example. What is the opposite value of binary number 100?

inverted(100) = 011
    add1(011) = 100

The opposite value of 100 is the exact same value. Weird. Let’s do one more. What’s the opposite value of 1000?

inverted(1000) = 0111
    add1(0111) = 1000

Applying inductive reasoning, this holds for any n-bit number. But what exactly are this values?

Number of bits	Bit value	Decimal value
3	100	-4
4	1000	-8
5	10000	-16
...	...	...
n		\(-2^{n - 1}\)

The values are lower bounds. We’ve shown that in Two’s Complement, the opposite value of a lower bound is still the lower bound.

This is relevant because -2147483648 == Integer.MIN_VALUE, a lower bound for int type in Java. And as such, Math.abs(-2147483648) == -2147483648.

← → Top