Chapter 4
see Concepts Introduced in Chapter 4
We will skip Section 4.9.
see Binary Numbers
It is natural for humans to use base 10 since we have 10 fingers.
It is convenient for computers to use base 2 since hardware can reliably
distinguish two voltage levels, high and low.
Both decimal and binary are weighted positional notations.
The least significant digit is the rightmost digit.
The most significant digit is the leftmost digit.
see Binary Representations
The size of a MIPS word, which is used to represent integers, is 32 bits.
Since with 32 bits there are 2^32 bit patterns, it makes sense that 2^32
different values can be represented.
Almost every computer today uses two's complement representation for
signed integers.
In two's complement there is one more negative value than positive value.
If the most significant (leftmost) bit is a 1, then the value is negative.
Older representations, such as sign-magnitude and one's complement, had two
bit patterns that could represent zero, which caused problems.
see Sign and Zero Extension
Note that a positive two's complement value can have an infinite number
of leading zeroes.
Likewise, a negative two's complement value can have an infinite number
of leading ones.
Thus, when loading a signed byte or a signed halfword, the machine
performs sign extension.
An unsigned load of a byte or halfword results in zero extension.
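To make the contrast concrete, here is a small C sketch (mine, not from
the slides) that mimics what signed and unsigned byte loads do, using
C casts in place of the lb and lbu instructions.

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint8_t byte = 0x80;                    /* bit pattern 1000 0000 */
        int32_t  signed_load = (int8_t)byte;    /* sign extended: 0xFFFFFF80 = -128 */
        uint32_t unsigned_load = byte;          /* zero extended: 0x00000080 = 128 */
        printf("%d %u\n", (int)signed_load, (unsigned)unsigned_load);
        return 0;
    }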
see Two's Complement Negation
Note that two's complement negation does not always work since there is
one more negative value than there are positive values.
This is an overflow situation.
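A small C sketch (mine) of the overflow case, using 8 bits for
readability: negating the most negative value yields itself, since +128
is not representable in 8 bits.

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        int8_t x = -128;                    /* bit pattern 1000 0000 */
        int8_t negated = (int8_t)(~x + 1);  /* invert the bits and add 1 */
        printf("%d\n", negated);            /* prints -128 on typical machines */
        return 0;
    }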
see Bases That Are a Power of Two
There are six new characters (A through F) that are used to represent the
hexadecimal digits beyond 9.
Note that the rightmost 3 binary digits map to the rightmost octal digit.
Likewise, the rightmost 4 binary digits map to the rightmost hexadecimal
digit.
see Examples of Converting between Bases
When converting from binary to a base that is a power of two, simply group
the appropriate number of bits together and then convert each group to
the corresponding digit.
When converting between two bases that are each a power of two, first
convert to binary and then convert each group of bits to the new base.
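For example (not on the slides), converting octal 755 to hexadecimal:
755 (octal) = 111 101 101 (binary); regrouping from the right in fours
gives 1 1110 1101, which is 1ED (hexadecimal).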
see Figure 4.3: Binary addition, showing carries from right to left.
When you add two numbers in decimal, sometimes you have to carry a 1 for
the next digit.
You have to do the same thing in binary.
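For example, adding 0110 (6) to 0110 (6): bit 1 produces a 0 with a carry
of 1, bit 2 produces a 1 with a carry of 1, and bit 3 produces a 1,
giving 1100 (12).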
see Subtraction by Negating and Adding
Note that there was a carry out of the last bit.
When is this a problem?
see Detecting Overflow
Overflow occurs on machines since we represent values with a limited number
of bits.
Remember that the operands are values that can be represented.
The first two lines in the table represent when overflow can occur since
the signs of the operands are the same and the result is different.
The next two lines represent that overflow cannot occur when the signs of
the operands and the result are the same.
The last four lines represent that overflow cannot occur when the signs
of the operands are different.
The C and C++ language definitions ignore integer overflow, so no check is
required.
Remember that the addu, addiu, and subu instructions do not cause an
overflow exception, while add, addi, and sub can cause an overflow
exception.
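Here is a C sketch of the table's rule (the function name is my own):
addition overflows exactly when the operands have the same sign and the
sign of the sum differs.

    #include <stdint.h>

    /* returns nonzero iff a + b overflows 32-bit two's complement */
    int add_overflows(int32_t a, int32_t b) {
        uint32_t sum = (uint32_t)a + (uint32_t)b;  /* unsigned add cannot trap */
        /* the sign of the sum differs from the (equal) signs of both operands */
        return (int32_t)(((uint32_t)a ^ sum) & ((uint32_t)b ^ sum)) < 0;
    }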
see Floating-Point Values
The floating-point representation is an approximation of many real values
since it can only represent a limited precision.
Floating-point representation offers several major benefits.
(1) Real (noninteger) values can be represented.
(2) The use of an exponent allows a wide range of values to be represented.
(3) Very small fractions (close to zero) can also be represented.
see Normalized Floating-Point Values
You have seen this in normalized scientific notation.
Remember that anything raised to the zero power is 1.
see IEEE Floating-Point Standard
In the past, there would be different floating-point representations on
different machines.
This meant that a program would behave differently on one machine than
another.
Also, data transferred from one machine to another had to be converted to
the proper format.
The IEEE came up with a standard, which helps to overcome these problems.
This is actually the single precision standard.
The double precision standard uses 64 bits and allows a wider range of
values and more importantly more precision.
You are only responsible for knowing the single precision standard.
see Interpreting the IEEE FPS
The sign bit indicates whether the value is positive (0) or negative (1).
The F represents the fractional part of the value that does not include
the hidden (implied) bit.
E can take on values between 0 and 255.
However 0 and 255 have special interpretations.
+Infinity is the value assigned as the result of an arithmetic operation
when the result is a positive value that is too large to be represented.
Likewise, -infinity is the value assigned as the result of an arithmetic
operation when the result is a negative value too large in magnitude to
be represented.
So the exponent fields with the regular meaning range from 1 to 254,
which after subtracting the bias of 127 represent exponents from -126
to 127.
Note that there are two representations of zero.
Note that unlike some integer operations (e.g., division by zero),
floating-point operations do not generate exceptions, since there are
reserved patterns to represent these exceptional values.
The more bits placed in the significand, the greater the precision that
can be obtained.
The more bits placed in the exponent, the greater the range of magnitudes
(both larger and closer to zero) that can be represented.
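Here is a small C sketch (mine, not from the slides) that unpacks the
S, E, and F fields of a single-precision value.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        float x = -0.75f;
        uint32_t bits;
        memcpy(&bits, &x, sizeof bits);     /* reinterpret the 32-bit pattern */
        unsigned s = bits >> 31;            /* 1-bit sign */
        unsigned e = (bits >> 23) & 0xFF;   /* 8-bit biased exponent */
        unsigned f = bits & 0x7FFFFF;       /* 23-bit fraction, no hidden bit */
        printf("S=%u E=%u F=0x%06X\n", s, e, f);  /* S=1 E=126 F=0x400000 */
        return 0;
    }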
see Example of Representing a Value in IEEE FPS
So we do the following; a worked example appears after the steps.
(1) Convert to binary.
(2) Normalize the binary value by using an exponent.
(3) Determine the values of S (sign), E (normalizing and adding the bias), and
F (by removing the hidden bit) fields.
(4) Convert the value to hexadecimal.
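For example (a worked value, not from the slide), representing -0.75:
(1) -0.75 is -0.11 in binary.
(2) Normalized, this is -1.1 x 2^-1.
(3) S = 1; E = -1 + 127 = 126 = 0111 1110; F = 100 0000 0000 0000 0000 0000.
(4) The pattern 1 01111110 10000000000000000000000 is 0xBF400000.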
see Example of Determining What Value an IEEE FPS Pattern Represents
So we do the following; again, a worked example follows the steps.
(1) Convert the hexadecimal value to binary.
(2) Determine the values of the S, E, and F fields.
(3) Plug in these values into the formula.
(4) Do the arithmetic.
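For example, interpreting the pattern 0xC0A00000:
(1) In binary: 1100 0000 1010 0000 0000 0000 0000 0000.
(2) S = 1; E = 1000 0001 = 129; F = 010 0000 0000 0000 0000 0000.
(3) The value is -1.01 (binary) x 2^(129-127).
(4) -1.25 x 4 = -5.0.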
see What is the Largest Representable IEEE FPS Value?
We want it to be positive, so the S has to be zero.
The largest exponent that is not used to represent a special case value
is 254.
The largest significand occurs when all 23 fraction bits are set.
This is a very large value.
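In equation form, the largest value is 1.11...1 (binary) x 2^(254-127) =
(2 - 2^-23) x 2^127, which is about 3.4 x 10^38.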
Note that we cannot exactly represent an integer like 2**30+1, even though
it is a much smaller value.
The reason is that there is too much distance between the most significant
bit set and the least significant bit set.
see Boolean Algebra
Boolean algebra provides a concise way of representing logic in hardware.
You can think of a logic equation as specifying under what conditions a
signal will be asserted.
It involves logic equations and the rules to manipulate them.
Note that the and operation takes precedence over the or
operation.
see Figure 4.8: Four hardware building blocks used to construct an
arithmetic unit.
A gate is an electronic circuit that produces an output signal that is a
simple boolean operation on its input signals.
Each output of these hardware building blocks depends only on its input,
which is described as a combinational logic block.
The multiplexor has the data inputs and selector input.
A truth table simply depicts the output for each possible set of inputs.
You can read in Appendix B how multiplexors are constructed from gates.
We will see that the and and or gates and multiplexors
can have more than two inputs.
In that case, the selector input has to consist of multiple signals.
see Writing A Logic Equation from a Truth Table
Each row in the truth table corresponds to a product term, which is the
product of each of the inputs or their complements, depending on whether
the entry in the truth table is a 1 or a 0.
The logic equation is the logical sum of the product terms where the
output is true.
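For example, for a two-input truth table whose output is 1 only in the
rows (A=0, B=1) and (A=1, B=0) (exclusive or), the two product terms are
(not A and B) and (A and not B), and the logic equation is their sum:
(not A and B) or (A and not B).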
Now we will construct a simple ALU.
We need a 32-bit ALU, but we will start with a 1-bit ALU.
First, let's start with 1-bit AND and OR operations.
see Figure 4.9: The 1-bit logical unit for AND and OR.
This logical unit simply selects the appropriate output based on
the operation signal.
see Figure 4.10: A 1-bit adder.
Note the 3 inputs and 2 outputs.
CarryIn is the CarryOut from the previous 1-bit adder.
see Figure 4.11: Input and output specification for a 1-bit adder.
Note that the Sum signal is set if the number of inputs
containing a 1 is odd.
Note that the CarryOut signal is set if there are two or more
inputs containing a 1.
Both of these can be expressed as logical equations (or gates) as shown
in the text.
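One way to write them (a C sketch of the same logic, with each input
either 0 or 1):

    /* Sum is set when an odd number of inputs are 1. */
    int sum(int a, int b, int carry_in) {
        return a ^ b ^ carry_in;
    }

    /* CarryOut is set when two or more inputs are 1. */
    int carry_out(int a, int b, int carry_in) {
        return (a & b) | (a & carry_in) | (b & carry_in);
    }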
see Figure 4.17: (Top) A 1-bit ALU that performs AND, OR, and addition on
a and b or not(b), and (bottom) a 1-bit ALU for the most significant bit.
Summary of top half of figure.
The top half of Figure 4.17 contains the 1-bit ALU that is used for the
least significant 31 bits.
Each pair of corresponding bits from the input values (shown as a and
b in the figure) is fed to elements that perform the and, or, and
addition operations.
Operation signal.
The operation signal is input to a multiplexor that allows one of 4 data
results to be selected.
The operation signal is set to 0 for an and instruction, 1 for an or
instruction, 2 for an add or sub instruction, and 3 for an slt
instruction.
Note that on each cycle an and, or, and add are
performed.
The operation signal just indicates which result should be used.
Binvert signal.
The Binvert signal is asserted when a sub or an slt instruction is to
be performed.
This signal has the effect of complementing the b signal, which
results in (a + not(b)) being added.
CarryIn signal.
The assertion of the CarryIn signal represents that the 1-bit ALU
for the previous bit has a CarryOut asserted from its addition
element.
Less signal.
The Less signal is asserted on the least significant 1-bit ALU
when an slt
instruction is to be performed.
CarryOut signal.
The CarryOut signal is asserted when there is a carry out of the
addition element.
This signal is passed to the next 1-bit ALU.
Result signal.
The Result signal is the 1-bit result of the operation performed
by the 1-bit ALU.
Summary of the bottom half of figure.
The bottom half of Figure 4.17 contains the 1-bit ALU that is used for
the most significant bit.
There are two differences from the 1-bit ALU shown in the upper portion
of the figure.
Set signal.
The set signal simply indicates the result of the addition element.
Since this is the most significant bit, the set signal indicates
whether the result is negative.
Overflow signal.
The overflow signal indicates if the result of the addition caused
an overflow.
Remember that overflow on addition only occurs when the signs of the two
values being added are the same and the sign of the result of the addition
differs from that of its operands.
see Figure 4.19: The final 32-bit ALU.
Summary of figure.
This figure shows how the 32 1-bit ALUs can be hooked together
to perform the operations of several instructions.
This is called a ripple-carry adder.
A single carry out of the least significant bit can ripple all the
way through the adder, causing a carry out of the most significant
bit.
1-bit ALU.
Each 1-bit ALU accepts two corresponding bits (data signals) from the
input operand values and produces a result bit as output.
In addition, each 1-bit ALU accepts a CarryIn signal that indicates
whether the previous 1-bit ALU asserted its CarryOut.
Finally, each 1-bit ALU has Bnegate, Operation, and Less signals as
input, whose functions are explained elsewhere in the figure.
Bnegate signal.
The Bnegate signal comes from the CarryIn and
Binvert signals from Figure 4.17.
Both of these signals are always asserted whenever a subtraction operation
(the sub, slt, or beq instructions) needs to be performed.
Remember that the Binvert from Figure 4.17 was asserted to cause the
complement of the b data signal to be used as input to the adder.
Asserting the CarryIn of the least significant 1-bit ALU
effectively causes an addition of 1.
Thus, inverting the bits of the second operand and adding 1 causes a
two's complement negation of the second operand.
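In C terms, here is a sketch of what the datapath computes when both
signals are asserted.

    #include <stdint.h>

    /* complement b and add 1 via the carry in: a + ~b + 1 equals a - b */
    uint32_t alu_subtract(uint32_t a, uint32_t b) {
        return a + ~b + 1;
    }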
Operation signal.
The Operation signal indicates which result from the different
types of elements (and, or, add, less) is to be selected.
Set signal.
The set signal indicates if the most significant bit of the
result is set.
This indicates if the result is negative.
For an slt instruction, which causes the second operand to be subtracted
from the first operand, a negative result means that the first operand is
less than the second operand.
Remember that an slt instruction produces a value of one that is assigned
to the destination register when the first operand is less than the second
operand (and a zero when the condition is not true).
Thus, the 31 most significant 1-bit ALUs will produce a zero when there is
an slt instruction.
Only the result of the least significant 1-bit ALU can vary depending
on the result of the most significant 1-bit ALU.
Note that for an slt instruction, the result of the subtraction is
ignored except for the most significant bit.
Overflow signal.
The overflow signal is asserted when the operation caused an
overflow.
The figure is really incomplete since it does not show how the overflow
signal is used.
For instance, the overflow signal should be examined to determine
the Less input signal for the least significant 1-bit ALU.
Zero signal.
The zero signal is asserted when the two operand values are equal,
bit for bit.
This is accomplished by subtracting the second operand from the first and
checking if the result is zero.
This check is accomplished by ORing the 32 data result signals and then
taking the complement of the OR result.
Thus, the zero signal is only asserted when all of the inputs to
the OR gate were deasserted.
see Carry-Lookahead Adder
Ripple-carry adders are considered too slow.
It is possible to make a complete adder with only two levels of logic
by formulating a complete logic equation for each result bit.
However, the logic would be very complex.
A compromise between these two approaches is the carry-lookahead
adder.
The generate indicates that the carry bit will be set if
the gi is set.
So a carry is generated when the gi condition is true.
The propagate indicates that the carry bit will be set if
the pi and ci are both set.
So the previous carry is propagated when pi is true.
The carry c(i+1) is set when either gi or pi and
ci are set.
We can precalculate a set of carry bits by using forward substitution.
Note that each of these calculations can be accomplished in two levels
of logic.
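Written out (the generate and propagate definitions plus the
forward-substituted carries, using . for and and + for or):
gi = ai . bi
pi = ai + bi
c1 = g0 + (p0 . c0)
c2 = g1 + (p1 . g0) + (p1 . p0 . c0)
c3 = g2 + (p2 . g1) + (p2 . p1 . g0) + (p2 . p1 . p0 . c0)
c4 = g3 + (p3 . g2) + (p3 . p2 . g1) + (p3 . p2 . p1 . g0)
        + (p3 . p2 . p1 . p0 . c0)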
see Figure 4.22: A plumbing analogy for carry lookahead for 1 bit, 2 bits,
and 4 bits using water, pipes, and valves.
The wrenches are turned to open (set) and close (clear) valves.
Water will come out of the pipe if the nearest g valve above the water
is open, or if a consecutive series of p valves is open and the g or c
valve preceding them is also open.
see Carry-Lookahead Adder (cont.)
This example shows 4-bit adders using carry-lookahead logic to construct
a 16-bit (not 32-bit) adder.
Each Pi represents a propagate of a 4-bit adder.
Each Gi represents the generate out of a 4-bit adder.
Look back at the c4 equation on the previous carry-lookahead slide.
The P0 is simply the last term without the c0.
The G0 is simply the first 4 terms.
The last 4 equations should look familiar.
They are the same as the last 4 equations on the previous slide.
The only difference is that we are dealing with generates
and propagates at a higher level of abstraction.
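Written out for the least significant 4-bit adder:
P0 = p3 . p2 . p1 . p0
G0 = g3 + (p3 . g2) + (p3 . p2 . g1) + (p3 . p2 . p1 . g0)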
see Figure 4.24: Four 4-bit ALUs using carry lookahead to form a
16-bit adder.
The propagates and generates are produced by each 4-bit
adder.
The carries come from the carry-lookahead unit.
Carry lookahead makes carries faster since fewer gates are required to
send the carry signal (for a 4-bit adder) ahead and the next adder in
the sequence can start calculating valid results sooner.
A ripple-carry adder would take 32 (16*2) gate delays to calculate
the carry out of the most significant bit of a 16-bit adder.
To calculate C4 of a 16-bit carry-lookahead adder would require
5 levels of logic.
So for our example, the carry-lookahead adder is 6.4 (32/5) times as
fast as the ripple-carry adder.
see Integer Multiplication
Addition can cause one carry out bit.
But a multiplication result can require twice as many bits as each of
its operands.
So the chances of overflow are greater.
We will see that multiplication is essentially a series of adds and
shifts.
show Figure 4.32: The third multiplication algorithm
Product0 indicates the least significant bit in the product.
The product is initialized to contain the value of the multiplier.
The product is right shifted at each repetition so we can test the least
significant bit to see if we should perform the addition.
Eventually the multiplier is completely shifted out of the product.
Note that when we are performing the right shift, the most significant
bit remains the same.
This is referred to as an arithmetic right shift and allows the algorithm
to work correctly when multiplying negative values.
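Here is a minimal C sketch (my own rendering, not the figure itself) of
the algorithm. The arithmetic right shift handles a negative multiplicand;
a signed multiplier would need the adjustment that Booth's algorithm
(below) provides. Shifting a signed value right is implementation-defined
in C but arithmetic on typical compilers.

    #include <stdint.h>

    int64_t multiply(int32_t multiplicand, uint32_t multiplier) {
        int64_t product = multiplier;   /* left half 0, right half multiplier */
        for (int i = 0; i < 32; i++) {
            if (product & 1)            /* test Product0 */
                product += (int64_t)multiplicand << 32;  /* add to left half */
            product >>= 1;              /* arithmetic right shift */
        }
        return product;
    }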
show Figure 4.33: Multiply example using third algorithm in Figure 4.32.
So the left half of the product is initialized to zero and the right half is
initialized to the multiplier.
The multiplicand just stays the same at each step.
We perform additions in the first two repetitions since the least
significant bit of the product is 1 at those points.
We don't perform additions in the last two repetitions since the least
significant bit of the product is then zero.
show Booth's Algorithm for Multiplication
A shift can be performed more cheaply than an add.
Runs of 0's don't require any additions.
The trick with Booth's algorithm is that a sequence of 1's can be handled
with 1 addition and 1 subtraction.
Note that this does not guarantee a performance improvement since we could
have a value of alternating zeroes and ones.
Booth's algorithm can be beneficial for machines that have multiplies that
require a varying number of cycles to complete.
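For example, the multiplier 30 is 011110 in binary. Handling the run of
four 1's directly would take four additions, but since 011110 = 100000 -
000010 (30 = 32 - 2), Booth's algorithm replaces them with one subtraction
(at the 0-to-1 boundary) and one addition (at the 1-to-0 boundary).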
see Integer Division
We will see that division is essentially a series of subtracts and shifts.
We will have to detect division by zero.
see Figure 4.40: The third division algorithm has just two steps.
Place the dividend in the right half of the remainder register.
We subtract the divisor register from the left half of the remainder.
If the result is >= 0, then we know that the divisor fit into the
dividend and we set the rightmost bit of the remainder to 1.
If the result is < 0, then we have to restore the previous value of the
remainder register by adding the divisor register to it (undo the subtract).
We keep doing this for 32 repetitions.
We finally have the quotient in the right half of the remainder register
and the remainder in the left half.
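Here is a minimal C sketch (my own rendering) of the restoring division
algorithm for unsigned operands. The shift is done at the top of the loop,
so the final corrective right shift of the left half is not needed in this
variant. It assumes the divisor is nonzero and less than 2^31 so the
shifted left half always fits.

    #include <stdint.h>

    void divide(uint32_t dividend, uint32_t divisor,
                uint32_t *quotient, uint32_t *remainder) {
        uint64_t rem = dividend;        /* right half dividend, left half 0 */
        for (int i = 0; i < 32; i++) {
            rem <<= 1;                           /* shift remainder register left */
            rem -= (uint64_t)divisor << 32;      /* subtract divisor from left half */
            if ((int64_t)rem >= 0)
                rem |= 1;                        /* divisor fit: set rightmost bit */
            else
                rem += (uint64_t)divisor << 32;  /* restore (undo the subtract) */
        }
        *quotient  = (uint32_t)rem;          /* right half */
        *remainder = (uint32_t)(rem >> 32);  /* left half */
    }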
see Figure 4.42: Division example using third algorithm in Figure 4.40
So the right half of the remainder is initialized to the dividend and the
left half is the sign extension of the dividend.
Step 0 also left shifts the remainder register by 1.
Steps 1 and 2 find that after subtracting the divisor from the left half
of the remainder register, the result is negative.
So the old value is restored and left shifted by 1.
Steps 3 and 4 find that after subtracting the divisor from the left half
of the remainder register, the result is positive.
So the value in the left half does not get restored and the whole register
value is left shifted.
Finally we right shift the left half of the remainder register by 1 to
get rid of the effect of that final left shift.
see General Forms of a MIPS Integer Multiply or Divide Instruction
Only 2 source registers are explicitly given.
The destinations of the integer multiply and divide are the implicit hi
and lo registers.
This pair of registers holds the 64-bit product for the multiply.
The high portion can be inspected to see if there is an overflow: a
nonnegative right half must have all zero bits in hi, and a negative
right half must have all one bits in hi.
For the divide, the lo portion contains the quotient and the hi
portion contains the remainder.
The last two instructions show how we can move the values from these special
registers to conventional registers.
The likely reason that special registers (hi and lo) are used
is due to multiply and divide instructions requiring several cycles to execute.
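Here is a C sketch (mine) of that hi-register overflow test: the 64-bit
product fits in 32 bits only when hi is the sign extension of lo.

    #include <stdint.h>

    int mult_overflows(int32_t a, int32_t b) {
        int64_t product = (int64_t)a * (int64_t)b;  /* what mult leaves in hi/lo */
        int32_t lo = (int32_t)product;
        int32_t hi = (int32_t)(product >> 32);      /* arithmetic shift assumed */
        return hi != (lo < 0 ? -1 : 0);  /* hi must be all ones or all zeros */
    }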
see Fallacies
-
Fallacy: Floating-point addition is associative; that is,
x+(y+z) = (x+y)+z.
The problem is that floating-point numbers are represented with limited
precision.
For instance, if the result's most significant set bit is a far distance
from its least significant set bit, then the least significant bits will
often be truncated away.
This can happen when x is a very large magnitude negative number,
y is a very large magnitude positive number, and z is a
small number.
Thus, the results can differ depending on the order in which the operations
are applied, which can inhibit many compiler optimizations.
This is less of a problem with integer operations, except for the possibility
of introducing overflow.
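To make the first point concrete, here is a small C sketch (mine) with
the kind of values just described.

    #include <stdio.h>

    int main(void) {
        float x = -1.0e38f, y = 1.0e38f, z = 1.0f;
        printf("%f\n", (x + y) + z);  /* prints 1.000000: x and y cancel first */
        printf("%f\n", x + (y + z));  /* prints 0.000000: z is absorbed into y */
        return 0;
    }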
-
Fallacy: Only theoretical mathematicians care about floating-point
accuracy.
Remember the Pentium, where errors on floating-point divides could occur
in the 4th to 15th decimal digits to the right of the decimal point.
This error cost Intel an estimated $300 million in recalled chips.