Chapter 4
see Concepts Introduced in Chapter 4
We will skip Section 4.9.
see Binary Numbers
It is natural for humans to use base 10 since we have 10 fingers.
It is convenient for computers to use base 2 since hardware can reliably
distinguish two voltage levels, high and low.
Both decimal and binary are weighted positional notations.
The least significant digit is the rightmost digit.
The most significant digit is the leftmost digit.
see Binary Representations
The size of a MIPS word, which is used to represent integers, is 32 bits.
Since with 32 bits there are 2^32 bit patterns, it makes sense that 2^32
different values can be represented.
Almost every computer today uses two's complement representation for
signed integers.
In two's complement there is one more negative value than positive value.
If the most significant (leftmost) bit is a 1, then the value is negative.
Older representations, such as sign-magnitude and one's complement, had two
bit patterns that could represent zero, which caused problems.
see Sign and Zero Extension
Note that a positive two's complement value can have an infinite number
of leading zeroes.
Likewise, a negative two's complement value can have an infinite number
of leading ones.
Thus, when loading a signed byte or a signed halfword, the machine
performs sign extension.
An unsigned load of a byte or halfword results in zero extension.
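To make the contrast concrete, here is a small C sketch (mine, not from
the slides) that mimics what signed and unsigned byte loads do, using
C casts in place of the lb and lbu instructions.

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint8_t byte = 0x80;                    /* bit pattern 1000 0000 */
        int32_t  signed_load = (int8_t)byte;    /* sign extended: 0xFFFFFF80 = -128 */
        uint32_t unsigned_load = byte;          /* zero extended: 0x00000080 = 128 */
        printf("%d %u\n", (int)signed_load, (unsigned)unsigned_load);
        return 0;
    }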
see Two's Complement Negation
Note that two's complement negation does not always work since there is
one more negative value than there are positive values.
This is an overflow situation.
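A small C sketch (mine) of the overflow case, using 8 bits for
readability: negating the most negative value yields itself, since +128
is not representable in 8 bits.

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        int8_t x = -128;                    /* bit pattern 1000 0000 */
        int8_t negated = (int8_t)(~x + 1);  /* invert the bits and add 1 */
        printf("%d\n", negated);            /* prints -128 on typical machines */
        return 0;
    }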
see Bases That Are a Power of Two
There are six new characters (A through F) that are used to represent the
hexadecimal digits beyond 9.
Note that the rightmost 3 binary digits map to the rightmost octal digit.
Likewise, the rightmost 4 binary digits map to the rightmost hexadecimal
digit.
see Examples of Converting between Bases
When converting from binary to a base that is a power of two, simply group
the appropriate number of bits together and then convert each group to
the corresponding digit.
When converting between two bases that are each a power of two, first
convert to binary and then convert each group of bits to the new base.
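For example (not on the slides), converting octal 755 to hexadecimal:
755 (octal) = 111 101 101 (binary); regrouping from the right in fours
gives 1 1110 1101, which is 1ED (hexadecimal).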
see Figure 4.3: Binary addition, showing carries from right to left.
When you add two numbers in decimal, sometimes you have to carry a 1 for
the next digit.
You have to do the same thing in binary.
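For example, adding 0110 (6) to 0110 (6): bit 1 produces a 0 with a carry
of 1, bit 2 produces a 1 with a carry of 1, and bit 3 produces a 1,
giving 1100 (12).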
see Subtraction by Negating and Adding
Note that there was a carry out of the last bit.
When is this a problem?
see Detecting Overflow
Overflow occurs on machines since we represent values with a limited number
of bits.
Remember that the operands are values that can be represented.
The first two lines in the table represent when overflow can occur since
the signs of the operands are the same and the result is different.
The next two lines represent that overflow cannot occur when the signs of
the operands and the result are the same.
The last four lines represent that overflow cannot occur when the signs
of the operands are different.
The C and C++ language definitions ignore integer overflow, so no check is
required.
Remember that the addu, addiu, and subu instructions do not cause an
overflow exception, while add, addi, and sub can cause an overflow
exception.
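Here is a C sketch of the table's rule (the function name is my own):
addition overflows exactly when the operands have the same sign and the
sign of the sum differs.

    #include <stdint.h>

    /* returns nonzero iff a + b overflows 32-bit two's complement */
    int add_overflows(int32_t a, int32_t b) {
        uint32_t sum = (uint32_t)a + (uint32_t)b;  /* unsigned add cannot trap */
        /* the sign of the sum differs from the (equal) signs of both operands */
        return (int32_t)(((uint32_t)a ^ sum) & ((uint32_t)b ^ sum)) < 0;
    }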
see Floating-Point Values
The floating-point representation is an approximation of many real values
since it can only represent a limited precision.
Floating-point representation offers several major benefits.
(1) Real (noninteger) values can be represented.
(2) The use of an exponent allows a wide range of values to be represented.
(3) Very small fractions (close to zero) can also be represented.
see Normalized Floating-Point Values
You have seen this in normalized scientific notation.
Remember that anything raised to the zero power is 1.
see IEEE Floating-Point Standard
In the past, there would be different floating-point representations on
different machines.
This meant that a program would behave differently on one machine than
another.
Also, data transferred from one machine to another had to be converted to
the proper format.
The IEEE came up with a standard, which helps to overcome these problems.
This is actually the single precision standard.
The double precision standard uses 64 bits and allows a wider range of
values and more importantly more precision.
You are only responsible for knowing the single precision standard.
see Interpreting the IEEE FPS
The sign bit indicates whether the value is positive (0) or negative (1).
The F represents the fractional part of the value that does not include
the hidden (implied) bit.
E can take on values between 0 and 255.
However 0 and 255 have special interpretations.
+Infinity is the value assigned as the result of an arithmetic operation
when the result is a positive value that is too large to be represented.
Likewise, -infinity is the value assigned as the result of an arithmetic
operation when the result is a negative value too large in magnitude to
be represented.
So the exponent fields with the regular meaning range from 1 to 254,
which after subtracting the bias of 127 represent exponents from -126
to 127.
Note that there are two representations of zero.
Note that unlike some integer operations (e.g., division by zero),
floating-point operations do not generate exceptions, since there are
reserved patterns to represent these exceptional values.
The more bits placed in the significand, the greater the precision that
can be obtained.
The more bits placed in the exponent, the greater the range of magnitudes
(both larger and closer to zero) that can be represented.
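Here is a small C sketch (mine, not from the slides) that unpacks the
S, E, and F fields of a single-precision value.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        float x = -0.75f;
        uint32_t bits;
        memcpy(&bits, &x, sizeof bits);     /* reinterpret the 32-bit pattern */
        unsigned s = bits >> 31;            /* 1-bit sign */
        unsigned e = (bits >> 23) & 0xFF;   /* 8-bit biased exponent */
        unsigned f = bits & 0x7FFFFF;       /* 23-bit fraction, no hidden bit */
        printf("S=%u E=%u F=0x%06X\n", s, e, f);  /* S=1 E=126 F=0x400000 */
        return 0;
    }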
see Example of Representing a Value in IEEE FPS
So we do the following; a worked example appears after the steps.
(1) Convert to binary.
(2) Normalize the binary value by using an exponent.
(3) Determine the values of S (sign), E (normalizing and adding the bias), and
F (by removing the hidden bit) fields.
(4) Convert the value to hexadecimal.
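For example (a worked value, not from the slide), representing -0.75:
(1) -0.75 is -0.11 in binary.
(2) Normalized, this is -1.1 x 2^-1.
(3) S = 1; E = -1 + 127 = 126 = 0111 1110; F = 100 0000 0000 0000 0000 0000.
(4) The pattern 1 01111110 10000000000000000000000 is 0xBF400000.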
see Example of Determining What Value an IEEE FPS Pattern Represents
So we do the following; again, a worked example follows the steps.
(1) Convert the hexadecimal value to binary.
(2) Determine the values of the S, E, and F fields.
(3) Plug in these values into the formula.
(4) Do the arithmetic.
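For example, interpreting the pattern 0xC0A00000:
(1) In binary: 1100 0000 1010 0000 0000 0000 0000 0000.
(2) S = 1; E = 1000 0001 = 129; F = 010 0000 0000 0000 0000 0000.
(3) The value is -1.01 (binary) x 2^(129-127).
(4) -1.25 x 4 = -5.0.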
see What is the Largest Representable IEEE FPS Value?
We want it to be positive, so the S has to be zero.
The largest exponent that is not used to represent a special case value
is 254.
The largest significand occurs when all 23 fraction bits are set.
This is a very large value.
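In equation form, the largest value is 1.11...1 (binary) x 2^(254-127) =
(2 - 2^-23) x 2^127, which is about 3.4 x 10^38.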
Note that we cannot exactly represent an integer like 2**30+1, even though
it is a much smaller value.
The reason is that there is too much distance between the most significant
bit set and the least significant bit set.
see Boolean Algebra
Boolean algebra provides a concise way of representing logic in hardware.
You can think of a logic equation as specifying under what conditions a
signal will be asserted.
It involves logic equations and the rules to manipulate them.
Note that the and operation takes precedence over the or
operation.
see Figure 4.8: Four hardware building blocks used to construct an
arithmetic unit.
A gate is an electronic circuit that produces an output signal that is a
simple boolean operation on its input signals.
Each output of these hardware building blocks depends only on its input,
which is described as a combinational logic block.
The multiplexor has the data inputs and selector input.
A truth table simply depicts the output for each possible set of inputs.
You can read in Appendix B how multiplexors are constructed from gates.
We will see that the and and or gates and multiplexors
can have more than two inputs.
In that case, the selector input has to consist of multiple signals.
see Writing A Logic Equation from a Truth Table
Each row in the truth table corresponds to a product term, which is the
product of each of the inputs or their complements, depending on whether
the entry in the truth table is a 1 or a 0.
The logic equation is the logical sum of the product terms where the
output is true.
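For example, for a two-input truth table whose output is 1 only in the
rows (A=0, B=1) and (A=1, B=0) (exclusive or), the two product terms are
(not A and B) and (A and not B), and the logic equation is their sum:
(not A and B) or (A and not B).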
Now we will construct a simple ALU.
We need a 32-bit ALU, but we will start with a 1-bit ALU.
First, let's start with 1-bit AND and OR operations.
see Figure 4.9: The 1-bit logical unit for AND and OR.
This logical unit simply selects the appropriate output based on
the operation signal.
see Figure 4.10: A 1-bit adder.
Note the 3 inputs and 2 outputs.
CarryIn is the CarryOut from the previous 1-bit adder.
see Figure 4.11: Input and output specification for a 1-bit adder.
Note that the Sum signal is set if the number of inputs
containing a 1 is odd.
Note that the CarryOut signal is set if there are two or more
inputs containing a 1.
Both of these can be expressed as logical equations (or gates) as shown
in the text.
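One way to write them (a C sketch of the same logic, with each input
either 0 or 1):

    /* Sum is set when an odd number of inputs are 1. */
    int sum(int a, int b, int carry_in) {
        return a ^ b ^ carry_in;
    }

    /* CarryOut is set when two or more inputs are 1. */
    int carry_out(int a, int b, int carry_in) {
        return (a & b) | (a & carry_in) | (b & carry_in);
    }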
see Figure 4.17: (Top) A 1-bit ALU that performs AND, OR, and addition on
a and b or not(b), and (bottom) a 1-bit ALU for the most significant bit.
Summary of top half of figure.
The top half of Figure 4.17 contains the 1-bit ALU that is used for the
least significant 31 bits.
Each pair of corresponding bits from the input values (shown as a and
b in the figure) is fed to elements that perform the and, or, and
addition operations.
Operation signal.
The operation signal is input to a multiplexor that allows one of 4 data
results to be selected.
The operation signal is set to 0 for an and instruction, 1 for an or
instruction, 2 for an add or sub instruction, and 3 for an slt
instruction.
Note that on each cycle an and, or, and add are
performed.
The operation signal just indicates which result should be used.
Binvert signal.
The Binvert signal is asserted when a sub or an slt instruction is to
be performed.
This signal has the effect of complementing the b signal, which
results in (a + not(b)) being added.
CarryIn signal.
The assertion of the CarryIn signal represents that the 1-bit ALU
for the previous bit has a CarryOut asserted from its addition
element.
Less signal.
The Less signal is asserted on the least significant 1-bit ALU
when an slt
instruction is to be performed.
CarryOut signal.
The CarryOut signal is asserted when there is a carry out of the
addition element.
This signal is passed to the next 1-bit ALU.
Result signal.
The Result signal is the 1-bit result of the operation performed
by the 1-bit ALU.
Summary of the bottom half of figure.
The bottom half of Figure 4.17 contains the 1-bit ALU that is used for
the most significant bit.
There are two differences from the 1-bit ALU shown in the upper portion
of the figure.
Set signal.
The set signal simply indicates the result of the addition element.
Since this is the most significant bit, the set signal indicates
whether the result is negative.
Overflow signal.
The overflow signal indicates if the result of the addition caused
an overflow.
Remember that overflow on addition only occurs when the signs of the two
values being added are the same and the sign of the result of the addition
differs from that of its operands.
see Figure 4.19: The final 32-bit ALU.
Summary of figure.
This figure shows how the 32 1-bit ALUs can be hooked together
to perform the operations of several instructions.
This is called a ripple-carry adder.
A single carry out of the least significant bit can ripple all the
way through the adder, causing a carry out of the most significant
bit.
1-bit ALU.
Each 1-bit ALU accepts two corresponding bits (data signals) from the
input operand values and produces a result bit as output.
In addition, each 1-bit ALU accepts a CarryIn signal that indicates
whether the previous 1-bit ALU asserted its CarryOut.
Finally, each 1-bit ALU has Bnegate, Operation, and Less signals as
input, whose functions are explained elsewhere in the figure.
Bnegate signal.
The Bnegate signal comes from the CarryIn and
Binvert signals from Figure 4.17.
Both of these signals are always asserted whenever a subtraction operation
(the sub, slt, or beq instructions) needs to be performed.
Remember that the Binvert from Figure 4.17 was asserted to cause the
complement of the b data signal to be used as input to the adder.
Asserting the CarryIn of the least significant 1-bit ALU
effectively causes an addition of 1.
Thus, inverting the bits of the second operand and adding 1 causes a
two's complement negation of the second operand.
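In C terms, here is a sketch of what the datapath computes when both
signals are asserted.

    #include <stdint.h>

    /* complement b and add 1 via the carry in: a + ~b + 1 equals a - b */
    uint32_t alu_subtract(uint32_t a, uint32_t b) {
        return a + ~b + 1;
    }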
Operation signal.
The Operation signal indicates which result from the different
types of elements (and, or, add, less) is to be selected.
Set signal.
The set signal indicates if the most significant bit of the
result is set.
This indicates if the result is negative.
For an slt instruction, which causes the second operand to be subtracted
from the first operand, a negative result means that the first operand is
less than the second operand.
Remember that an slt instruction produces a value of one that is assigned
to the destination register when the first operand is less than the second
operand (and a zero when the condition is not true).
Thus, the 31 most significant 1-bit ALUs will produce a zero when there is
an slt instruction.
Only the result of the least significant 1-bit ALU can vary depending
on the result of the most significant 1-bit ALU.
Note that for an slt instruction, the result of the subtraction is
ignored except for the most significant bit.
Overflow signal.
The overflow signal is asserted when the operation caused an
overflow.
The figure is really incomplete since it does not show how the overflow
signal is used.
For instance, the overflow signal should be examined to determine
the Less input signal for the least significant 1-bit ALU.
Zero signal.
The zero signal is asserted when the two operand values are equal,
bit for bit.
This is accomplished by subtracting the second operand from the first and
checking if the result is zero.
This check is accomplished by ORing the 32 data result signals and then
taking the complement of the OR result.
Thus, the zero signal is only asserted when all of the inputs to
the OR gate were deasserted.
see Carry-Lookahead Adder
Ripple-carry adders are considered too slow.
It is possible to make a complete adder with only two levels of logic
by formulating a complete logic equation for each result bit.
However, the logic would be very complex.
A compromise between these two approaches is the carry-lookahead
adder.
The generate indicates that the carry bit will be set if
the gi is set.
So a carry is generated when the gi condition is true.
The propagate indicates that the carry bit will be set if
the pi and ci are both set.
So the previous carry is propagated when pi is true.
The carry c(i+1) is set when either gi or pi and
ci are set.
We can precalculate a set of carry bits by using forward substitution.
Note that each of these calculations can be accomplished in two levels
of logic.
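Written out (the generate and propagate definitions plus the
forward-substituted carries, using . for and and + for or):
gi = ai . bi
pi = ai + bi
c1 = g0 + (p0 . c0)
c2 = g1 + (p1 . g0) + (p1 . p0 . c0)
c3 = g2 + (p2 . g1) + (p2 . p1 . g0) + (p2 . p1 . p0 . c0)
c4 = g3 + (p3 . g2) + (p3 . p2 . g1) + (p3 . p2 . p1 . g0)
        + (p3 . p2 . p1 . p0 . c0)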
see Figure 4.22: A plumbing analogy for carry lookahead for 1 bit, 2 bits,
and 4 bits using water, pipes, and valves.
The wrenches are turned to open (set) and close (clear) valves.
Water will come out of the pipe if the nearest g valve above the water
is open, or if a consecutive series of p valves is open and the g or c
valve preceding them is also open.
see Carry-Lookahead Adder (cont.)
This example shows 4-bit adders using carry-lookahead logic to construct
a 16-bit (not 32-bit) adder.
Each Pi represents a propagate of a 4-bit adder.
Each Gi represents the generate out of a 4-bit adder.
Look back at the c4 equation on the previous carry-lookahead slide.
The P0 is simply the last term without the c0.
The G0 is simply the first 4 terms.
The last 4 equations should look familiar.
They are the same as the last 4 equations on the previous slide.
The only difference is that we are dealing with generates
and propagates at a higher level of abstraction.
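Written out for the least significant 4-bit adder:
P0 = p3 . p2 . p1 . p0
G0 = g3 + (p3 . g2) + (p3 . p2 . g1) + (p3 . p2 . p1 . g0)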
see Figure 4.24: Four 4-bit ALUs using carry lookahead to form a
16-bit adder.
The propagates and generates are produced by each 4-bit
adder.
The carries come from the carry-lookahead unit.
Carry lookahead makes carries faster since fewer gates are required to
send the carry signal (for a 4-bit adder) ahead and the next adder in
the sequence can start calculating valid results sooner.
A ripple-carry adder would take 32 (16*2) gate delays to calculate
the carry out of the most significant bit of a 16-bit adder.
To calculate C4 of a 16-bit carry-lookahead adder would require
5 levels of logic.
So for our example, the carry-lookahead adder is 6.4 (32/5) times as
fast as the ripple-carry adder.
see Integer Multiplication
Addition can cause one carry out bit.
But a multiplication result can require twice as many bits as each of
its operands.
So the chances of overflow are greater.
We will see that multiplication is essentially a series of adds and
shifts.
show Figure 4.32: The third multiplication algorithm
Product0 indicates the least significant bit in the product.
The product is initialized to contain the value of the multiplier.
The product is right shifted at each repetition so we can test the least
significant bit to see if we should perform the addition.
Eventually the multiplier is completely shifted out of the product.
Note that when we are performing the right shift, the most significant
bit remains the same.
This is referred to as an arithmetic right shift and allows the algorithm
to work correctly when multiplying negative values.
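Here is a minimal C sketch (my own rendering, not the figure itself) of
the algorithm. The arithmetic right shift handles a negative multiplicand;
a signed multiplier would need the adjustment that Booth's algorithm
(below) provides. Shifting a signed value right is implementation-defined
in C but arithmetic on typical compilers.

    #include <stdint.h>

    int64_t multiply(int32_t multiplicand, uint32_t multiplier) {
        int64_t product = multiplier;   /* left half 0, right half multiplier */
        for (int i = 0; i < 32; i++) {
            if (product & 1)            /* test Product0 */
                product += (int64_t)multiplicand << 32;  /* add to left half */
            product >>= 1;              /* arithmetic right shift */
        }
        return product;
    }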
show Figure 4.33: Multiply example using third algorithm in Figure 4.32.
So the left half of the product is initialized to zero and the right half is
initialized to the multiplier.
The multiplicand just stays the same at each step.
We perform additions in the first two repetitions since the least
significant bit of the product is 1 at those points.
We don't perform additions in the last two repetitions since the least
significant bit of the product is then zero.
show Booth's Algorithm for Multiplication
A shift can be performed more cheaply than an add.
Runs of 0's don't require any additions.
The trick with Booth's algorithm is that a sequence of 1's can be handled
with 1 addition and 1 subtraction.
Note that this does not guarantee a performance improvement since we could
have a value of alternating zeroes and ones.
Booth's algorithm can be beneficial for machines that have multiplies that
require a varying number of cycles to complete.
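For example, the multiplier 30 is 011110 in binary. Handling the run of
four 1's directly would take four additions, but since 011110 = 100000 -
000010 (30 = 32 - 2), Booth's algorithm replaces them with one subtraction
(at the 0-to-1 boundary) and one addition (at the 1-to-0 boundary).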
see Integer Division
We will see that division is essentially a series of subtracts and shifts.
We will have to detect division by zero.
see Figure 4.40: The third division algorithm has just two steps.
Place the dividend in the right half of the remainder register.
We subtract the divisor register from the left half of the remainder.
If the result is >= 0, then we know that the divisor fit into the
dividend and we set the rightmost bit of the remainder to 1.
If the result is < 0, then we have to restore the previous value of the
remainder register by adding the divisor register to it (undo the subtract).
We keep doing this for 32 repetitions.
We finally have the quotient in the right half of the remainder register
and the remainder in the left half.
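Here is a minimal C sketch (my own rendering) of the restoring division
algorithm for unsigned operands. The shift is done at the top of the loop,
so the final corrective right shift of the left half is not needed in this
variant. It assumes the divisor is nonzero and less than 2^31 so the
shifted left half always fits.

    #include <stdint.h>

    void divide(uint32_t dividend, uint32_t divisor,
                uint32_t *quotient, uint32_t *remainder) {
        uint64_t rem = dividend;        /* right half dividend, left half 0 */
        for (int i = 0; i < 32; i++) {
            rem <<= 1;                           /* shift remainder register left */
            rem -= (uint64_t)divisor << 32;      /* subtract divisor from left half */
            if ((int64_t)rem >= 0)
                rem |= 1;                        /* divisor fit: set rightmost bit */
            else
                rem += (uint64_t)divisor << 32;  /* restore (undo the subtract) */
        }
        *quotient  = (uint32_t)rem;          /* right half */
        *remainder = (uint32_t)(rem >> 32);  /* left half */
    }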
see Figure 4.42: Division example using third algorithm in Figure 4.40
So the right half of the remainder is initialized to the dividend and the
left half is the sign extension of the dividend.
Step 0 also left shifts the remainder register by 1.
Steps 1 and 2 find that after subtracting the divisor from the left half
of the remainder register, the result is negative.
So the old value is restored and left shifted by 1.
Steps 3 and 4 find that after subtracting the divisor from the left half
of the remainder register, the result is positive.
So the value in the left half does not get restored and the whole register
value is left shifted.
Finally we right shift the left half of the remainder register by 1 to
get rid of the effect of that final left shift.
see General Forms of a MIPS Integer Multiply or Divide Instruction
Only 2 source registers are explicitly given.
The destinations of the integer multiply and divide are the implicit hi
and lo registers.
This pair of registers holds the 64-bit product for the multiply.
The high portion can be inspected to see if there is an overflow: a
nonnegative right half must have all zero bits in hi, and a negative
right half must have all one bits in hi.
For the divide, the lo portion contains the quotient and the hi
portion contains the remainder.
The last two instructions show how we can move the values from these special
registers to conventional registers.
The likely reason that special registers (hi and lo) are used
is due to multiply and divide instructions requiring several cycles to execute.
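Here is a C sketch (mine) of that hi-register overflow test: the 64-bit
product fits in 32 bits only when hi is the sign extension of lo.

    #include <stdint.h>

    int mult_overflows(int32_t a, int32_t b) {
        int64_t product = (int64_t)a * (int64_t)b;  /* what mult leaves in hi/lo */
        int32_t lo = (int32_t)product;
        int32_t hi = (int32_t)(product >> 32);      /* arithmetic shift assumed */
        return hi != (lo < 0 ? -1 : 0);  /* hi must be all ones or all zeros */
    }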
see Fallacies
-
Fallacy: Floating-point addition is associative; that is,
x+(y+z) = (x+y)+z.
The problem is that floating-point numbers are represented with limited
precision.
For instance, if the result's most significant set bit is a far distance
from its least significant set bit, then the least significant bits will
often be truncated away.
This can happen when x is a very large magnitude negative number,
y is a very large magnitude positive number, and z is a
small number.
Thus, the results can differ depending on the order in which the operations
are applied, which can inhibit many compiler optimizations.
This is less of a problem with integer operations, except for the possibility
of introducing overflow.
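To make the first point concrete, here is a small C sketch (mine) with
the kind of values just described.

    #include <stdio.h>

    int main(void) {
        float x = -1.0e38f, y = 1.0e38f, z = 1.0f;
        printf("%f\n", (x + y) + z);  /* prints 1.000000: x and y cancel first */
        printf("%f\n", x + (y + z));  /* prints 0.000000: z is absorbed into y */
        return 0;
    }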
-
Fallacy: Only theoretical mathematicians care about floating-point
accuracy.
Remember the Pentium, where errors on floating-point divides could occur
in the 4th to 15th decimal digits to the right of the decimal point.
This error cost Intel an estimated $300 million in recalled chips.