CompOrg Fall 2000 - Test #1 Review


Topics

Performance and Benchmarks
(Chapter 2)

  • Terminology
    • Performance
    • Execution Time
    • Throughput
    • Clock Cycle
    • Clock Rate
    • CPI (Cycles Per Instruction)
    • Instruction Count
    • Benchmark
    • Arithmetic Mean
    • Geometric Mean
    • Normalized Time
    • MIPS (Million Instructions per Second)
  • Performance and Execution Time Equations
  • Relative Performance
  • Improving Performance
    • Changing CPI
    • Changing Clock Rate
    • Changing Instruction Count
    • Changing Compilers
  • Benchmarks
    • Toy Programs vs. Real Applications
    • Summarizing - Arithemetic vs. Geometric Mean
    • Cheating
    • Spec95 (general idea - not details)
  • Using MIPS as a performance measure
    • Why it's a bad idea

Boolean Algebra and Logic Design
(Appendix B)

  • Boolean Algebra Operators
    • AND
    • OR
    • NOT
  • Boolean Algebra Expressions
  • Other Operators
    • NAND
    • NOR
  • Functionally complete set of operators (universal operators).
    • Prove NOR is a functionally complete set
    • Prove NAND is a functionally complete set
  • Laws of Boolean Algebra
    • Identities
    • Inverses
    • Zeros and Ones
    • Commutative
    • Associative
    • Distributive
    • DeMorgan's Laws
  • Boolean Functions
  • Truth Tables
  • Logic Gates
  • Bill Gates
  • Combinational Logic
    • Decoders
    • Multiplexors
    • Sum of Products form
    • Product of Sums form
    • Minimization
      • Be able to do some trival algebraic minimization
  • Sequential Logic
    • Difference between Combinational and Sequential Logic
    • S-R Latch
    • Clocked S-R Latch
    • D Flip-Flop

Questions from the book and answers

Problem 2.13

The key is to realize that relative performance is all that is asked for - and the problem states that the instruction count is the same for both machines and compilers.

Using C1

For M1, the execution time can be calculated as:


and for M1:


So M1 is faster by a factor of 16/14.5 = about 1.1 times faster than M2.

Using C2

For M1 the execution time now looks like this:


and for M1:


So with Compiler C2, M2 is about 1.1 times faster than M1.

Using C3

Repeat the same calculations as above, but use the instruction mix from compiler C3 and you get CPI of 5.4 for M1 and 2.8 for M2. Since compiler C3 provides the lowest CPI for either processor - compiler C3 is the one to use regardless of which machine you buy! Machine M1 is 1.04 times faster if you use compiler C3.

Problem 2.18

CPI for the machines can be calculated as:



Problem 2.26

Total Execution time for A is 1001, for B is 110 and for C is 40.

Computer C is the fastest

C is about 25 times faster than A

C is about 2.75 times faster than B


Problem 2.44

Without improvements the program spends 20 seconds on multiplication, 50 seconds for memory access and 30 for everything else.

If multiplication is 4 times faster, the multiplication part of the program will take on 5 seconds, and the total running time will be 85 seconds. This corresponds to a speedup of 100/85 = 1.18.

If memory is 2 times faster the memory accesses will now take 25 seconds. The total running time will be 75 seconds, for a speedup of 1.33.

If both improvements are made the total running time will be 5+25+30= 60 seconds for a speedup of 100/60 = 1.67.


Problem 2.45

The solution requires that the new program spend different percentages of time on the 4 types of operations (multiplication, memory, other) so that the speedup we see if we make multiplication 4 times faster is the same as the speedup we see if we make memory 2 times faster. A little algebra is required.

X is the percentage of time spent in multiplication

Y is the percentage of time spent for memory access

100-X-Y is the percentage of time spent doing other stuff

Without any improvements the program takes 100 seconds. To solve the problem we need to find values for X and Y such that the total time with either one improvement is the same.

With multiplication 4 times faster the new running time is:


With memory 2 times faster the new running time is:


Equating these two expressions and reducing reveals the following relationship between X and Y:


Using percentages of X=20% Multiplication, Y=30% Memory and 50% for other satisfies the conditions and will result in the same speedup for either improvement.


Problem B.3

We did this in class.

Problem B.4

To prove that NAND is a universal operator (a functionally complete set) we can show how to implement the primary operators OR, AND and NOT using just the NAND operator.

Here is the truth table for NAND:

ABA NAND B
001
011
101
110

To build a NOT using only NAND we just need to recognize that A NAND A is always the same as NOT A. So the following shows how we can use a NAND gate to implement NOT:


To build a OR using only NAND we can use DeMorgan's law to convert A OR B as follows:


Using this and our NAND based inverter, the following implements OR using only NAND gates:


To build AND using only NAND we just invert the output of a NAND gate:



Problem B.10

Don't worry if you can't do the 4th function, we haven't yet covered 2's complement representation.

Here is a truth table that includes all 4 functions:


And here are boolean equations:



Problem B.14

The diagram below shows what the switch does for each of the possible states of S:


Here is the truth table:


Here are boolean equations for C and D.