Test #2 will be given in class on Thursday, November
21st. The test is closed book: no notes or computers are allowed.
An IA32 Instruction reference will be distributed with the test
(this covers only the instructions that we have covered so
far). Here is the reference: cribsheet.pdf
A copy of figure 4.21 will also be provided during the test,
as well as a description of the Y86 instruction set.
It is expected that you know HCL (no reference will be provided).
| Topic |
What to know |
What to expect on the test |
|
| IA32 Assembly Language |
IA32 Assembly language (IA32 Instruction Set).
Subroutines (gcc/gas calling conventions).
|
Be able to write IA32 Assembly language programs and
subroutines (subroutines that will work with code generated
by gcc). You may be given a description of what the code
should do in English, or as a C program/function.
Given some IA32 Assembly language code, provide
C code that does the same thing.
Know the stack! Be able to answer questions about
how subroutines use the stack.
|
|
| Processor Architecture - Sequential |
Y86 Instruction Set
Simple Combinational Circuits
HCL
Memory and Clocking
Stages of Y86 Instructions:
- Fetch
- Decode
- Execute
- Memory
- Write Back
- PC Update
HCL for Y86 Sequential Implementation
|
Be able to rewrite some IA32 code segments using only
the Y86 instruction set.
Be able to describe some simple combinational circuits
using HCL.
Be able to describe the operation of simple memory elements
(registers), specifically how a value is read from a register and
how a new value is stored in a register (what signals are necessary,
when do things happen, etc.)
Be able to describe what happens during each of the 6 stages
for any Y86 Instruction, including new instructions (same idea as
the homework assignment).
Given Figure 4.21 and a list of controls, derive the HCL
expressions that describe each control to support some set
of instructions (or modify a given set of HCL expressions).
|
|
| Pipelining and a Y86 Pipeline |
Pipelining and its impact on performance.
Pipeline registers
Data Hazards
Control Hazards
Y86 Pipeline
- modification to Y86 PC Update stage
- conditional branching issues
- why is RET special?
|
Given a description of a sequential implementation of an
instruction set, and a pipelined version, be able to discuss the
potential performance improvements.
Be able to discuss data and control hazards (general issues).
Be able to identify data and control hazards in a sequence of
Y86 instructions (assuming the pipeline described in the Text).
Be able to discuss ways to (attempt to) avoid stalling the pipeline given
a sequence of Y86 instructions.
Be able to describe why the pipelined version of Y86 needs to
have the PC Update stage moved to the beginning.
|
|
| Code Optimization |
Loop inefficiencies.
C function calls (and why compilers don't
attempt to reduce the number of function calls).
Avoiding memory references
|
Be able to predict the relative performance of
two code segments (C or assembly language).
Given some C/IA32 code, be able to make some
improvements and discuss the impact on performance.
Given a simple loop, be able to do some (simple) loop unrolling.
Given some code, be able to discuss what optimizations
are possible, and which of these could be handled by a compiler.
|
|
| Memory Hierarchy and Caching |
Basic concepts related to the access time of
DRAM, SDRAM and hard disks.
General issues related to memory hierarchies
Principles of Temporal and Spatial Locality
Cache Architecture
- Blocks and Slots
- Direct Mapping
- Set-Associative
- Tag and Valid bits
- Replacement Policy
- Write Policy
Cache Friendly Code
- Memory access patterns
- The "Memory Mountain" (Section 6.6.1)
|
Be able to describe the timings involved in accessing
SDRAM, DRAM and hard disk.
Be able to describe the principles of locality, and
why these matter to cache designers.
Be able to determine the size of a cache (total number of
bits of storage required) given a cache design.
Be able to compare two versions of some code in terms of
memory access patterns and which would be better if a cache is
used.
Be able to discuss the tradeoffs of implementing various
cache designs (direct-mapped vs. set-associative, replacement and write policies).
Be able to make some predictions about the performance of
code given specific cache designs (code like the code used to produce the
"memory mountain").
|
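As a rough illustration of the "size of a cache" calculation mentioned in the last row of the table, here is a minimal C sketch. The function names `log2u` and `cache_total_bits` and the example parameters are hypothetical and not taken from the course materials. The sketch assumes each cache line stores one valid bit, the tag, and the data block, and it ignores any extra bits a particular design might keep for its replacement or write policy.

```c
#include <assert.h>

/* Integer log base 2. Assumes x is an exact power of two. */
static unsigned log2u(unsigned long x) {
    unsigned n = 0;
    assert(x != 0 && (x & (x - 1)) == 0);
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/*
 * Total storage bits for a cache (hypothetical helper):
 *   sets        - number of sets (power of two)
 *   lines       - lines (slots) per set; 1 means direct-mapped
 *   block_bytes - bytes per block (power of two)
 *   addr_bits   - width of a memory address in bits
 * Counts only valid, tag, and data bits for each line.
 */
unsigned long cache_total_bits(unsigned long sets, unsigned long lines,
                               unsigned long block_bytes, unsigned addr_bits) {
    unsigned index_bits = log2u(sets);         /* selects the set */
    unsigned offset_bits = log2u(block_bytes); /* selects the byte in a block */
    assert(addr_bits >= index_bits + offset_bits);
    unsigned tag_bits = addr_bits - index_bits - offset_bits;
    unsigned long bits_per_line = 1 + tag_bits + 8 * block_bytes;
    return sets * lines * bits_per_line;
}
```

For example, with 32-bit addresses, a direct-mapped cache with 256 sets and 16-byte blocks has 8 index bits, 4 offset bits, and 20 tag bits, so `cache_total_bits(256, 1, 16, 32)` gives 256 * (1 + 20 + 128) = 38144 bits.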
Practice Problems
Practice Problem 3.3 (know leal!)
Practice Problem 3.24 (Buffer Overflow)
Problem 3.31 (Assembly to C)
Problem 3.36 (Assembly to C)
Practice Problem 4.3 (Y86 Instruction Set)
Practice Problem 4.8 (HCL)
Practice Problem 4.14 (HCL)
Practice Problem 4.21 (Pipelining)
Practice Problem 4.22 (Pipeline Performance)
Identify the hazards in the following code, and show any
stall cycles that need to be inserted in order to avoid all the hazards:
irmovl $100,%edx
addl %edx,%eax
rmmovl %eax,-12(%ebp)
addl %eax,%eax
jle foo
rrmovl %edx,%eax
foo:
Problem 5.14 (Loop Unrolling)
Problem 5.19 (Performance & Optimization)
Practice Problem 6.3 (Disk Access Time)
Practice Problem 6.5 (Spatial Locality)
Practice Problem 6.6 (Cache Design)
Practice Problems 6.15, 6.16, 6.17 (also in Lecture Notes)
Practice Problems 6.18, 6.19 (Memory Mountain)
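To practice comparing two versions of some code by their memory access patterns, a minimal C sketch such as the following may help. The function names and the array dimensions are hypothetical and chosen only for illustration. Both functions compute the same sum. The first walks the array in the order C stores it (row by row, stride 1), while the second walks down columns (stride `COLS`), so the first would generally be expected to make better use of spatial locality. Actual timings depend on the cache design, the array size, and the compiler.

```c
#include <stddef.h>

#define ROWS 64
#define COLS 64

/* Stride-1 access: the inner loop moves along a row, which matches
 * C's row-major layout, so consecutive accesses fall in the same block. */
long sum_row_major(int a[ROWS][COLS]) {
    long sum = 0;
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            sum += a[i][j];
    return sum;
}

/* Stride-COLS access: the inner loop moves down a column, so
 * consecutive accesses are COLS * sizeof(int) bytes apart. */
long sum_col_major(int a[ROWS][COLS]) {
    long sum = 0;
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            sum += a[i][j];
    return sum;
}
```

A useful exercise is to assume a specific cache (say, direct-mapped with a given block size) and count the misses each version would incur by hand before measuring anything.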