K. Olukotun                                                         Handout #
Fall 98/99                                                          EE282H

EE282H Programming Assignment #2

Pipelined MIPS-Lite Verilog Model
Due: Tuesday, Nov 10, 1998

1.0 Overview

The purpose of this assignment is to familiarize you with basic pipelining, hazards, and interlocks. You are required to pipeline the Verilog MIPS-Lite model. A working model that executes one instruction every 5 cycles, as in assignment #1 (but with some changes to make the model easier to pipeline), will be provided. You are expected to work in groups of two people. Make sure to check the class web page http://www-leland.stanford.edu/class/ee282h/ (and possibly the newsgroup) regularly for extra information regarding this assignment.


2.0 What To Turn In

As with Programming Assignment #1, everything will be submitted electronically. Follow the instructions given in Programming Assignment #1. You should submit:


3.0 Getting Started

To do this assignment, we recommend following the steps listed below.

3.1 Implement the PC, IR and RD chains.

In the model you are given, the logic to calculate RD has already been moved from the WB stage to the ID stage and the variables for use in creating the chains have already been declared. Change the control so that a new instruction is fetched every cycle and entered into the pipeline. This will involve removing the finite state machine used in Programming Assignment #1.

To implement the state chains, a parameterized master-slave flip-flop named propagate is included in the ff.v file. The chains are physically realized by instantiating this flop multiple times in your code. The stall control line on each flop forces it to hold its current state while the line is true.

Since the flop is parameterized, you can use it to specify various width parts in the following manner:

As you can see, the width parameter is specified as #(width). For more info, check out ff.v.
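
Because ff.v is not reproduced in this handout, the following is only a sketch of what a parameterized flop with a stall input might look like and how a width is passed with #(width). The module name propagate_sketch, its port names and port order, and the signals pc2, pc3, rd2 and rd3 are assumptions made for illustration. The sketch is also a plain edge-triggered flop rather than a master-slave one. Check ff.v for the real interface of propagate before instantiating it.

```verilog
// Hypothetical stand-in for the flop in ff.v (NOT the real propagate module).
module propagate_sketch #(parameter WIDTH = 1) (
    input                  clk,
    input                  stall,  // while true, hold the current state
    input      [WIDTH-1:0] d,
    output reg [WIDTH-1:0] q
);
    always @(posedge clk)
        if (!stall)
            q <= d;                // advance only when not stalled
endmodule

// One link of a 32-bit PC chain and a 5-bit RD chain (names are assumed).
module chain_link_sketch (
    input         clk,
    input         stall,
    input  [31:0] pc2,
    input  [4:0]  rd2,
    output [31:0] pc3,
    output [4:0]  rd3
);
    propagate_sketch #(32) pc_ff (clk, stall, pc2, pc3);
    propagate_sketch #(5)  rd_ff (clk, stall, rd2, rd3);
endmodule
```

Each additional pipeline stage would add another pair of instantiations, with the output of one flop feeding the input of the next.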

After completing this step, you should have a functional pipelined model that does not interlock or bypass results. Without bypassing, the sw in the code sequence:

add r4, r3, r2
sw r4, 4(r29)
will store the old value of r4 rather than the value just calculated in the add. Because there is no interlocking either, your model will have one load delay slot and one branch delay slot. Follow the instructions in the testing section below to test that your model is behaving properly at this stage. Make sure you verify that your model works as expected after each step before continuing!

3.2 Implement bypassing.

You must eliminate all the data hazards that can be eliminated with bypassing. Bypassing is required whenever an instruction requires a value from a register which an instruction further down in the pipeline has changed but not written back to the register file.
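
The bypass condition described above can be sketched as a pair of register-number comparisons feeding a priority mux. This is only an illustration for one source operand. Every module, port and signal name below is hypothetical and is not taken from cpu.v. The sketch also assumes that writes to register 0 are never bypassed, since r0 always reads as zero in MIPS.

```verilog
// Sketch of the select logic for one source operand.
// All names are hypothetical; they are not taken from cpu.v.
module bypass_select_sketch (
    input  [4:0]  rs,          // source register being read
    input  [4:0]  rd_ex,       // destination of the instruction in EX
    input         wr_ex,       // that instruction writes a register
    input  [4:0]  rd_mem,      // destination of the instruction in MEM
    input         wr_mem,      // that instruction writes a register
    input  [31:0] regfile_val, // value read from the register file
    input  [31:0] ex_val,      // result available from EX
    input  [31:0] mem_val,     // result available from MEM
    output [31:0] operand
);
    wire hit_ex  = wr_ex  && (rd_ex  != 5'd0) && (rd_ex  == rs);
    wire hit_mem = wr_mem && (rd_mem != 5'd0) && (rd_mem == rs);

    // The most recent producer (EX) takes priority over the older one (MEM).
    assign operand = hit_ex  ? ex_val  :
                     hit_mem ? mem_val :
                               regfile_val;
endmodule
```

Whether a value in the WB stage also needs a bypass path depends on when the register file in the provided model is written relative to the ID-stage read, so check that timing before deciding how many paths you need.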

At this point, your model can be tested again. Again, follow the instructions in the testing section and verify that your model works as expected before continuing.

3.3 Implement interlocks.

Interlocks will eliminate the need for load and branch delay slots. To eliminate the branch delay slot you should use a predict not taken scheme. When a branch is in the ID stage, the instruction at PC+4 has already been fetched and is sitting in the IF stage. If the branch is taken, this next instruction must be squashed before it enters the ID stage. You can do this simply by setting IR2 to a nop (32 zeros).
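
As a minimal sketch of the squash described above, the value loaded into IR2 can be chosen by a single mux. The names branch_taken, fetched_instr and ir2_next are assumptions for illustration only; the corresponding signals in cpu.v may be named and generated differently.

```verilog
// Sketch: choose what enters IR2 at the end of the cycle.
// All names are hypothetical; they are not taken from cpu.v.
module squash_sketch (
    input         branch_taken,  // branch in ID resolved as taken
    input  [31:0] fetched_instr, // instruction at PC+4, currently in IF
    output [31:0] ir2_next       // value to load into IR2
);
    // A taken branch replaces the fetched instruction with a nop (32 zeros).
    assign ir2_next = branch_taken ? 32'b0 : fetched_instr;
endmodule
```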

Load interlocks occur when a load is immediately followed by a use. Take the following code sequence:

lw r2, 4(r29) 
add r4, r3, r2
When the add is in the ID stage you must detect whether or not one of its operands will be generated by the load currently in the EX stage. If this is the case you must stall the add instruction and force a nop into the IR3 stage.
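
The load interlock check can be sketched as follows. The port names, including uses_rs, uses_rt and load_in_ex, are hypothetical and would have to be derived from the decode logic in your own model. The rs and rt field positions are the standard MIPS ones (bits 25:21 and 20:16).

```verilog
// Sketch of load-use detection for the instruction in ID.
// All names are hypothetical; they are not taken from cpu.v.
module load_interlock_sketch (
    input  [31:0] ir2,        // instruction in ID (the potential use)
    input         uses_rs,    // ID instruction reads rs
    input         uses_rt,    // ID instruction reads rt
    input         load_in_ex, // instruction in EX is a load
    input  [4:0]  rd_ex,      // register the load will write
    output        stall,      // hold the instruction in ID
    output [31:0] ir3_next    // value to load into IR3
);
    wire [4:0] rs = ir2[25:21];
    wire [4:0] rt = ir2[20:16];

    wire match = (uses_rs && (rs == rd_ex)) ||
                 (uses_rt && (rt == rd_ex));

    // Loads into r0 never create a dependence.
    assign stall    = load_in_ex && (rd_ex != 5'd0) && match;

    // While stalled, a nop (32 zeros) enters EX instead of the instruction in ID.
    assign ir3_next = stall ? 32'b0 : ir2;
endmodule
```

For the lw/add sequence above, rd_ex is 2 and the add reads r2 through its rt field, so stall would be asserted for one cycle. The same stall signal would typically also hold the earlier stages, for example through the stall input of the chain flops.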

Caveat! This could turn into a very difficult assignment if you make more changes to the model than necessary. The assignment can be completed by making changes to cpu.v and adding two more modules for the ID stage and EX stage bypassing. If you have to make any changes to the other modules in the model, they should be trivial in nature.


4.0 Testing your model

As you complete the steps described above, it is possible to test the model to make certain that it is behaving properly. In the testcode directory, you will find the following test programs:

Please note that these programs do not test all cases of bypassing and interlocking. However, you are expected to implement all cases in your model.

4.1 Testing the Pipeline

After completing step 3.1 from above, you should have a pipelined model with no bypassing or interlocks. This can be tested with the program add_nop.s. Compile it by typing:

compile282h add_nop.s 
add_nop.s is identical to the add.s program except that it has nops to avoid all pipeline hazards. Run this program through the Verilog simulator and check that instructions are being fetched every cycle, and that the correct results are being produced.

4.2 Testing Bypassing

Once you have implemented bypassing, you should be able to run all the test programs. However, the programs must be compiled with a special flag set. For example, to compile the add.s program type:

compile282h add.s delay 
The delay flag will tell the assembler that there are load and branch delay slots. The assembler will automatically try to fill the delay slots with a useful instruction. If it cannot fill a delay slot with a useful instruction, it will insert a nop thereby avoiding the need for an interlock. By using the delay switch on the compile282h script files you can fully test your bypassing logic.

4.3 Testing Interlocks

You should now compile your programs without the delay flag. The assembler will no longer fill the delay slots and instead will rely on your model to interlock correctly. First get the simple interlock.s program working and then try the bubble and quick programs to give your code a more rigorous test. We also recommend that you write and run some tests of your own, so that you can verify that cases not covered by the programs that we provide work as well! You should identify all possible bypassing paths and interlocks, and verify that each has been implemented correctly.


5.0 MIPS-Lite Debugging Utilities

This information is available online.


6.0 Details

6.1 Setup

The files needed for this programming assignment are located in /usr/class/ee282h/p2. As in the previous assignment, you will find two subdirectories:

To make your own copy of the p2 directory structure and all the files, change to your own ee282h directory and type:
cp -r /usr/class/ee282h/p2 .
6.2 Compiling Test Programs

In order to compile test programs, log in to one of the SGIs (firebirds and raptors). The SGIs use a MIPS processor, so the C compiler will generate the correct object code for our Verilog model. Do not try to compile test programs on one of the SPARC workstations. If you ever get tons of errors when compiling a test program, the first thing you should check is that you are logged on to an SGI.

Once you are logged on to an SGI, go to the testcode directory. In the testcode directory there are several sample test programs as well as two scripts which compile the test programs for you.

For this assignment it is likely that you will want to write your own test programs. In order to write test programs, it is important to understand how the memory system of the Verilog model works. In the MIPS-Lite model, memory is a simple array of 0x4000 32-bit words. When Verilog is started, the model searches for two files in the verilog directory: text_segment and data_segment. The MIPS-Lite model initializes the memory array with the text segment starting at location zero and the data segment immediately following.

The model then forces the PC to address 0xffe4 near the end of memory (defined as the reset vector). From this location, the model executes a jump and link (jal) to location 0x0000 and begins executing the user program. At the end of the user code there must be a jr 31 instruction which returns to the reset vector. This will signal the simulation to stop.
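
The address arithmetic implied above can be checked with a small sketch. It assumes the PC holds byte addresses, which is consistent with 0xffe4 lying inside a memory of 0x4000 words (0x10000 bytes). The names mem, reset_vector and reset_word are hypothetical and are not the names used in the provided model. Under that assumption the reset vector corresponds to word index 0x3ff9, 7 words below the top of the array.

```verilog
// Sketch: relate the byte-addressed reset vector to the word array.
// All names are hypothetical; they are not taken from the provided model.
module memmap_sketch;
    localparam WORDS = 'h4000;             // 0x4000 32-bit words

    reg  [31:0] mem [0:WORDS-1];           // word indices 0x0000 to 0x3fff

    wire [15:0] reset_vector = 16'hffe4;   // byte address of the reset vector
    wire [13:0] reset_word   = reset_vector[15:2]; // drop the 2 byte-offset bits

    initial begin
        #1; // let the continuous assignments settle
        $display("reset vector word index = 0x%h", reset_word);
    end
endmodule
```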

6.2.1 Compiling C test programs

Use the handy script compile282h located in the testcode directory. Given a C program bubble.c, compile the program by typing:

compile282h bubble.c 
The compile282h script will generate the following files:

The bubble.text and bubble.data files are automatically copied from the testcode directory into the verilog directory as text_segment and data_segment, respectively. When writing a C program, make sure that the main() procedure is the first procedure in the file in order to ensure that it will be first in the text_segment. Also be sure to initialize all global data; otherwise the compiler will allocate the variable on the heap. Our script files are not set up to handle heap-allocated data, so it will not be accessed correctly.

When compiling programs, you may get the following warning message:

as1: Warning: temp.s, line 20: nop required
You can ignore this warning message. You see this message because the MIPS R3000 has load and branch delay slots and our script file tells the MIPS assembler that it should ignore these delay slots. The message simply warns you that the code will not work on a MIPS R3000 unless you put a nop in the delay slot. Since our simulator does not have delay slots, you should ignore this warning.

Also, if you have problems running the compile282h command, or it complains about not being able to find loadcore.pl, you should check that your path is set properly.

(Make sure you source /usr/class/ee282h/setup to put loadcore.pl in your path, and either put ./ in your path or type ./compile282h.)

6.2.2 Compiling MIPS assembly test programs

Use the same script file compile282h located in the testcode directory. Given a MIPS assembly program add.s compile the program by typing:

compile282h add.s 
When compiling assembly files, the compile282h script will produce the following files:

Remember to put a jr 31 instruction at the end of your program so that the simulation will terminate correctly. When in doubt, follow the example programs provided in the testcode directory.

6.2.3 Running previously compiled programs

If you want to rerun a test program which has already been compiled, it does not have to be recompiled. The *.data and *.text files from the testcode directory simply have to be copied into the verilog directory. To do this, a third script file called run has been provided. For example, if you wanted to simulate bubble.c in the Verilog model and had previously compiled it using compile282h, then you would go to the testcode directory and type:

run bubble

6.3 Running the Verilog MIPS-Lite model

To run the Verilog model change to the verilog directory. Type:

verilog -f master
This will compile all of the Verilog source files for the MIPS-Lite model. There are also three command line arguments (+waves, +regs, and +output) which allow you to use other Verilog features. You can use any combination of these options. To use all three, type:
verilog -f master +waves +regs +output

Once Verilog has started up, type . [return] and you will run the most recently compiled program. By clicking on the buttons of the graphical register display, it is possible to step through the execution of the program. You can see what is happening during every cycle and look at the values in all the registers. To exit Verilog, type [Ctrl-D].


7.0 MIPS-Lite Instruction Set Architecture Summary

This information is available online.