CSCI 237

Computer Organization


Lab 3: Extending the Y86-64 Simulator

Assigned Mar 6/7, 2024
Prelim Due Date Mar 15 at 11:59pm. Part A must be completed. Make sure you submit (using submit237) working versions of sum.ys, rsum.ys, and copy.ys.
Final Due Date Apr 5 at 11:59pm. All three parts must be completed. Make sure you submit (using submit237) seq-full.hcl and pipe-nt.hcl (in addition to final versions of sum.ys, rsum.ys, and copy.ys in Part A). As a reminder, we have a midterm in lab on Mar 13/14th, and spring break after that. Please manage your time on this lab carefully!
Files lab3.tar
Submissions Submit your solutions using
submit237 3 sum.ys rsum.ys copy.ys seq-full.hcl pipe-nt.hcl.

Overview

In this lab, you will learn about the design and implementation of a Y86-64 processor. The lab is organized into three parts. In Part A you will write some simple Y86-64 programs and become familiar with the Y86-64 tools. In Part B, you will extend the SEQ simulator with a new instruction. In Part C, you will modify the branch prediction algorithm used in PIPE.

Instructions

To fetch the source files for this lab, right click on lab3.tar and choose "Save As" to download the tarball to your local directory. You may want to move this file to a more desirable location using the Unix mv command. Alternatively, if you are using SSH to work remotely, you may want to use wget to fetch the file. Once you have used cd to navigate to the desired directory on a lab machine, you can use the following command to fetch the tarball:

         $ wget http://dept.cs.williams.edu/~jeannie/cs237/labs/lab3/lab3.tar 

Extract the contents using the following command:

         $ tar xvf lab3.tar    

This will cause the following files to be unpacked into the lab3 directory: README, Makefile, sim.tar, and simguide.pdf. Next, type the commands:

         $ cd lab3
         $ tar xvf sim.tar 

This will create the directory sim, which contains your personal copy of the Y86-64 tools. You will be doing all of your work inside this directory. Finally, change to the sim directory and build the Y86-64 tools:

         $ cd sim
         $ make clean; make 

Part A: Programming in Y86-64

You will be working in directory sim/misc in this part. Your task is to write and simulate the following three Y86-64 programs. The required behavior of these programs is defined by the functions in examples.c. Be sure to put your name in a comment at the beginning of each program. You can test your programs by first assembling them with the assembler YAS (for example, ./yas sum.ys), and then running them with the instruction set simulator YIS (for example, ./yis sum.yo). If you forgot how to use these tools, reviewing the lecture slides may be helpful.

In all of your Y86-64 functions, you should follow the x86-64 conventions for passing function arguments, using registers, and using the stack. This includes saving and restoring any callee-save registers that you use.

For reference, here is the len.ys example from class. Also, here is a nice web-based graphical Y86-64 simulator.

Program 1: Iteratively sum linked list elements (sum.ys)

Write a Y86-64 program sum.ys that iteratively sums the elements of a linked list. Your program should consist of code that sets up the stack, invokes a function, and then halts. The function (sum_list) should be Y86-64 code that is functionally equivalent to the C sum_list function shown in examples.c. Note that you do not need a Main function as we had in our example in class, but you do need to make sure the address of the first list element is passed to your function. Test your program using the following three-element list:

	# Sample linked list
	.align 8
	ele1:
		.quad 0x00a
		.quad ele2
	ele2:
		.quad 0x0b0
		.quad ele3
	ele3:
		.quad 0xc00
		.quad 0
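
As a sanity check on the expected result, the sample list can be mirrored in C and run through sum_list from examples.c. This is only a sketch: the C variable names and the helper expected_sum are made up for illustration and are not part of the lab files. The three values occupy disjoint hex digits, so the sum is 0xcba, which is the value %rax should hold when sum.ys halts.

```c
#include <assert.h>
#include <stddef.h>

/* Linked list element, as declared in examples.c */
typedef struct ELE {
    long val;
    struct ELE *next;
} *list_ptr;

/* sum_list, as given in examples.c */
long sum_list(list_ptr ls)
{
    long val = 0;
    while (ls) {
        val += ls->val;
        ls = ls->next;
    }
    return val;
}

/* C mirror of the ele1/ele2/ele3 data block (names chosen to match the labels) */
struct ELE ele3 = { 0xc00, NULL };
struct ELE ele2 = { 0x0b0, &ele3 };
struct ELE ele1 = { 0x00a, &ele2 };

/* Hypothetical helper: the value sum.ys is expected to return in %rax */
long expected_sum(void)
{
    long total = sum_list(&ele1);
    assert(total == 0x00a + 0x0b0 + 0xc00);
    return total;
}
```

Calling expected_sum() from a small main and printing the result in hex gives a value to compare against the %rax line in the YIS output.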
	

Program 2: Recursively sum linked list elements (rsum.ys)

Write a Y86-64 program rsum.ys that recursively sums the elements of a linked list. This code should be similar to the code in sum.ys, except that it should use a function rsum_list that recursively sums a list of numbers, as shown in the C function rsum_list. Test your program using the same three-element list you used for testing sum.ys.
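
One way to picture what the recursive version has to do at the machine level is to emulate the recursion with an explicit stack. The sketch below is not rsum_list itself: rsum_list_explicit and MAX_DEPTH are hypothetical names, and the fixed depth bound is an assumption made to keep the example short. Each saved value plays the role of a val that must survive the recursive call, which in Y86-64 usually means pushing a callee-saved register before the call and popping it afterward.

```c
#include <assert.h>
#include <stddef.h>

/* Linked list element, as declared in examples.c */
typedef struct ELE {
    long val;
    struct ELE *next;
} *list_ptr;

#define MAX_DEPTH 64 /* assumed bound on list length, for this sketch only */

/* Emulates rsum_list with an explicit stack of saved values. */
long rsum_list_explicit(list_ptr ls)
{
    long saved[MAX_DEPTH];
    int sp = 0;

    /* "Call" phase: one saved val per non-NULL node, like one stack frame each */
    while (ls) {
        assert(sp < MAX_DEPTH);
        saved[sp++] = ls->val;
        ls = ls->next;
    }

    /* Base case returns 0; "return" phase adds each saved val to the result */
    long result = 0;
    while (sp > 0)
        result = saved[--sp] + result;
    return result;
}

/* C mirror of the three-element test list */
struct ELE node3 = { 0xc00, NULL };
struct ELE node2 = { 0x0b0, &node3 };
struct ELE node1 = { 0x00a, &node2 };
```

For the three-element list, three values are saved on the way down and the additions happen on the way back up, starting from the last element.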

Program 3: Copy a source block to a destination block (copy.ys)

Write a program copy.ys that copies a block of words from one part of memory to another (non-overlapping) area of memory, computing the checksum (xor) of all the words copied. Your program should consist of code that sets up a stack frame, invokes a function copy_block, and then halts. The function should be functionally equivalent to the C function copy_block. Test your program using the following three-element source and destination blocks:

	
	.align 8
	# Source block
	src:
		.quad 0x00a
		.quad 0x0b0
		.quad 0xc00
	# Destination block
	dest:
		.quad 0x111
		.quad 0x222
		.quad 0x333
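
The expected outcome for these blocks can be worked out with a C mirror of the test data. This is a sketch under the assumption that copy_block is called with len equal to 3; the array names src_blk and dest_blk and the helper expected_checksum are invented for illustration. Because the three source words have no bits in common, their xor is 0xcba, and afterward the destination block should hold the source values.

```c
#include <assert.h>

/* copy_block, as given in examples.c */
long copy_block(long *src, long *dest, long len)
{
    long result = 0;
    while (len > 0) {
        long val = *src++;
        *dest++ = val;
        result ^= val;
        len--;
    }
    return result;
}

/* C mirror of the src and dest data blocks */
long src_blk[3]  = { 0x00a, 0x0b0, 0xc00 };
long dest_blk[3] = { 0x111, 0x222, 0x333 };

/* Hypothetical helper: the checksum copy.ys is expected to return in %rax */
long expected_checksum(void)
{
    long sum = copy_block(src_blk, dest_blk, 3);
    /* After the copy, dest must match src word for word */
    for (int i = 0; i < 3; i++)
        assert(dest_blk[i] == src_blk[i]);
    return sum;
}
```

When checking copy.ys in YIS, both the %rax value and the changed memory words at dest are worth comparing against these results.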
	

For reference, here is sim/misc/examples.c:

	/* linked list element */
	typedef struct ELE {
		long val;
		struct ELE *next;
	} *list_ptr;

	/* sum_list - Sum the elements of a linked list */
	long sum_list(list_ptr ls)
	{
		long val = 0;
		while (ls) {
			val += ls->val;
			ls = ls->next;
		}
		return val;
	}

	/* rsum_list - Recursive version of sum_list */
	long rsum_list(list_ptr ls)
	{
		if (!ls)
			return 0;
		else {
			long val = ls->val;
			long rest = rsum_list(ls->next);
			return val + rest;
		}
	}

	/* copy_block - Copy src to dest and return xor checksum of src */
	long copy_block(long *src, long *dest, long len)
	{
		long result = 0;
		while (len > 0) {
			long val = *src++;
			*dest++ = val;
			result ^= val;
			len--;
		}
		return result;
	}
	

Part B: Extending the SEQ Processor

You will be working in directory sim/seq in this part. Your task in Part B is to extend the SEQ processor to support the iaddq instruction, described in Homework problems 4.51 and 4.52. To add this instruction, you will modify the file seq-full.hcl, which implements the version of SEQ described in the CSAPP textbook. The file also contains declarations of some constants that you will need for your solution.
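
It can help to pin down what iaddq should compute before touching the HCL. The C model below is a sketch, not the simulator's code: it assumes that iaddq V, rB adds the constant V to register rB and sets the condition codes the same way addq does (check the homework problem statement for the authoritative description). The names iaddq_model and cond_codes are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Condition codes set by an arithmetic instruction */
typedef struct {
    bool zf; /* result is zero */
    bool sf; /* result is negative */
    bool of; /* signed overflow occurred */
} cond_codes;

/* Hypothetical model of "iaddq V, rB": returns the new value of rB
   and fills in the condition codes. */
int64_t iaddq_model(int64_t v, int64_t rb, cond_codes *cc)
{
    /* Add as unsigned so wraparound is well defined in C; the conversion
       back assumes the usual two's-complement behavior. */
    int64_t result = (int64_t)((uint64_t)rb + (uint64_t)v);

    cc->zf = (result == 0);
    cc->sf = (result < 0);
    /* Overflow: both operands share a sign that the result does not */
    cc->of = ((v < 0) == (rb < 0)) && ((result < 0) != (v < 0));
    return result;
}
```

In terms of the SEQ stages, the constant plays the role of valC and the old register value plays the role of valB, so the model corresponds to computing valE from those two inputs in the execute stage.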

Your HCL file must begin with a header comment containing the following information:

Building and Testing Your Solution

Once you have finished modifying the seq-full.hcl file, you will need to build a new instance of the SEQ simulator (ssim) based on this modified HCL file, and then test it. This process is described below.

Part C: Modifying the PIPE Processor

Your task in Part C is to modify the PIPE processor as described in Homework problem 4.55. The current PIPE processor predicts that all branches will be taken, which basically means that the conditions of if statements are assumed to be true. Note that when the processor predicts incorrectly, the pipeline must be corrected by bubbling/stalling. In Part C, you will modify PIPE so that it predicts that conditional branches will not be taken. You will modify the file pipe-nt.hcl in the sim/pipe directory. There are comments in the file that start with "BNT" that will help.

NOTE: The PIPE diagram (Fig 4.52) in the textbook is incorrect. There should be an additional connection between memory value E (M_valE) and the logic for selecting the next PC (Select PC). Luckily the pipe-nt.hcl source code correctly includes this connection; only the diagram is incorrect.

Your modified file should begin with a header comment with the following information:

If you need some hints, the following list describes what you need to do. If you want a challenge, skip this section!

  1. Fetch: f_pc is the incoming PC signal used to fetch an instruction from instruction memory. It has logic for a mispredicted branch based on the default all-branches-taken policy: M_icode == IJXX (we executed a branch 3 cycles ago) and !M_Cnd (and we shouldn't have done so). It reads from the M register outputs because that is where the result of the condition code check is known. Change it so that a misprediction occurs when we executed a conditional jump (ifun was not UNCOND) and we should have taken the branch (instead of should not have taken it). If there was a misprediction, get the new address from the M register field that holds it once you have made the Execute change in hint 3. (Note: There is no BNT comment for this change. Look for f_pc in Fetch.)

  2. Fetch: f_predPC is the predicted PC. It predicts valC (the immediate value) for all branches right now. Change it so it predicts valP instead of valC for conditional jumps. You should still predict valC for unconditional jumps.

  3. Execute: Misprediction is detected in the memory stage, so we need to make sure the correct destination is available when we detect it. That means getting valC (the immediate value) up to the M pipeline register. valC already makes it to the E pipeline register; we just need to get it through the execute stage. Hint: Think about running valC through the ALU. What should aluA and aluB be if we simply want to compute valC + 0?

  4. Pipeline Register Control: Misprediction means we have to "bubble" the pipeline. We notice misprediction in the memory stage, and correct it in the fetch stage, meaning decode and execute were doing the wrong actions and need to be erased (bubbled). Luckily, most of the code for the bubbling is already in place; we just need to update the logic for a misprediction. You'll see logic much like the f_pc misprediction logic in D_bubble and E_bubble. Change it to match your changed f_pc misprediction case. (Note: There is no BNT comment for this change.)
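
The decisions described in the hints above can be modeled in C to check your understanding before editing the HCL. This is a sketch, not a translation of pipe-nt.hcl: the function and enum names are hypothetical, the handling of ret in PC selection is left out, and it assumes (as in the textbook's PIPE) that call also predicts valC. The numeric codes are the Y86-64 encodings for jXX (7), call (8), and the unconditional function code (0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Y86-64 encodings; the enum names are chosen for this sketch */
enum { ICODE_JXX = 0x7, ICODE_CALL = 0x8, IFUN_UNCOND = 0x0 };

static bool is_cond_jump(int icode, int ifun)
{
    return icode == ICODE_JXX && ifun != IFUN_UNCOND;
}

/* Hint 2: predicted PC under branch-not-taken.
   call and unconditional jmp still predict valC; everything else valP. */
uint64_t predict_pc_nt(int icode, int ifun, uint64_t valC, uint64_t valP)
{
    if (icode == ICODE_CALL)
        return valC;
    if (icode == ICODE_JXX && ifun == IFUN_UNCOND)
        return valC;
    return valP;
}

/* Hints 1 and 4: a misprediction is a conditional jump whose condition
   turned out to hold, since we guessed "not taken". */
bool mispredicted_nt(int icode, int ifun, bool cnd)
{
    return is_cond_jump(icode, ifun) && cnd;
}

/* Hints 1 and 3: on a misprediction, fetch from the branch target, which
   was carried to the memory stage as valE (valC + 0). Otherwise keep
   the predicted PC. */
uint64_t select_pc_nt(int m_icode, int m_ifun, bool m_cnd,
                      uint64_t m_valE, uint64_t pred_pc)
{
    return mispredicted_nt(m_icode, m_ifun, m_cnd) ? m_valE : pred_pc;
}
```

The same mispredicted_nt condition is what the D_bubble and E_bubble logic has to agree with, which is why hints 1 and 4 change together.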

Building and Testing Your Solution

Once you have finished modifying the pipe-nt.hcl file, you will need to build a new instance of the PIPE simulator (psim) based on this modified HCL file, and then test it. This process is described below.

Resources