Computer Organization Midterm (2015/11/17)twins.ee.nctu.edu.tw/courses/co_16/CO 2015...

Computer Organization

Midterm (2015/11/17)

1. (10%) Assume a program requires the execution of 1 × 10^6 FP instructions, 110 × 10^6 INT instructions, 280 × 10^6 L/S instructions, and 200 × 10^6 branch instructions. The CPI for each of FP, INT, L/S, and branch instructions is 20, 1, 4, and 2, respectively. Assume that the processor has a 2 GHz clock rate.

(a) By how much must we improve the CPI of FP instructions if we want the program to run two times faster?

(b) By how much is the execution time of the program improved if the CPI of INT and FP instructions is reduced by 40% and the CPI of L/S and branch is reduced by 30%?

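Ans

(a)

              FP        INT         L/S         Branch
IC            1×10^6    110×10^6    280×10^6    200×10^6
CPI           20        1           4           2
Clock rate    2 GHz

$$\frac{\text{Ex time}_{old}}{\text{Ex time}_{new}} = \frac{(1\times 20 + 110\times 1 + 280\times 4 + 200\times 2)\times 10^{6}}{(1\times CPI_{FP} + 110\times 1 + 280\times 4 + 200\times 2)\times 10^{6}} = 2$$

The old program needs 1650 × 10^6 cycles, so the new one may use at most 825 × 10^6. The non-FP instructions alone already take 1630 × 10^6 cycles, which gives $CPI_{FP} = 825 - 1630 = -805 < 0$. This is impossible.

(b)

              FP        INT         L/S         Branch
IC            1×10^6    110×10^6    280×10^6    200×10^6
CPI           20→12     1→0.6       4→2.8       2→1.4
Clock rate    2 GHz

$$\frac{\text{Ex time}_{old}}{\text{Ex time}_{new}} = \frac{(1\times 20 + 110\times 1 + 280\times 4 + 200\times 2)\times 10^{6}}{(1\times 12 + 110\times 0.6 + 280\times 2.8 + 200\times 1.4)\times 10^{6}} = \frac{1650}{1142} \approx 1.44$$

Grading criteria:

Part (b) asks by how much the execution time is improved; an answer that only computes the execution time loses 2 points.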
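The arithmetic in Problem 1 can be checked with a short C sketch. The function names (`total_cycles`, `speedup`, `required_fp_cpi`) are illustrative and not part of the exam; the sketch assumes class 0 is FP and that the clock rate is unchanged, so it cancels out of every ratio.

```c
#include <assert.h>
#include <stddef.h>

/* Total clock cycles = sum over instruction classes of (count x CPI). */
double total_cycles(const double ic[], const double cpi[], size_t n) {
    assert(n > 0);
    double cycles = 0.0;
    for (size_t i = 0; i < n; i++) {
        cycles += ic[i] * cpi[i];
    }
    return cycles;
}

/* Speedup = old execution time / new execution time.
 * The clock rate is the same in both cases, so only cycles matter. */
double speedup(const double ic[], const double old_cpi[],
               const double new_cpi[], size_t n) {
    return total_cycles(ic, old_cpi, n) / total_cycles(ic, new_cpi, n);
}

/* CPI that class 0 (assumed to be FP) would need for the whole program
 * to reach the target speedup while all other CPIs stay unchanged.
 * A negative result means the target cannot be reached. */
double required_fp_cpi(const double ic[], const double cpi[], size_t n,
                       double target_speedup) {
    double old_cycles = total_cycles(ic, cpi, n);
    double other_cycles = old_cycles - ic[0] * cpi[0];
    return (old_cycles / target_speedup - other_cycles) / ic[0];
}
```

With the instruction counts and CPIs from the problem, `required_fp_cpi` with a target of 2 returns a negative value, and `speedup` with the reduced CPIs returns roughly 1.44.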

2. (10%) Consider two different implementations of the same ISA. Assume there are four types of instructions: class A, B, C, and D with different CPIs.

      Clock rate   CPI (A)   CPI (B)   CPI (C)   CPI (D)
P1    2.5 GHz      1         2         3         3
P2    3 GHz        2         2         2         2

Given a program with a dynamic instruction count of 10^6 instructions divided into classes as follows: 10% class A, 20% class B, 50% class C, and 20% class D. What is the global CPI for each implementation? Also find the clock cycles required in both cases.

Ans

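(a) CPI(P1) = 0.1×1 + 0.2×2 + 0.5×3 + 0.2×3 = 2.6

CPI(P2) = 0.1×2 + 0.2×2 + 0.5×2 + 0.2×2 = 2

(b) clock cycles(P1) = 26 × 10^5

clock cycles(P2) = 20 × 10^5

Grading criteria:

Correct formula but wrong answer: 2 points deducted.

Clock period computed instead of clock cycles: 2 points deducted.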
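As a cross-check of Problem 2, the global CPI is the instruction-mix-weighted average of the per-class CPIs, and the cycle count is that CPI times the instruction count. The helper names below are illustrative, not from the exam.

```c
#include <assert.h>
#include <stddef.h>

/* Global CPI = sum over classes of (fraction of instructions x class CPI). */
double global_cpi(const double fraction[], const double cpi[], size_t n) {
    assert(n > 0);
    double weighted = 0.0;
    for (size_t i = 0; i < n; i++) {
        weighted += fraction[i] * cpi[i];
    }
    return weighted;
}

/* Clock cycles = dynamic instruction count x global CPI. */
double clock_cycles(double instruction_count, double cpi) {
    return instruction_count * cpi;
}
```

For the mix 10%/20%/50%/20% this yields about 2.6 for P1 and 2 for P2, so a 10^6-instruction program needs about 2.6 × 10^6 and 2 × 10^6 cycles respectively.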

3. (10%) When a program is adapted to run on multiple processors in a multiprocessor system, the execution time on each processor is comprised of computing time and the overhead time required for locked critical sections and/or to send data from one processor to another. Assume a program requires t = 100 s of execution time on one processor. When run on p processors, each processor requires t/p s, as well as an additional 4 s overhead, irrespective of the number of processors. Compute the per-processor execution time for 4, 16, and 64 processors. For each case, list the corresponding speedup relative to a single processor and the ratio between actual speedup versus ideal speedup (i.e., the speedup if there was no overhead).

Ans

Grading criteria: ratio inverted: 2 points deducted.
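The quantities asked for in Problem 3 follow directly from the statement: per-processor time is t/p plus the fixed overhead, speedup is t divided by that time, and the ideal speedup without overhead is p. A minimal sketch, with illustrative function names that do not come from the exam:

```c
#include <assert.h>

/* Per-processor execution time: t/p seconds of work plus fixed overhead. */
double per_processor_time(double t, int p, double overhead) {
    assert(p > 0);
    return t / p + overhead;
}

/* Actual speedup relative to one processor (which needs t seconds). */
double actual_speedup(double t, int p, double overhead) {
    return t / per_processor_time(t, p, overhead);
}

/* Ratio of actual to ideal speedup; with no overhead the speedup is p. */
double speedup_ratio(double t, int p, double overhead) {
    return actual_speedup(t, p, overhead) / p;
}
```

With t = 100 s and 4 s of overhead, this gives 29 s, 10.25 s, and 5.5625 s for 4, 16, and 64 processors, speedups of about 3.45, 9.76, and 17.98, and actual-to-ideal ratios of about 0.86, 0.61, and 0.28.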

4. (10%) Consider the following MIPS loop:

LOOP: slt  $t2, $0, $t1
      beq  $t2, $0, DONE
      subi $t1, $t1, 1
      addi $s2, $s2, 2
      j    LOOP
DONE:

(a) (5%) Please write the equivalent C code routine. Assume that the registers $s1, $s2, $t1, and $t2 are integers A, B, i, and temp, respectively.

(b) (5%) For the loop written in MIPS assembly above, assume that the register $t1 is initialized to the value N. How many MIPS instructions are executed?

Ans

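(a)

do {
    B += 2;
    i = i - 1;
} while (i > 0);

(b) 5*N, counting the five instructions executed in each of the N loop iterations. The final slt and beq that exit the loop add two more, for 5N + 2 in total.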
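To make the instruction count in Problem 4(b) concrete, the loop can be emulated in C with a counter that is incremented once per MIPS instruction. This is a sketch under the assumption that N fits in an `int`; `count_loop_instructions` is a hypothetical helper, not part of the exam. Note that it also counts the last `slt` and `beq` that leave the loop, so for N > 0 it reports 5N + 2.

```c
/* Emulates the MIPS loop and returns how many instructions execute.
 * n is the initial value of $t1 (i); *b plays the role of $s2 (B). */
long count_loop_instructions(int n, int *b) {
    int i = n;
    long executed = 0;
    for (;;) {
        int temp = (0 < i);       /* slt  $t2, $0, $t1 */
        executed++;
        executed++;               /* beq  $t2, $0, DONE */
        if (temp == 0) {
            break;                /* branch taken: fall out to DONE */
        }
        i = i - 1;                /* subi $t1, $t1, 1 */
        executed++;
        *b += 2;                  /* addi $s2, $s2, 2 */
        executed++;
        executed++;               /* j    LOOP */
    }
    return executed;
}
```

For example, starting with N = 10 and B = 0 the emulation executes 52 instructions and leaves B = 20; with N = 0 only the `slt` and `beq` run.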

5. (10%) Consider the following C code:

A[300] = g + h + A[12];

(a) (5%) A is an array of 1000 32-bit words. Let’s assume that the compiler has associated the variables g and h with registers $s1 and $s2 and that the starting address of the array A is in $s3. Please compile the C code into MIPS assembly.

(b) (5%) What are the MIPS machine language codes (in hexadecimal representation) for the instructions of your answer in (a)?

Ans

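(a)

lw  $t0, 48($s3)
add $t0, $t0, $s2
add $t0, $t0, $s1
sw  $t0, 1200($s3)

(b) Field values in decimal (add rd, rs, rt places the destination $t0 in rd and the second source in rt):

Instruction           op   rs   rt   rd   shamt/addr   funct
lw  $t0, 48($s3)      35   19   8         48
add $t0, $t0, $s2     0    8    18   8    0            32
add $t0, $t0, $s1     0    8    17   8    0            32
sw  $t0, 1200($s3)    43   19   8         1200

Field values in binary, with the resulting hexadecimal word:

Instruction           op      rs     rt     rd / shamt / funct or address   HEX
lw  $t0, 48($s3)      100011  10011  01000  0000 0000 0011 0000             0x8E680030
add $t0, $t0, $s2     000000  01000  10010  01000 00000 100000              0x01124020
add $t0, $t0, $s1     000000  01000  10001  01000 00000 100000              0x01114020
sw  $t0, 1200($s3)    101011  10011  01000  0000 0100 1011 0000             0xAE6804B0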
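The machine words for Problem 5(b) can be assembled mechanically by shifting each field into place. The sketch below uses the standard MIPS R-format (op, rs, rt, rd, shamt, funct) and I-format (op, rs, rt, 16-bit immediate) layouts; `encode_r` and `encode_i` are illustrative names. It assumes that for `add rd, rs, rt` the destination register goes in the rd field.

```c
#include <stdint.h>

/* R-format: op(6)=0 | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6). */
uint32_t encode_r(uint32_t rs, uint32_t rt, uint32_t rd,
                  uint32_t shamt, uint32_t funct) {
    return ((rs & 0x1Fu) << 21) | ((rt & 0x1Fu) << 16) |
           ((rd & 0x1Fu) << 11) | ((shamt & 0x1Fu) << 6) |
           (funct & 0x3Fu);
}

/* I-format: op(6) | rs(5) | rt(5) | immediate(16). */
uint32_t encode_i(uint32_t op, uint32_t rs, uint32_t rt, uint16_t imm) {
    return ((op & 0x3Fu) << 26) | ((rs & 0x1Fu) << 21) |
           ((rt & 0x1Fu) << 16) | imm;
}
```

Using register numbers $t0 = 8, $s1 = 17, $s2 = 18, $s3 = 19, the call `encode_i(35, 19, 8, 48)` encodes the `lw`, `encode_r(8, 18, 8, 0, 32)` encodes the first `add`, and `encode_i(43, 19, 8, 1200)` encodes the `sw`.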

6. (15%) Implement the following C code in MIPS assembly.

int fib(int n) {
    if (n == 0) return 0;
    else if (n == 1) return 1;
    else return fib(n-1) + fib(n-2);
}

Ans

Grading criteria:

Each block of code is worth 2 points; each missing line of the function costs 1 point.

7. (5%) Please load the 32-bit constant 0000 0000 0001 1101 0000 1000 0000 0000 into register $s0 using MIPS assembly code.

Ans

lui $s0, 29          # upper 16 bits: 29 (decimal) = 0x1D
ori $s0, $s0, 2048   # lower 16 bits: 2048 (decimal) = 0x800
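The two immediates in Problem 7 are simply the upper and lower 16-bit halves of the constant. A small sketch that splits a 32-bit value and rebuilds it the way `lui` followed by `ori` would; the function names are illustrative, not from the exam.

```c
#include <stdint.h>

/* Upper 16 bits: the immediate for lui. */
uint16_t upper_half(uint32_t constant) {
    return (uint16_t)(constant >> 16);
}

/* Lower 16 bits: the immediate for ori. */
uint16_t lower_half(uint32_t constant) {
    return (uint16_t)(constant & 0xFFFFu);
}

/* Models lui (load upper, low half cleared) followed by ori
 * (bitwise OR with the zero-extended lower immediate). */
uint32_t lui_then_ori(uint16_t upper, uint16_t lower) {
    uint32_t reg = (uint32_t)upper << 16;  /* lui $s0, upper      */
    reg |= (uint32_t)lower;                /* ori $s0, $s0, lower */
    return reg;
}
```

The constant in the problem is 0x001D0800, whose halves are 0x001D (29) and 0x0800 (2048).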

8. (15%) In the following problems, the data table contains the values for registers $t0 and $t1. You will be asked to perform several MIPS logical operations on these registers.

Case a. $t0 = 0x55555555, $t1 = 0x12345678

Case b. $t0 = 0xBEADFEED, $t1 = 0xDEADFADE

For the cases above, what are the values of $t2 for the following sequences of instructions?

(a) sll  $t2, $t0, 4
    or   $t2, $t2, $t1

(b) sll  $t2, $t0, 4
    andi $t2, $t2, -1

(c) srl  $t2, $t0, 3
    andi $t2, $t2, 0xFFEF

Ans

(a)

          binary                                     HEX
Case a    0101_0111_0111_0101_0101_0111_0111_1000    5775 5778
Case b    1111_1110_1111_1111_1111_1110_1101_1110    FEFF FEDE

(b)

          binary                                     HEX
Case a    0101_0101_0101_0101_0101_0101_0101_0000    5555 5550
Case b    1110_1010_1101_1111_1110_1110_1101_0000    EADF EED0

(c)

          binary                                     HEX
Case a    0000_1010_1010_1010_1010_1010_1010_1010    0AAA AAAA
Case b    0001_0111_1101_0101_1011_1111_1100_1101    17D5 BFCD

Working for (a), Case a:

0101_0101_0101_0101_0101_0101_0101_0000 ($t0 << 4)
0001_0010_0011_0100_0101_0110_0111_1000 ($t1)
0101_0111_0111_0101_0101_0111_0111_1000 (OR)

Working for (a), Case b:

1110_1010_1101_1111_1110_1110_1101_0000 ($t0 << 4)
1101_1110_1010_1101_1111_1010_1101_1110 ($t1)
1111_1110_1111_1111_1111_1110_1101_1110 (OR)

Working for (b), Case a:

0101_0101_0101_0101_0101_0101_0101_0000 ($t0 << 4)
1111_1111_1111_1111_1111_1111_1111_1111 (-1, taken as a 32-bit mask of all ones)
0101_0101_0101_0101_0101_0101_0101_0000 (AND)

Working for (b), Case b:

1110_1010_1101_1111_1110_1110_1101_0000 ($t0 << 4)
1111_1111_1111_1111_1111_1111_1111_1111 (-1, taken as a 32-bit mask of all ones)
1110_1010_1101_1111_1110_1110_1101_0000 (AND)

Working for (c), Case a:

0101_0101_0101_0101_0101_0101_0101_0101 (original value)
0000_1010_1010_1010_1010_1010_1010_1010 (>>3)
1111_1111_1111_1111_1111_1111_1110_1111 (0xFFEF, extended with ones to 32 bits in this solution)
0000_1010_1010_1010_1010_1010_1010_1010 (AND)

Working for (c), Case b:

1011_1110_1010_1101_1111_1110_1110_1101 (original value)
0001_0111_1101_0101_1011_1111_1101_1101 (>>3)
1111_1111_1111_1111_1111_1111_1110_1111 (0xFFEF, extended with ones to 32 bits in this solution)
0001_0111_1101_0101_1011_1111_1100_1101 (AND)
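The bit manipulations in Problem 8 map directly onto C's shift and bitwise operators on unsigned 32-bit values. In the sketch below the mask is passed explicitly, because the answer above treats the `andi` immediates as full 32-bit masks (0xFFFFFFFF and 0xFFFFFFEF). On real MIPS hardware `andi` zero-extends its 16-bit immediate, which would correspond to passing 0x0000FFEF instead. The function names are illustrative, not from the exam.

```c
#include <stdint.h>

/* Sequence (a): sll $t2,$t0,4 ; or $t2,$t2,$t1 */
uint32_t shift_left_then_or(uint32_t t0, uint32_t t1) {
    return (t0 << 4) | t1;
}

/* Sequence (b): sll $t2,$t0,4 ; and with the given 32-bit mask. */
uint32_t shift_left_then_and(uint32_t t0, uint32_t mask) {
    return (t0 << 4) & mask;
}

/* Sequence (c): srl $t2,$t0,3 ; and with the given 32-bit mask.
 * srl is a logical shift, which matches >> on an unsigned type. */
uint32_t shift_right_then_and(uint32_t t0, uint32_t mask) {
    return (t0 >> 3) & mask;
}
```

Calling `shift_right_then_and(0xBEADFEED, 0xFFFFFFEF)` reproduces the tabulated 0x17D5BFCD, while the zero-extended mask 0x0000FFEF keeps only the low half, 0x0000BFCD.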

9. (10%)

(a) (5%) Assume 151 and 214 are signed 8-bit decimal integers stored in two’s complement format. Calculate 151+214 using saturating arithmetic. The result should be written in decimal. Show your work.

(b) (5%) Assume 151 and 214 are unsigned 8-bit decimal integers stored in two’s complement format. Calculate 151+214 using saturating arithmetic. The result should be written in decimal. Show your work.

Ans

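(a) -105 + (-42) = -128 (the true sum, -147, saturates at the 8-bit minimum of -128)

(b) 151 + 214 = 255 (the true sum, 365, saturates at the 8-bit maximum of 255)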
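Saturating addition as used in Problem 9 computes the sum in a wider type and clamps it to the 8-bit range. A minimal sketch; `as_signed8`, `sat_add_s8`, and `sat_add_u8` are illustrative names, and `as_signed8` reinterprets a bit pattern as two's complement arithmetically so that it does not rely on implementation-defined narrowing casts.

```c
#include <stdint.h>

/* Interpret an 8-bit pattern as a signed two's complement value. */
int as_signed8(uint8_t bits) {
    return bits >= 128 ? (int)bits - 256 : (int)bits;
}

/* Signed saturating add: clamp the exact sum to [-128, 127]. */
int sat_add_s8(int a, int b) {
    int sum = a + b;
    if (sum > INT8_MAX) return INT8_MAX;
    if (sum < INT8_MIN) return INT8_MIN;
    return sum;
}

/* Unsigned saturating add: clamp the exact sum to [0, 255]. */
unsigned sat_add_u8(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + (unsigned)b;
    return sum > UINT8_MAX ? UINT8_MAX : sum;
}
```

The patterns 151 and 214 read as -105 and -42 when signed, so the signed sum clamps to -128, while the unsigned sum 365 clamps to 255.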

10. (10%)

(a) (5%) Write down the binary representation of the decimal number 63.25 assuming the IEEE 754 single precision format.

(b) (5%) What decimal number does the bit pattern 0x0CC000FF represent if it is a floating point number? Use the IEEE 754 standard.

Ans

(a) $63.25 \times 10^{0} = 111111.01_2 \times 2^{0}$

Normalize by moving the binary point 5 places to the left:

$1.1111101_2 \times 2^{5}$

sign = positive, exp = 127 + 5 = 132

final bit pattern: 0_1000_0100_1111_1010_0000_0000_0000_000

(b) 0x0CC000FF = 0000_1100_1100_0000_0000_0000_1111_1111

= 0_00011001_10000000000000011111111

sign = positive, exp = 25 - 127 = -102

final decimal number = $1.10000000000000011111111_2 \times 2^{-102}$
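Both parts of Problem 10 can be checked by pulling apart the sign, exponent, and fraction fields of a 32-bit pattern. The sketch assumes the platform's `float` is IEEE 754 single precision, which holds on common hardware but is not guaranteed by the C standard; the function names are illustrative.

```c
#include <stdint.h>
#include <string.h>

/* Copy the raw bits of a float into a 32-bit integer.
 * Assumes float is 32-bit IEEE 754 single precision. */
uint32_t float_to_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Sign bit: bit 31. */
unsigned sign_bit(uint32_t bits) {
    return bits >> 31;
}

/* Biased exponent: bits 30..23. */
unsigned exponent_field(uint32_t bits) {
    return (bits >> 23) & 0xFFu;
}

/* Fraction (mantissa without the hidden 1): bits 22..0. */
uint32_t fraction_field(uint32_t bits) {
    return bits & 0x7FFFFFu;
}

/* Unbiased exponent for normalized numbers: field value minus 127. */
int unbiased_exponent(uint32_t bits) {
    return (int)exponent_field(bits) - 127;
}
```

On such a platform `float_to_bits(63.25f)` is 0x427D0000, the same pattern as in part (a), and decoding 0x0CC000FF gives a biased exponent of 25 and a fraction field of 0x4000FF.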

11. (5%) Please design the overflow detecting logic for the following 4-bit ALU

[Figure: four cascaded 1-bit ALUs. Inputs A0..A3 and B0..B3; CarryIn enters bit 0; Carry1, Carry2, and Carry3 link successive stages; the last stage produces CarryOut. Outputs are Result0..Result3.]

Ans

[Figure: overflow detection logic; the output signal is labeled "overflow".]
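Since the answer figure did not survive extraction, the following is only one common design, not necessarily the one in the answer key: for two's complement addition, overflow occurs exactly when the carry into the most significant bit (Carry3) differs from the carry out of it (CarryOut), so a single XOR gate suffices. The sketch models a 4-bit ripple-carry adder bit by bit; `Alu4` and `add4` are hypothetical names.

```c
/* Outputs of a 4-bit ripple-carry addition. */
typedef struct {
    unsigned result;     /* Result3..Result0 */
    unsigned carry3;     /* carry into the most significant bit */
    unsigned carry_out;  /* carry out of the most significant bit */
    unsigned overflow;   /* carry3 XOR carry_out */
} Alu4;

/* Adds the low 4 bits of a and b plus carry_in, one full adder per bit. */
Alu4 add4(unsigned a, unsigned b, unsigned carry_in) {
    Alu4 out = {0, 0, 0, 0};
    unsigned carry = carry_in & 1u;
    for (int bit = 0; bit < 4; bit++) {
        unsigned x = (a >> bit) & 1u;
        unsigned y = (b >> bit) & 1u;
        if (bit == 3) {
            out.carry3 = carry;          /* remember the carry into bit 3 */
        }
        unsigned sum = x ^ y ^ carry;    /* full-adder sum */
        carry = (x & y) | (x & carry) | (y & carry);  /* full-adder carry */
        out.result |= sum << bit;
    }
    out.carry_out = carry;
    out.overflow = out.carry3 ^ out.carry_out;
    return out;
}
```

For instance, 0111 + 0001 (7 + 1) sets overflow because the signed result does not fit in 4 bits, while 1111 + 0001 (-1 + 1) produces a carry out but no overflow.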

Good Luck!!
