ANSWER KEY FOR INTERNAL ASSESSMENT EXAMINATIONS - I
COURSE: B.E Electronics and Communication Engineering CS6303 Computer Architecture
PART A - (9 X 2 = 18 marks)

1. Amdahl's Law
It gives the theoretical speedup in latency of the execution of a task at fixed workload that can be expected of a system whose resources are improved.
Amdahl's law can be formulated as:

    S_latency = 1 / ((1 - p) + p / s)

where
S_latency is the theoretical speedup in latency of the execution of the whole task;
s is the speedup in latency of the part of the task that benefits from the improved resources;
p is the proportion of the execution time of the whole task that the benefiting part occupied before the improvement.
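As a quick numerical check, the formula can be evaluated in a few lines of Python (the function name and the sample numbers are illustrative, not part of the question):

```python
def amdahl_speedup(p, s):
    """Overall latency speedup when a fraction p of the work
    is accelerated by a factor s (Amdahl's Law)."""
    return 1.0 / ((1.0 - p) + p / s)

# If 80% of a task is sped up 4x, the whole task speeds up by:
print(round(amdahl_speedup(0.8, 4), 2))  # 2.5
```

Note that even an infinite speedup of half the task (p = 0.5) can never double overall performance, since the other half still takes its original time.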
2. If computer A runs a program in 10 seconds and computer B runs the same program in 15 seconds, how much faster is A than B?
Solution: Performance_A / Performance_B = Execution time_B / Execution time_A = 15 / 10 = 1.5, so A is 1.5 times faster than B.
3. Little Endian and Big Endian
Big-endian systems store the most significant byte of a word at the smallest address and the least significant byte at the largest address. For the word 90 AB 12 CD:
    Address 1000 -> 90, 1001 -> AB, 1002 -> 12, 1003 -> CD
Little-endian systems store the least significant byte at the smallest address:
    Address 1000 -> CD, 1001 -> 12, 1002 -> AB, 1003 -> 90
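The two byte orders can be demonstrated with Python's standard struct module, assuming the example word 90 AB 12 CD above:

```python
import struct

value = 0x90AB12CD  # the 32-bit word from the example

big = struct.pack(">I", value)     # ">" = big-endian byte order
little = struct.pack("<I", value)  # "<" = little-endian byte order

print(big.hex())     # 90ab12cd -> addresses 1000..1003 hold 90 AB 12 CD
print(little.hex())  # cd12ab90 -> addresses 1000..1003 hold CD 12 AB 90
```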
SNS COLLEGE OF ENGINEERING, Kurumbapalayam (Po), Coimbatore – 641 107
Accredited by NAAC-UGC with ‘A’ Grade, Approved by AICTE & Affiliated to Anna University, Chennai

4. Uniprocessor and Multiprocessor

Uniprocessors                          Multiprocessors
Uses a single processing unit          Uses two or more processing units
Used at home and in offices            Not used for small applications
Costs less                             More expensive
One processing core                    More than one processing core
Task cannot be divided                 Task can be divided among processors
5. Difference between memory and registers

Memory                                 Registers
Slower than registers                  Faster than memory
Data access is slow                    Data access is fast
Takes more time to access data         Takes less time to access data
Lower throughput                       Higher throughput than memory
6. Instructions for PC-relative and register-offset pre-indexed addressing modes
PC-relative: the branch address is the sum of the PC and a constant in the instruction.
Example: BEQ 1000
Register-offset pre-indexed addressing mode: the effective address is the sum of two registers, which is also written back to the base register.
Example: LDR r2, [r0, r1]!
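A minimal Python sketch of the address arithmetic behind these two modes (the function names, register dictionary, and sample values are hypothetical; the PC-relative version scales a word offset to bytes as MIPS does, ignoring the PC+4 detail):

```python
def pc_relative_target(pc, word_offset):
    """PC-relative addressing: branch target = PC plus a constant from
    the instruction (word offset shifted left by 2 to give bytes)."""
    return pc + (word_offset << 2)

def pre_indexed_load(memory, regs, rbase, rindex):
    """Sketch of a pre-indexed load like LDR r2, [r0, r1]!  The effective
    address base+index is computed first, used for the access, and then
    written back to the base register (the '!' suffix)."""
    addr = regs[rbase] + regs[rindex]
    regs[rbase] = addr            # write-back of the effective address
    return memory[addr]

regs = {"r0": 100, "r1": 8}
mem = {108: 42}
print(pc_relative_target(0x1000, 4))            # 0x1010
print(pre_indexed_load(mem, regs, "r0", "r1"))  # 42; r0 is now 108
```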
7. Various operations and components of an ALU
Operations
Addition
Subtraction
Multiplication
Division
AND
OR
NOT
Components
AND
OR
NOT
Multiplexer
8. Overflow in subtraction
Overflow occurs when there are insufficient bits in a binary number representation to portray the result of an arithmetic operation; computer arithmetic is not closed with respect to addition, subtraction, multiplication, or division.
For addition, overflow cannot occur when the operands have different signs; for subtraction, overflow cannot occur when the operands have identical signs. Overflow means the result is out of range.
Subtracting two positive or two negative operands: no overflow.
Subtracting a positive operand from a negative operand: overflow if the result sign bit is 0.
Subtracting a negative operand from a positive operand: overflow if the result sign bit is 1.
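These rules amount to checking whether the true difference lies outside the signed range of the word; a small Python sketch (the function name and word sizes are chosen for illustration):

```python
def sub_overflows(a, b, bits=32):
    """True if a - b overflows signed two's-complement arithmetic
    with the given word size."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return not (lo <= a - b <= hi)

# Subtracting operands of the same sign can never overflow:
print(sub_overflows(100, 27, bits=8))   # False
# Subtracting a negative from a large positive can overflow:
print(sub_overflows(127, -1, bits=8))   # True (128 exceeds the 8-bit max of 127)
```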
9. Subtract 11001 from 11100 using the 2's complement method
2's complement of 11001 = 00110 + 1 = 00111.
  1 1 1 0 0
+ 0 0 1 1 1
-----------
1 0 0 0 1 1
Discard the end carry: result = 00011 (i.e., 11100 - 11001 = 00011).
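The same procedure can be checked mechanically; a short Python sketch of the method (names are illustrative):

```python
def twos_complement_subtract(minuend, subtrahend, bits):
    """Compute minuend - subtrahend by adding the two's complement
    of the subtrahend and discarding any end carry."""
    complement = ((~subtrahend) + 1) & ((1 << bits) - 1)  # two's complement
    return (minuend + complement) & ((1 << bits) - 1)     # drop end carry

# 11100 - 11001 with 5-bit operands:
result = twos_complement_subtract(0b11100, 0b11001, bits=5)
print(format(result, "05b"))  # 00011
```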
PART B - (3 X 16 = 48 marks)

1.a.i. Eight great ideas of computer architecture
1. Design for Moore's Law
The one constant for computer designers is rapid change, which is driven largely by Moore's
Law. It states that integrated circuit resources double every 18–24 months. Moore's Law
resulted from a 1965 prediction of such growth in IC capacity made by Gordon Moore, one
of the founders of Intel. As computer designs can take years, the resources available per chip
can easily double or quadruple between the start and finish of the project. Like a skeet
shooter, computer architects must anticipate where the technology will be when the design
finishes rather than design for where it starts. We use an "up and to the right" Moore's Law
graph to represent designing for rapid change.
2. Use Abstraction to Simplify Design
Both computer architects and programmers had to invent techniques to make themselves
more productive, for otherwise design time would lengthen as dramatically as resources grew
by Moore's Law. A major productivity technique for hardware and software is to use
abstractions to represent the design at different levels of representation; lower-level details
are hidden to offer a simpler model at higher levels. We'll use the abstract painting icon to
represent this second great idea.
3. Make the common case fast
Making the common case fast will tend to enhance performance better than optimizing the
rare case. Ironically, the common case is often simpler than the rare case and hence is often
easier to enhance. This common sense advice implies that you know what the common case
is, which is only possible with careful experimentation and measurement. We use a sports car
as the icon for making the common case fast, as the most common trip has one or two
passengers, and it's surely easier to make a fast sports car than a fast minivan.
4. Performance via parallelism
Since the dawn of computing, computer architects have offered designs that get more
performance by performing operations in parallel. We'll see many examples of parallelism in
this book. We use multiple jet engines of a plane as our icon for parallel performance.
5. Performance via pipelining
A particular pattern of parallelism is so prevalent in computer architecture that it merits its
own name: pipelining. For example, before fire engines, a "bucket brigade" would respond to
a fire, which many cowboy movies show in response to a dastardly act by the villain. The
townsfolk form a human chain to carry water from the source to the fire, as they could much more
quickly move buckets up the chain than have individuals run back and forth. Our
pipeline icon is a sequence of pipes, with each section representing one stage of the pipeline.
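The bucket-brigade intuition can be captured with a simple timing model, assuming an ideal pipeline with equal-length stages and no stalls (a sketch; the function name and numbers are illustrative):

```python
def total_time(n_tasks, n_stages, stage_time, pipelined):
    """Time to finish n_tasks, each of which passes through n_stages
    equal stages of duration stage_time."""
    if pipelined:
        # the first task fills the pipeline; after that, one task
        # completes every stage_time
        return (n_stages + n_tasks - 1) * stage_time
    return n_tasks * n_stages * stage_time

print(total_time(4, 4, 1.0, pipelined=False))  # 16.0
print(total_time(4, 4, 1.0, pipelined=True))   # 7.0
```

Pipelining does not make any single task faster; it increases throughput, so the gain grows with the number of tasks.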
6. Performance via prediction
Following the saying that it can be better to ask for forgiveness than to ask for permission, the
next great idea is prediction. In some cases it can be faster on average to guess and start
working rather than wait until you know for sure, assuming that the mechanism to recover
from a misprediction is not too expensive and your prediction is relatively accurate. We use
the fortune-teller's crystal ball as our prediction icon.
7. Hierarchy of memories
Programmers want memory to be fast, large, and cheap, as memory speed often shapes
performance, capacity limits the size of problems that can be solved, and the cost of memory
today is often the majority of computer cost. Architects have found that they can address
these conflicting demands with a hierarchy of memories, with the fastest, smallest, and most
expensive memory per bit at the top of the hierarchy and the slowest, largest, and cheapest
per bit at the bottom. Caches give the programmer the illusion that main memory is nearly as
fast as the top of the hierarchy and nearly as big and cheap as the bottom of the hierarchy. We
use a layered triangle icon to represent the memory hierarchy. The shape indicates speed,
cost, and size: the closer to the top, the faster and more expensive per bit the memory; the
wider the base of the layer, the bigger the memory.
8. Dependability via redundancy
Computers not only need to be fast; they need to be dependable. Since any physical device
can fail, we make systems dependable by including redundant components that can take over
when a failure occurs and to help detect failures. We use the tractor-trailer as our icon, since
the dual tires on each side of its rear axles allow the truck to continue driving even when one
tire fails. (Presumably, the truck driver heads immediately to a repair facility so the flat tire
can be fixed, thereby restoring redundancy!)
1.a.ii. Consider a computer with three instruction classes whose CPI measurements are given below, along with the instruction counts for each class for the same program produced by two different compilers.

Instruction class:     A    B    C
CPI for each class:    1    2    3

Instruction counts:    A    B    C
Code sequence 1:       2    1    2
Code sequence 2:       4    1    1
A. Which code sequence executes more instructions?
Sequence 1 executes 2 + 1 + 2 = 5 instructions. Sequence 2 executes 4 + 1 + 1 = 6
instructions. Therefore, sequence 1 executes fewer instructions.
B. Which code sequence will be faster?
CPU clock cycles based on instruction count and CPI to find the total number of
clock cycles for each sequence
CPU clock cycles1 = (2 × 1) + (1 × 2) + (2 × 3) = 2 + 2 + 6 = 10 cycles
CPU clock cycles2 = (4 × 1) + (1 × 2) + (1 × 3) = 4 + 2 + 3 = 9 cycles
Code sequence 2 is faster, even though it executes one extra instruction. Since code sequence 2 takes fewer overall clock cycles but has more instructions, it must have a lower CPI.
C. What is the CPI for each sequence?
CPI = CPU clock cycles / Instruction count, so:
CPI1 = 10 / 5 = 2.0
CPI2 = 9 / 6 = 1.5
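The same computation in a short Python sketch (function and variable names are illustrative):

```python
def cpi(counts, class_cpi):
    """Average CPI = total clock cycles / total instruction count."""
    cycles = sum(n * c for n, c in zip(counts, class_cpi))
    return cycles / sum(counts)

class_cpi = [1, 2, 3]             # CPI of classes A, B, C
print(cpi([2, 1, 2], class_cpi))  # 2.0  (10 cycles / 5 instructions)
print(cpi([4, 1, 1], class_cpi))  # 1.5  (9 cycles / 6 instructions)
```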
1.b.i. Operations and operands of computer hardware

Instruction
The words of a computer's language are called instructions, and its vocabulary is called the instruction set.
Performance varies with both the programming language and the hardware.
The instruction set describes the functions of the architecture.
Besides MIPS (from MIPS Technologies), three popular instruction sets are:
ARMv7 – similar to MIPS; more than 9 billion chips with ARM processors were manufactured in 2011.
Intel x86 – used in PCs and the cloud.
ARMv8 – extends the address size from 32 to 64 bits.
The computer designer finds the instruction set through which performance is maximized while cost and energy are minimized.
Operation
ARM Assembly Language
Operands
Memory Operands
Example: the assignment statements g = h + A[8]; and A[12] = h + A[8]; compile to:
lw  $t0, 32($s3)    # load word A[8]  (offset 8 x 4 bytes)
add $t0, $s2, $t0   # $t0 = h + A[8]
sw  $t0, 48($s3)    # store word into A[12]  (offset 12 x 4 bytes)
Big Endian and Little Endian
Constant or Immediate Operands
An immediate operand avoids a memory access, so it uses less energy and performs the operation faster.
Example: addi $s3, $s3, 4   # $s3 = $s3 + 4
1.b.ii. Various logical and control operations in a computer
An instruction is an order given to a computer processor by a program to perform some task; it varies from one language to another.
Register numbers – MIPS assembly language:
$t0 – $t7 are registers 8 – 15
$t8 – $t9 are registers 24 – 25
$s0 – $s7 are registers 16 – 23
MIPS R-Format Instructions
Instruction fields
– op: operation code (opcode)
– rs: first source register number
– rt: second source register number
– rd: destination register number
– shamt: shift amount (00000 for now)
– funct: function code (extends opcode)
Translating MIPS Assembly Instructions into Machine Instructions
add $t0, $s1, $s2
Fields: op = 0, rs = 17 ($s1), rt = 18 ($s2), rd = 8 ($t0), shamt = 0, funct = 32
0000 0010 0011 0010 0100 0000 0010 0000 (binary) = 02324020 (hex)

MIPS I-Format Instructions

Logical Operations
Operation      C     Java   MIPS
Shift left     <<    <<     sll
Shift right    >>    >>>    srl
Bitwise AND    &     &      and, andi
Bitwise OR     |     |      or, ori
Bitwise NOT    ~     ~      nor
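The R-format encoding of add $t0, $s1, $s2 above can be reproduced by packing the six fields in a few lines of Python (a sketch; the function name is illustrative):

```python
def encode_r_format(rs, rt, rd, shamt, funct, op=0):
    """Pack MIPS R-format fields (6/5/5/5/5/6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t0, $s1, $s2  ->  rs=$s1(17), rt=$s2(18), rd=$t0(8), funct=0x20
word = encode_r_format(rs=17, rt=18, rd=8, shamt=0, funct=0x20)
print(format(word, "08x"))  # 02324020
```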
Control Operations
Branch to a labeled instruction if a condition is true; otherwise, continue sequentially.
beq rs, rt, L1   # if (rs == rt) branch to instruction labeled L1
bne rs, rt, L1   # if (rs != rt) branch to instruction labeled L1
j   L1           # unconditional jump to instruction labeled L1
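The branch conditions can be sketched in Python (an illustrative model of the condition test only, not a full MIPS simulator):

```python
def branch_taken(op, rs_val, rt_val):
    """Condition test for MIPS conditional branches."""
    if op == "beq":
        return rs_val == rt_val   # branch if equal
    if op == "bne":
        return rs_val != rt_val   # branch if not equal
    raise ValueError("not a conditional branch: " + op)

# beq branches only when the register values are equal:
print(branch_taken("beq", 5, 5))  # True
print(branch_taken("bne", 5, 5))  # False
```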
2.a.i. Components of a computer system
2.b.i. Addressing modes
An addressing mode is the way in which the operand of an instruction is specified.
The information contained in the instruction code is either the value of the operand or the address of the operand/result.
The multiple forms of addressing are generically called addressing modes.
MIPS Instruction format
2.b.ii. Various measures of the performance of a computer

Relative Performance
Performance_X / Performance_Y = Execution time_Y / Execution time_X = n
(X is n times faster than Y.)

Measuring Execution Time
Elapsed time: total response time, including processing, I/O, OS overhead, and idle time; determines system performance.
CPU time: time spent processing a given job; comprises user CPU time and system CPU time.
CPU Performance and Its Factors
CPU Execution Time = CPU Clock Cycles x Clock Cycle Time = CPU Clock Cycles / Clock Rate
Performance improved by
– Reducing number of clock cycles
– Increasing clock rate
– Hardware designer must often trade off clock rate against cycle count
Instruction Performance
CPU Clock Cycles = Instruction Count x Cycles per Instruction (CPI)

Classical CPU Performance Equation
CPU Time = Instruction Count x CPI x Clock Cycle Time = (Instruction Count x CPI) / Clock Rate
CPU Time = (Instructions / Program) x (Clock cycles / Instruction) x (Seconds / Clock cycle)
Performance depends on
– Algorithm: affects IC, possibly CPI
– Programming language: affects IC, CPI
– Compiler: affects IC, CPI
– Instruction set architecture: affects IC, CPI, Tc (clock cycle time)
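The classical performance equation in executable form (a sketch with illustrative numbers):

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU Time = Instruction Count x CPI / Clock Rate."""
    return instruction_count * cpi / clock_rate_hz

# 10 billion instructions at an average CPI of 2.0 on a 4 GHz clock:
print(cpu_time(10e9, 2.0, 4e9))  # 5.0 seconds
```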
3.a.i. Addition and subtraction in an ALU
Addition: direct method with example; overflow.
Subtraction: direct method with example; 1's complement with example; 2's complement with example; overflow.
3.b.i. Booth multiplication algorithm
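A Python sketch of Booth's algorithm for signed multiplication, following the standard A/S/P register formulation (names and word widths are illustrative; A and S carry one extra sign bit so that negating the most negative operand does not overflow):

```python
def booth_multiply(m, r, bits):
    """Booth's multiplication algorithm for signed two's-complement
    operands m and r, each `bits` wide; returns the exact product."""
    n = 2 * bits + 2                                      # total register width
    mask = (1 << n) - 1
    a = (m & ((1 << (bits + 1)) - 1)) << (bits + 1)       # A:  multiplicand, sign-extended
    s = ((-m) & ((1 << (bits + 1)) - 1)) << (bits + 1)    # S: -multiplicand
    p = (r & ((1 << bits) - 1)) << 1                      # P:  multiplier, with Q-1 = 0 appended
    for _ in range(bits):
        pair = p & 0b11                                   # examine Q0 and Q-1
        if pair == 0b01:
            p = (p + a) & mask                            # 01: add multiplicand
        elif pair == 0b10:
            p = (p + s) & mask                            # 10: subtract multiplicand
        sign = p & (1 << (n - 1))
        p = (p >> 1) | sign                               # arithmetic right shift
    p = (p >> 1) & ((1 << (2 * bits)) - 1)                # drop Q-1, keep 2*bits product bits
    if p & (1 << (2 * bits - 1)):                         # interpret product as signed
        p -= 1 << (2 * bits)
    return p

# 2 x -3 with 4-bit operands:
print(booth_multiply(2, -3, 4))  # -6
```

Booth's method replaces runs of 1s in the multiplier with one addition and one subtraction, so it handles negative multipliers naturally and can reduce the number of additions.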
*****