ANSWER KEY FOR INTERNAL ASSESSMENT EXAMINATIONS - I
COURSE: B.E Electronics and Communication Engineering CS6303 Computer Architecture
PART A - (9 X 2 = 18 marks)

1. Amdahl's Law
It gives the theoretical speedup in latency of the execution of a task at fixed workload that can be expected of a system whose resources are improved.
Amdahl's law can be formulated as:

    S_latency = 1 / ((1 - p) + p / s)

where
S_latency is the theoretical speedup in latency of the execution of the whole task;
s is the speedup in latency of the part of the task that benefits from the improved resources;
p is the proportion of the execution time of the whole task that the benefiting part occupied before the improvement.
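As a quick numerical check, the formula can be evaluated in a few lines of Python (the function name and the sample numbers are illustrative, not part of the question):

```python
def amdahl_speedup(p, s):
    """Overall latency speedup when a fraction p of the work
    is accelerated by a factor s (Amdahl's Law)."""
    return 1.0 / ((1.0 - p) + p / s)

# If 80% of a task is sped up 4x, the whole task speeds up by:
print(round(amdahl_speedup(0.8, 4), 2))  # 2.5
```

Note that even an infinite speedup of half the task (p = 0.5) can never double overall performance, since the other half still takes its original time.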
2. If computer A runs a program in 10 seconds and computer B runs the same program in 15 seconds, how much faster is A than B?
Solution: Performance_A / Performance_B = Execution time_B / Execution time_A = 15 / 10 = 1.5, so A is 1.5 times faster than B.
3. Little Endian and Big Endian
Big-endian systems store the most significant byte of a word at the smallest address and the least significant byte at the largest address. For the word 90 AB 12 CD:
    Address 1000 -> 90, 1001 -> AB, 1002 -> 12, 1003 -> CD
Little-endian systems store the least significant byte at the smallest address:
    Address 1000 -> CD, 1001 -> 12, 1002 -> AB, 1003 -> 90
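The two byte orders can be demonstrated with Python's standard struct module, assuming the example word 90 AB 12 CD above:

```python
import struct

value = 0x90AB12CD  # the 32-bit word from the example

big = struct.pack(">I", value)     # ">" = big-endian byte order
little = struct.pack("<I", value)  # "<" = little-endian byte order

print(big.hex())     # 90ab12cd -> addresses 1000..1003 hold 90 AB 12 CD
print(little.hex())  # cd12ab90 -> addresses 1000..1003 hold CD 12 AB 90
```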
SNS COLLEGE OF ENGINEERING, Kurumbapalayam (Po), Coimbatore – 641 107
Accredited by NAAC-UGC with ‘A’ Grade, Approved by AICTE & Affiliated to Anna University, Chennai

4. Uniprocessor and Multiprocessor

Uniprocessors                          Multiprocessors
Uses a single processing unit          Uses two or more processing units
Used at home and in offices            Not used for small applications
Costs less                             More expensive
One processing core                    More than one processing core
Task cannot be divided                 Task can be divided among processors
5. Difference between memory and registers

Memory                                 Registers
Slower than registers                  Faster than memory
Data access is slow                    Data access is fast
Takes more time to access data         Takes less time to access data
Lower throughput                       Higher throughput than memory
6. Instructions for PC-relative and register-offset pre-indexed addressing modes
PC-relative: the branch address is the sum of the PC and a constant in the instruction.
Example: BEQ 1000
Register-offset pre-indexed addressing mode: the effective address is the sum of two registers, which is also written back to the base register.
Example: LDR r2, [r0, r1]!
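A minimal Python sketch of the address arithmetic behind these two modes (the function names, register dictionary, and sample values are hypothetical; the PC-relative version scales a word offset to bytes as MIPS does, ignoring the PC+4 detail):

```python
def pc_relative_target(pc, word_offset):
    """PC-relative addressing: branch target = PC plus a constant from
    the instruction (word offset shifted left by 2 to give bytes)."""
    return pc + (word_offset << 2)

def pre_indexed_load(memory, regs, rbase, rindex):
    """Sketch of a pre-indexed load like LDR r2, [r0, r1]!  The effective
    address base+index is computed first, used for the access, and then
    written back to the base register (the '!' suffix)."""
    addr = regs[rbase] + regs[rindex]
    regs[rbase] = addr            # write-back of the effective address
    return memory[addr]

regs = {"r0": 100, "r1": 8}
mem = {108: 42}
print(pc_relative_target(0x1000, 4))            # 0x1010
print(pre_indexed_load(mem, regs, "r0", "r1"))  # 42; r0 is now 108
```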
7. Various operations and components of an ALU
Operations
Addition
Subtraction
Multiplication
Division
AND
OR
NOT
Components
AND
OR
NOT
Multiplexer
8. Overflow in subtraction
Overflow occurs when there are insufficient bits in a binary number representation to portray the result of an arithmetic operation; computer arithmetic is not closed with respect to addition, subtraction, multiplication, or division.
For addition, overflow cannot occur when the operands have different signs; for subtraction, overflow cannot occur when the operands have identical signs. Overflow means the result is out of range.
Subtracting two positive or two negative operands: no overflow.
Subtracting a positive operand from a negative operand: overflow if the result sign bit is 0.
Subtracting a negative operand from a positive operand: overflow if the result sign bit is 1.
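These rules amount to checking whether the true difference lies outside the signed range of the word; a small Python sketch (the function name and word sizes are chosen for illustration):

```python
def sub_overflows(a, b, bits=32):
    """True if a - b overflows signed two's-complement arithmetic
    with the given word size."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return not (lo <= a - b <= hi)

# Subtracting operands of the same sign can never overflow:
print(sub_overflows(100, 27, bits=8))   # False
# Subtracting a negative from a large positive can overflow:
print(sub_overflows(127, -1, bits=8))   # True (128 exceeds the 8-bit max of 127)
```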
9. Subtract 11001 from 11100 using the 2's complement method
2's complement of 11001 = 00110 + 1 = 00111.
  1 1 1 0 0
+ 0 0 1 1 1
-----------
1 0 0 0 1 1
Discard the end carry: result = 00011 (i.e., 11100 - 11001 = 00011).
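The same procedure can be checked mechanically; a short Python sketch of the method (names are illustrative):

```python
def twos_complement_subtract(minuend, subtrahend, bits):
    """Compute minuend - subtrahend by adding the two's complement
    of the subtrahend and discarding any end carry."""
    complement = ((~subtrahend) + 1) & ((1 << bits) - 1)  # two's complement
    return (minuend + complement) & ((1 << bits) - 1)     # drop end carry

# 11100 - 11001 with 5-bit operands:
result = twos_complement_subtract(0b11100, 0b11001, bits=5)
print(format(result, "05b"))  # 00011
```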
PART B - (3 X 16 = 48 marks)

1.a.i. Eight great ideas of computer architecture
1. Design for Moore's Law
The one constant for computer designers is rapid change, which is driven largely by Moore's
Law. It states that integrated circuit resources double every 18–24 months. Moore's Law
resulted from a 1965 prediction of such growth in IC capacity made by Gordon Moore, one
of the founders of Intel. As computer designs can take years, the resources available per chip
can easily double or quadruple between the start and finish of the project. Like a skeet
shooter, computer architects must anticipate where the technology will be when the design
finishes rather than design for where it starts. We use an "up and to the right" Moore's Law
graph to represent designing for rapid change.
2. Use Abstraction to Simplify Design
Both computer architects and programmers had to invent techniques to make themselves
more productive, for otherwise design time would lengthen as dramatically as resources grew
by Moore's Law. A major productivity technique for hardware and software is to use
abstractions to represent the design at different levels of representation; lower-level details
are hidden to offer a simpler model at higher levels. We'll use the abstract painting icon to
represent this second great idea.
3. Make the common case fast
Making the common case fast will tend to enhance performance better than optimizing the
rare case. Ironically, the common case is often simpler than the rare case and hence is often
easier to enhance. This common sense advice implies that you know what the common case
is, which is only possible with careful experimentation and measurement. We use a sports car
as the icon for making the common case fast, as the most common trip has one or two
passengers, and it's surely easier to make a fast sports car than a fast minivan.
4. Performance via parallelism
Since the dawn of computing, computer architects have offered designs that get more
performance by performing operations in parallel. We'll see many examples of parallelism in
this book. We use multiple jet engines of a plane as our icon for parallel performance.
5. Performance via pipelining
A particular pattern of parallelism is so prevalent in computer architecture that it merits its
own name: pipelining. For example, before fire engines, a "bucket brigade" would respond to
a fire, which many cowboy movies show in response to a dastardly act by the villain. The
townsfolk form a human chain to carry water from the source to the fire, as they could much more
quickly move buckets up the chain than have individuals run back and forth. Our
pipeline icon is a sequence of pipes, with each section representing one stage of the pipeline.
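The bucket-brigade intuition can be captured with a simple timing model, assuming an ideal pipeline with equal-length stages and no stalls (a sketch; the function name and numbers are illustrative):

```python
def total_time(n_tasks, n_stages, stage_time, pipelined):
    """Time to finish n_tasks, each of which passes through n_stages
    equal stages of duration stage_time."""
    if pipelined:
        # the first task fills the pipeline; after that, one task
        # completes every stage_time
        return (n_stages + n_tasks - 1) * stage_time
    return n_tasks * n_stages * stage_time

print(total_time(4, 4, 1.0, pipelined=False))  # 16.0
print(total_time(4, 4, 1.0, pipelined=True))   # 7.0
```

Pipelining does not make any single task faster; it increases throughput, so the gain grows with the number of tasks.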
6. Performance via prediction
Following the saying that it can be better to ask for forgiveness than to ask for permission, the
next great idea is prediction. In some cases it can be faster on average to guess and start
working rather than wait until you know for sure, assuming that the mechanism to recover
from a misprediction is not too expensive and your prediction is relatively accurate. We use
the fortune-teller's crystal ball as our prediction icon.
7. Hierarchy of memories
Programmers want memory to be fast, large, and cheap, as memory speed often shapes
performance, capacity limits the size of problems that can be solved, and the cost of memory
today is often the majority of computer cost. Architects have found that they can address
these conflicting demands with a hierarchy of memories, with the fastest, smallest, and most
expensive memory per bit at the top of the hierarchy and the slowest, largest, and cheapest
per bit at the bottom. Caches give the programmer the illusion that main memory is nearly as
fast as the top of the hierarchy and nearly as big and cheap as the bottom of the hierarchy. We
use a layered triangle icon to represent the memory hierarchy. The shape indicates speed,
cost, and size: the closer to the top, the faster and more expensive per bit the memory; the
wider the base of the layer, the bigger the memory.
8. Dependability via redundancy
Computers not only need to be fast; they need to be dependable. Since any physical device
can fail, we make systems dependable by including redundant components that can take over
when a failure occurs and to help detect failures. We use the tractor-trailer as our icon, since
the dual tires on each side of its rear axles allow the truck to continue driving even when one
tire fails. (Presumably, the truck driver heads immediately to a repair facility so the flat tire
can be fixed, thereby restoring redundancy!)
1.a.ii. Consider a computer with three instruction classes whose CPI measurements are given below, along with the instruction counts for each class for the same program produced by two different compilers.

Instruction class:     A    B    C
CPI for each class:    1    2    3

Instruction counts:    A    B    C
Code sequence 1:       2    1    2
Code sequence 2:       4    1    1
A. Which code sequence executes more instructions?
Sequence 1 executes 2 + 1 + 2 = 5 instructions. Sequence 2 executes 4 + 1 + 1 = 6
instructions. Therefore, sequence 1 executes fewer instructions.
B. Which code sequence will be faster?
CPU clock cycles based on instruction count and CPI to find the total number of
clock cycles for each sequence
CPU clock cycles1 = (2 × 1) + (1 × 2) + (2 × 3) = 2 + 2 + 6 = 10 cycles
CPU clock cycles2 = (4 × 1) + (1 × 2) + (1 × 3) = 4 + 2 + 3 = 9 cycles
Code sequence 2 is faster, even though it executes one extra instruction. Since code sequence 2 takes fewer overall clock cycles but has more instructions, it must have a lower CPI.
C. What is the CPI for each sequence?
CPI = CPU clock cycles / Instruction count, so:
CPI1 = 10 / 5 = 2.0
CPI2 = 9 / 6 = 1.5
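The same computation in a short Python sketch (function and variable names are illustrative):

```python
def cpi(counts, class_cpi):
    """Average CPI = total clock cycles / total instruction count."""
    cycles = sum(n * c for n, c in zip(counts, class_cpi))
    return cycles / sum(counts)

class_cpi = [1, 2, 3]             # CPI of classes A, B, C
print(cpi([2, 1, 2], class_cpi))  # 2.0  (10 cycles / 5 instructions)
print(cpi([4, 1, 1], class_cpi))  # 1.5  (9 cycles / 6 instructions)
```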
1.b.i. Operations and operands of computer hardware

Instruction
The words of a computer's language are called instructions, and its vocabulary is called the instruction set.
Performance varies with both the programming language and the hardware.
The instruction set describes the functions of the architecture.
Besides MIPS (from MIPS Technologies), three popular instruction sets are:
ARMv7 – similar to MIPS; more than 9 billion chips with ARM processors were manufactured in 2011.
Intel x86 – used in PCs and the cloud.
ARMv8 – extends the address size from 32 to 64 bits.
The computer designer finds the instruction set through which performance is maximized while cost and energy are minimized.
Operation
ARM Assembly Language
Operands
Memory Operands
Example: the assignment statements g = h + A[8]; and A[12] = h + A[8]; compile to:
lw  $t0, 32($s3)    # load word A[8]  (offset 8 x 4 bytes)
add $t0, $s2, $t0   # $t0 = h + A[8]
sw  $t0, 48($s3)    # store word into A[12]  (offset 12 x 4 bytes)
Big Endian and Little Endian
Constant or Immediate Operands
An immediate operand avoids a memory access, so it uses less energy and performs the operation faster.
Example: addi $s3, $s3, 4   # $s3 = $s3 + 4
1.b.ii. Various logical and control operations in a computer
An instruction is an order given to a computer processor by a program to perform some task; it varies from one language to another.
Register numbers – MIPS assembly language:
$t0 – $t7 are registers 8 – 15
$t8 – $t9 are registers 24 – 25
$s0 – $s7 are registers 16 – 23
MIPS R-Format Instructions
Instruction fields
– op: operation code (opcode)
– rs: first source register number
– rt: second source register number
– rd: destination register number
– shamt: shift amount (00000 for now)
– funct: function code (extends opcode)
Translating MIPS Assembly Instructions into Machine Instructions
add $t0, $s1, $s2
Fields: op = 0, rs = 17 ($s1), rt = 18 ($s2), rd = 8 ($t0), shamt = 0, funct = 32
0000 0010 0011 0010 0100 0000 0010 0000 (binary) = 02324020 (hex)

MIPS I-Format Instructions

Logical Operations
Operation      C     Java   MIPS
Shift left     <<    <<     sll
Shift right    >>    >>>    srl
Bitwise AND    &     &      and, andi
Bitwise OR     |     |      or, ori
Bitwise NOT    ~     ~      nor
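The R-format encoding of add $t0, $s1, $s2 above can be reproduced by packing the six fields in a few lines of Python (a sketch; the function name is illustrative):

```python
def encode_r_format(rs, rt, rd, shamt, funct, op=0):
    """Pack MIPS R-format fields (6/5/5/5/5/6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

# add $t0, $s1, $s2  ->  rs=$s1(17), rt=$s2(18), rd=$t0(8), funct=0x20
word = encode_r_format(rs=17, rt=18, rd=8, shamt=0, funct=0x20)
print(format(word, "08x"))  # 02324020
```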
Control Operations
Branch to a labeled instruction if a condition is true; otherwise, continue sequentially.
beq rs, rt, L1   # if (rs == rt) branch to instruction labeled L1
bne rs, rt, L1   # if (rs != rt) branch to instruction labeled L1
j   L1           # unconditional jump to instruction labeled L1
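The branch conditions can be sketched in Python (an illustrative model of the condition test only, not a full MIPS simulator):

```python
def branch_taken(op, rs_val, rt_val):
    """Condition test for MIPS conditional branches."""
    if op == "beq":
        return rs_val == rt_val   # branch if equal
    if op == "bne":
        return rs_val != rt_val   # branch if not equal
    raise ValueError("not a conditional branch: " + op)

# beq branches only when the register values are equal:
print(branch_taken("beq", 5, 5))  # True
print(branch_taken("bne", 5, 5))  # False
```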
2.a.i. Components of a computer system
2.b.i. Addressing modes
An addressing mode is the way in which the operand of an instruction is specified.
The information contained in the instruction code is either the value of the operand or the address of the operand/result.
The multiple forms of addressing are generically called addressing modes.
MIPS Instruction format
2.b.ii. Various measures of the performance of a computer

Relative Performance
Performance_X / Performance_Y = Execution time_Y / Execution time_X = n
(X is n times faster than Y.)

Measuring Execution Time
Elapsed time: total response time, including processing, I/O, OS overhead, and idle time; determines system performance.
CPU time: time spent processing a given job; comprises user CPU time and system CPU time.
CPU Performance and Its Factors
CPU Execution Time = CPU Clock Cycles x Clock Cycle Time = CPU Clock Cycles / Clock Rate
Performance improved by
– Reducing number of clock cycles
– Increasing clock rate
– Hardware designer must often trade off clock rate against cycle count
Instruction Performance
CPU Clock Cycles = Instruction Count x Cycles per Instruction (CPI)

Classical CPU Performance Equation
CPU Time = Instruction Count x CPI x Clock Cycle Time = (Instruction Count x CPI) / Clock Rate
CPU Time = (Instructions / Program) x (Clock cycles / Instruction) x (Seconds / Clock cycle)
Performance depends on
– Algorithm: affects IC, possibly CPI
– Programming language: affects IC, CPI
– Compiler: affects IC, CPI
– Instruction set architecture: affects IC, CPI, Tc (clock cycle time)
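The classical performance equation in executable form (a sketch with illustrative numbers):

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU Time = Instruction Count x CPI / Clock Rate."""
    return instruction_count * cpi / clock_rate_hz

# 10 billion instructions at an average CPI of 2.0 on a 4 GHz clock:
print(cpu_time(10e9, 2.0, 4e9))  # 5.0 seconds
```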
3.a.i. Addition and subtraction in an ALU
Addition: direct method with example; overflow.
Subtraction: direct method with example; 1's complement with example; 2's complement with example; overflow.
3.b.i. Booth multiplication algorithm
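A Python sketch of Booth's algorithm for signed multiplication, following the standard A/S/P register formulation (names and word widths are illustrative; A and S carry one extra sign bit so that negating the most negative operand does not overflow):

```python
def booth_multiply(m, r, bits):
    """Booth's multiplication algorithm for signed two's-complement
    operands m and r, each `bits` wide; returns the exact product."""
    n = 2 * bits + 2                                      # total register width
    mask = (1 << n) - 1
    a = (m & ((1 << (bits + 1)) - 1)) << (bits + 1)       # A:  multiplicand, sign-extended
    s = ((-m) & ((1 << (bits + 1)) - 1)) << (bits + 1)    # S: -multiplicand
    p = (r & ((1 << bits) - 1)) << 1                      # P:  multiplier, with Q-1 = 0 appended
    for _ in range(bits):
        pair = p & 0b11                                   # examine Q0 and Q-1
        if pair == 0b01:
            p = (p + a) & mask                            # 01: add multiplicand
        elif pair == 0b10:
            p = (p + s) & mask                            # 10: subtract multiplicand
        sign = p & (1 << (n - 1))
        p = (p >> 1) | sign                               # arithmetic right shift
    p = (p >> 1) & ((1 << (2 * bits)) - 1)                # drop Q-1, keep 2*bits product bits
    if p & (1 << (2 * bits - 1)):                         # interpret product as signed
        p -= 1 << (2 * bits)
    return p

# 2 x -3 with 4-bit operands:
print(booth_multiply(2, -3, 4))  # -6
```

Booth's method replaces runs of 1s in the multiplier with one addition and one subtraction, so it handles negative multipliers naturally and can reduce the number of additions.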
*****