CSC258 Week 6258/files/andrew/lec06-ALU.pdf · Announcements §Next week úReading ... 1 1 1 17...

CSC258 Week 6

Announcements

§ Next weekú Reading week! ú No lecture, no lab

§ Week after reading weekú Monday: midtermú Lectures as normalú No lab

Midterm Time & Location

Monday, February 24, 7:10-8:40 p.m.Locations:

• IB-120 (UTORID A-M)• MN-1210 (UTORID N-Z)No aids permitted.

Bring your TCard to scan.

Study Tips

1. Review lectures (especially quizzes) from W1-W6

• Do some of the recommended exercises

2. Review what you did for labs1-5

3. Practice (timed) with a past test

• Example posted on course web page

4. Post questions on the discussion board or visit office hours

Pre-test Office Hours

Larry Zhang

Tuesday, Feb 18, 5 - 6 PMThursday, Feb 20, 11 AM -12:30 PMMonday, Feb 24, 1 - 3 PM

Andrew Petersen

Wednesday, Feb 19, 10 AM - 12 PMFriday, Feb 21, 1 - 3 PMMonday, Feb 24, 10 AM - 12 PM

Lab 5 Tip

In simulation or on the board, I’m getting indeterminate output values all the time. What can I do?

§ Add a RESET signal which ANDs with the inputs of the flip-flops, so when RESET is 0, the inputs are forced set to 0.

§ Use the pre-made D-flip-flop in Logisim

Circuit Timing

Timing

§ So far, we have been worried about whether a circuit is functionally correct.

§ Now let’s think about the delay within a circuit.

§ Key concept: latencyú propagation delayú contamination delay

Delay Example

§ We measure the interval between the two “50% points” of the changing signals

Propagation & Contamination Delay

§ Propagation delay: the maximum time from when an input changes until the output or outputs reach their final value.

§ Contamination delay: the minimum time from when an input changes until any output starts to change its value.

Propagation & Contamination Delay

§ Propagation delay: the delay from inputs received until answer ready.

§ Contamination delay: theamount of time we have before the output is corrupted (unusable).

Given a circuit diagram,calculate its propagation delay and contamination delay

Need to know

§ The propagation and contamination delay of each logic gate used

§ Wire delay is generally ignored ormodeled in a simplified way

Gate t_pd(propagation)

t_cd(contamination)

2-input AND 100 picoseconds 60 picoseconds

2-input OR 120 picoseconds 40 picoseconds

Calculate Propagation Delay

§ Find the critical path (path with the largest number of gates)§ then sum up the propagation delay of all the gates on the

critical path§ 100 + 120 + 100 = 320 picoseconds

Gate t_pd t_cd

2-input AND 100 ps 60 ps

2-input OR 120 ps 40 ps

Calculate Contamination Delay

§ Find the short path (path with the smallest number of gates)§ then sum up the contamination delay of all the gates on the

short path§ 60 picoseconds

Gate t_pd t_cd

2-input AND 100 ps 60 ps

2-input OR 120 ps 40 ps

Knowing how to calculate delays allows us to compare designs by speed for particular technologies .

This allows us to design circuits that are fast.

WE = 1

WE = 0

WE A Y

Quick intro: Tri-state buffer

Example: fast 4-1 mux

two different 4-1 muxes

Example: fast 4-1 mux

• We care about the propagation delays of the two circuits.• it tells us “how soon

can I get the answer”• More specifically, we care

about the D-to-Y delay and S-to-Y delay because D and S may arrive at different time.

§ D-to-Y propagation delay:§ 2 x TRISTATE_AY = 100

§ S-to-Y propagation delay§ TRISTATE_ENY + TRISTATE_AY§ = 35 + 50 = 85

§ D-to-Y propagation delay:§ TRISTATE_AY = 50

§ S-to-Y propagation delay§ NOT + AND2 + TRISTATE_ENY § = 30 + 60 + 35 = 125

Analysis result

§ Circuit 1 propagation:ú D-to-Y: 100 psú S-to-Y: 85 ps

§ Circuit 2 propagationú D-to-Y: 50 psú S-to-Y: 125 ps

§ Which circuit is faster?ú What if D and S arrive at the same time?ú What if D arrives earlier than S?ú What if S arrives earlier than D?

Delays: the lower/higher, the better?

§ Propagation delay, typically, should be upper-bounded.ú shorter propagation means getting

answer fasterú How to make it lower?ú shorten the critical path

§ Contamination delay, typically, should be lower-boundedú want to reliably sample the value

before change.ú How to make it longer?ú add buffers to the short path

New Topic:

Processor Components

Using what we have learned so far (combinational logic, devices, sequential circuits, FSMs), how do we build a processor?

Microprocessors

§ So far, we’ve been talkingabout making devices,such as adders, countersand registers.

§ The ultimate goal is tomake a microprocessor, which is a digital device that processes input, can store values and produces output, according to a set of on-board instructions.

Read reg 1

Read reg 2

Write reg

Write data

Read data 1

Read data 2

Registers

ALU result

Instruction [31-26]

Instruction Register

Instruction [25-21]

Instruction [20-16]

Instruction [15-0] 0

1Memory data

register

Memory data

Memory

Address

Write data

ALU Out

012Shift left 2

PCWriteCond

PCWrite

MemRead

MemWrite

MemtoReg

IRWrite

PCSourceALUOp

ALUSrcB

ALUSrcA

RegWrite

RegDstOpcode

ControlUnit

Shift left 2Sign extend

The Final Destination

Deconstructing Processors

§ Processors aren’t so bad when you consider them piece by piece:

Storage Thing

Arithmetic Thing

Controller Thing

Microprocessors

§ These devices are acombination of theunits that we’vediscussed so far:ú Registers to store values.ú Adders and shifters to process data.ú Finite state machines to control the process.

§ Microprocessors are the basis of all computing since the 1970’s and can be found in nearly every sort of electronics.

aka: the Arithmetic Logic Unit (ALU)

The “Arithmetic Thing”

We are here

Assembly Language

ProcessorsFinite State Machines

Arithmetic Logic Units

Devices Flip-flops

Circuits

Transistors

Arithmetic Logic Unit

§ The first microprocessorapplications were calculators.ú Recall the unit on adders and

subtractors.ú These are part of a larger

structure called the arithmetic logic unit (ALU).

§ This larger structure is responsible for the processing of all data values in a basic CPU.

ALU inputs§ The ALU performs all of

the arithmetic operationscovered in this course sofar, and logical operationsas well (AND, OR, NOT, etc.)ú A and B are the operandsú The select bits (S) indicate which operation is

being performed (S2 is a mode select bit, indicating whether the ALU is in arithmetic or logic mode).

ú The carry bit Cin is used in operations such as incrementing an input value or the overall result.

Cin,S VCNZ

ALU outputs

§ In addition to the inputsignals, there are outputsignals V, C, N & Z whichindicate special conditionsin the arithmetic result:ú V: overflow condition

The result of the operation could not be stored in the n bits of G, meaning that the result is incorrect.

ú C: carry-out bitú N: Negative indicatorú Z: Zero-condition indicator

Cin,S VCNZ

The “A” of ALU§ To understand how the ALU does all of these operations,

let’s start with the arithmetic side.§ Fundamentally, this side is made of an adder / subtractor

unit, which we’ve seen already:

C3Cout

ALU block diagram§ In addition to data inputs and outputs, this circuit

also has: ú outputs indicating the different conditions,ú inputs specifying the operation to perform (similar to Sub).

n-bit ALU

A0A1…An

B0B1…Bn

G0G1…Gn

Data input A

Data input B

Data output G

Carry input

Operation &Mode select

Cout Carry outputOverflow indicatorNegative indicatorZero indicator

Arithmetic components

§ In addition to addition and subtraction, many more operations can be performed by manipulating what is added to input B, as shown in the diagram above.

B inputlogic

n-bit paralleladder

G = X + Y + Cin

Arithmetic operations

§ If the input logic circuit on the left sends B straight through to the adder, result is G = A+B

§ What if Bwas replaced by all ones instead?ú Result of addition operation: G = A-1

§ What if Bwas replaced by B?ú Result of addition operation: G = A-B-1

§ And what if Bwas replaced by all zeroes?ú Result is: G = A. (Not interesting, but useful!)

à Instead of a Sub signal, the operation you want is signaled using the select bits S0 & S1.

Operation selection

§ This is a good start! But something is missing…§ What about the carry bit?

Select bits Y

inputResult Operation

0 0 All 0s G = A Transfer

0 1 B G = A+B Addition

1 0 B G = A+B Subtraction - 1

1 1 All 1s G = A-1 Decrement

Full operation selection

§ Based on the values on the select bits and the carry bit, we can perform any number of basic arithmetic operations by manipulating what value is added to A.

Select Input Operation

S1 S0 Y Cin=0 Cin=1

0 0 All 0s G = A (transfer) G = A+1 (increment)

0 1 B G = A+B (add) G = A+B+1

1 0 B G = A+B G = A+B+1 (subtract)

1 1 All 1s G = A-1 (decrement) G = A (transfer)

The “L” of ALU§ We also want a circuit

that can performlogical operations,in addition toarithmetic ones.

§ How do we tellwhich operationto perform?ú Another select bit!

§ If S2 = 1, then logic circuit block is activated.§ Multiplexer is used to determine which block

(logical or arithmetic) goes to the output.

4-to-1mux

Single ALU Stage

Logiccircuit

AiBi Arithmetic

circuitS0S1

CiCi+1Ci

Multiplication

What about multiplication?

§ Multiplication (and division) operations are always more complicated than other arithmetic (addition, subtraction) or logical (AND, OR) operations.

§ Three major ways that multiplication can be implemented in circuitry:ú Layered rows of adder units.ú An adder/shifter circuitú Booth’s Algorithm

Multiplication

§ Multiplier circuits canbe constructed asan array of addercircuits.

§ This can get expensive as the size of the operands grows.

§ Is there an alternative to this circuit?

Multiplication

§ Revisiting grade 3 math…

123x 456

12 3x 456

1 2 3x 456

1368912

1 23x 456

1368912456

123x 456

1368912456

Multiplication

§ And now, in binary…

101x 110

10 1x 110

1 0 1x 110

110000

1 01x 110

110000110

101x 110

110000110

Observations

§ Calculation flowú Multiply by 1 bit of multiplierú Add to sum and shift sumú Shift multiplier by 1 bitú Repeat the above

§ What is “multiply by 1 bit of binary”?ú 10101 x 1 ?ú 10101 x 0 ?ú It’s an AND!

101x 110

110000110

Accumulator circuits§ What if you could perform each stage of the

multiplication operation, one after the other?ú This circuit would only

need a single row ofadders and a coupleof shift registers.

Register R

Shift Left 1

Register Y

Register X

1xn AND

101x 110

110000110

Make it more efficient

Think about 258 x 9999§ Multiply by 9, add to sum, shift, multiply by 9, add to sum, shift,

multiple by 9, add to sum, shift, multiply by 9, add to sum.

§ 258 x 9999 = 258 x (10000 - 1) = 258 x 10000 – 258§ Just shift 258, becomes 2580000, then do 2580000 – 258§ More efficient!

Efficient Multiplication: Booth’s Algorithm

§ Take advantage of circuits where shifting is cheaper than adding, or where space is at a premium.ú when multiplying by certain values (e.g. 99), it can be easier to think

of this operation as a difference between two products.

§ Consider the shortcut method when multiplying a given decimal value X by 9999:ú X*9999 = X*10000 – X*1

§ Now consider the equivalent problem in binary:ú X*001111 = X*010000 – X*1

§ More details: https://en.wikipedia.org/wiki/Booth%27s_multiplication_algorithm

Reflections on Multiplication

§ Multiplication isn’t as common an operation as addition or subtractio, but occurs enough that its implementation is handled in the hardware.

§ Most common multiplication and division operations are powers of 2. For this, the shift register is used instead of the multiplier circuit.

ú e.g., x * 8 is implemented as x << 3

Storage Thing

Arithmetic Thing

Controller Thing

CSC258 Week 6258/files/andrew/lec06-ALU.pdf · Announcements §Next week úReading ... 1 1 1 17...

Documents

Transcript of CSC258 Week 6258/files/andrew/lec06-ALU.pdf · Announcements §Next week úReading ... 1 1 1 17...

Computer Arithmetic COMPUTER ARITHMETICathena.ecs.csus.edu/~changw/137/lecture/Ch-10-CPU-ALU.pdf · Computer Arithmetic 1 Computer Organization • Arithmetic with Signed-2's Complement

CSC258 Week 11 - University of Toronto258/files/larry/lec11.pdf · CSC258 Week 11 1 Note: all lecture activities will be recorded and the recording will be made available on the CSC258

CSC258 Week 10 - mcs.utm.utoronto.ca

€¦ · XLS file · Web view · 2015-10-041. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.

BTSTAL. Aluminium - Aarhusstaff.iha.dk/lc/aluminium/Notater_opgave/Alu.pdf · Aluminium Materialeværdier og ... Recommended partial factors for ... about the major axis the elastic

Quartus II Introduction Using Schematic Designsylzhang/csc258/files/tut_quartus.pdf · Quartus II Introduction Using Schematic Designs 1Introduction This tutorial presents an introduction

11 Add, Subtract, Compare, ALU - University of Florida Add, Subtract, Compare, ALU.pdf · >Binary, Half & Full •Canonical forms •Binary Subtraction •Full-Subtractor ... •Actually,

CENG 3420 Lecture 05:Arithmetic and Logic Unitbyu/CENG3420/2017Spring/L05-ALU.pdf · CENG3420 L05.10 Spring 2017 Machine Number Representation q Bits are just bits (have no inherent

High Performance Computing: Blue-Gene and Road Runnercs.rochester.edu/u/sandhya/csc258/seminars/Patel_HPC.pdf · Overview of the Blue Gene/L System Architecture. IBM • Journal of

CSC258 Week 8 - University of Toronto258/files/larry/lec08print.pdf · See the “cache exercise” handout (posted on the course website). 3 Discussion What is the effect of increasing:

CSC258 Week 9 - University of Toronto

CPU Architecturessengels/csc258/lectures/M68k_1up.pdf · CSC258 Lecture Slides © Steve Engels, 2006 Slide 1 of 38 CPU Architectures • Several CPU architectures exist currently:

MapReduce: A Programming Model for Large-Scale …kshen/csc258-spring...MapReduce: Overview A programming model for large-scale data-parallel applications introduced by Google Aimed

Morse Code Decoder - Department of Computer Science ...milic/csc258/report.pdf · Morse Code Decoder Sa sa Mili c g3milic Madelaine Page g3maddie We implemented a Morse code decoder

CSC258: Computer Organization Microarchitecturepeters43/258/lectures/w7/processor.pdf · 2. In one sentence, ... The Processor from Lab ... (PC) to the first instruction in memory

CSC384: Intro to Artificial Intelligencesengels/csc258/lectures/Devices_1up.pdf · CSC258 Lecture Slides © Steve Engels, ... – Counters – Multiplexers – Decoders ... simple

1 1 1 1 1 1 1 ¢ 1 , ¢ 1 1 1 , 1 1 1 1 ¡ 1 1 1 1 · 1 1 1 1 1 ] ð 1 1 w ï 1 x v w ^ 1 1 x w [ ^ \ w _ [ 1. 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ð 1 ] û w ü

Broodmare Report 5-11-17 · 2017. 5. 11. · 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 < 1 1 1 1 1 1 1 1 1

CSC258 Week 1258/files/larry/lec01.pdf · § On the discussion board, use “Search Teammates” for finding a partner. 16. Prelab Reports ... § The project will be due on the last

CSC258 Week 4 - mcs.utm.utoronto.ca258/files/larry/lec04.pdf · than or equal to the second. ... 0 1 X X 0 1 1 0 X X 1 0 1 1 X X 0 0 Forbidden state SR-latch SR-latch “set” and