Computer Science 210 s1c Computer Systems 1

2010 Semester 1

Lecture Notes

James Goodman

Credits: Slides prepared by Gregory T. Byrd, North Carolina State University

Assembly Language

Lecture 14, 31Mar10:

31Mar10 CS210 173

Human-Readable Machine Language

Computers like ones and zeros…
    0001110010000110

Humans like symbols…
    ADD R6,R2,R6 ; increment index reg.

An assembler is a program that turns symbols into machine instructions.
• ISA-specific: close correspondence between symbols and instruction set
    – mnemonics for opcodes
    – labels for memory locations
• additional operations for allocating storage and initializing data
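The correspondence between ADD R6,R2,R6 and 0001110010000110 can be checked with a few lines of Python. This is only a sketch of the register form of ADD (opcode 0001, then DR, SR1, three zero bits, SR2); the function name `encode_add_reg` is made up for illustration and is not part of any LC-3 tool.

```python
def encode_add_reg(dr: int, sr1: int, sr2: int) -> str:
    """Encode the register form of LC-3 ADD as a 16-bit binary string."""
    for reg in (dr, sr1, sr2):
        if not 0 <= reg <= 7:
            raise ValueError("register number must be in 0..7")
    opcode = 0b0001  # ADD
    # Layout: opcode[15:12] DR[11:9] SR1[8:6] 000[5:3] SR2[2:0]
    word = (opcode << 12) | (dr << 9) | (sr1 << 6) | sr2
    return format(word, "016b")


print(encode_add_reg(6, 2, 6))  # ADD R6,R2,R6 -> 0001110010000110
```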

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

An Assembly Language Program

;
; Program to multiply a number by the constant 6
;
        .ORIG x3050
        LD    R1, SIX
        LD    R2, NUMBER
        AND   R3, R3, #0     ; Clear R3. It will
                             ; contain the product.
; The inner loop
;
AGAIN   ADD   R3, R3, R2
        ADD   R1, R1, #-1    ; R1 keeps track of
        BRp   AGAIN          ; the iteration.
;
        HALT
;
NUMBER  .BLKW 1
SIX     .FILL x0006
;
        .END

LC-3 Assembly Language Syntax

Each line of a program is one of the following:
• an instruction
• an assembler directive (or pseudo-op)
• a comment

Whitespace (between symbols) and case are ignored. Comments (beginning with “;”) are also ignored.

An instruction has the following format:

    LABEL  OPCODE  OPERANDS  ; COMMENTS

LABEL and COMMENTS are optional; OPCODE and OPERANDS are mandatory.

Opcodes and Operands

Opcodes
• reserved symbols that correspond to LC-3 instructions
• listed in Appendix A
    – ex: ADD, AND, LD, LDR, …

Operands
• registers -- specified by Rn, where n is the register number
• numbers -- indicated by # (decimal) or x (hex)
• label -- symbolic name of memory location
• separated by comma
• number, order, and type correspond to instruction format
    – ex:
        ADD R1,R1,R3
        ADD R1,R1,#3
        LD  R6,NUMBER
        BRz LOOP

Labels and Comments

Label
• placed at the beginning of the line
• assigns a symbolic name to the address corresponding to that line
    – ex:
        LOOP  ADD R1,R1,#-1
              BRp LOOP

Comment
• anything after a semicolon is a comment
• ignored by assembler
• used by humans to document/understand programs
• tips for useful comments:
    – avoid restating the obvious, such as “decrement R1”
    – provide additional insight, as in “accumulate product in R6”
    – use comments to separate pieces of program

Assembler Directives

Pseudo-operations
• do not refer to operations executed by program
• used by assembler
• look like instructions, but “opcode” starts with a full stop

Opcode     Operand              Meaning
.ORIG      address              starting address of program
.END                            end of program
.BLKW      n                    allocate n words of storage
.FILL      n                    allocate one word, initialize with value n
.STRINGZ   n-character string   allocate n+1 locations, initialize w/characters and null terminator

Trap Codes

LC-3 assembler provides “pseudo-instructions” for each trap code, so you don’t have to remember them.

Code   Equivalent   Description
HALT   TRAP x25     Halt execution and print message to console.
IN     TRAP x23     Print prompt on console, read (and echo) one character from keyboard. Character stored in R0[7:0].
OUT    TRAP x21     Write one character (in R0[7:0]) to console.
GETC   TRAP x20     Read one character from keyboard. Character stored in R0[7:0].
PUTS   TRAP x22     Write null-terminated string to console. Address of string is in R0.

Style Guidelines

Use the following style guidelines to improve the readability and understandability of your programs:

1. Provide a program header, with author’s name, date, etc., and purpose of program.
2. Start labels, opcode, operands, and comments in same column for each line. (Unless entire line is a comment.)
3. Use comments to explain what each register does.
4. Give explanatory comment for most instructions.
5. Use meaningful symbolic names.
    – Mixed upper and lower case for readability.
    – ASCIItoBinary, InputRoutine, SaveR1
6. Provide comments between program sections.
7. Each line must fit on the page -- no wraparound or truncations.
    – Long statements split in aesthetically pleasing manner.

Sample Program

Count the occurrences of a character in a file. Remember this?

;
; Program to count occurrences of a character in a file.
; Character to be input from the keyboard.
; Result to be displayed on the monitor.
; Program only works if no more than 9 occurrences are found.
;
;
; Initialization
;
        .ORIG x3000
        AND   R2, R2, #0     ; R2 is counter, initially 0
        LD    R3, PTR        ; R3 is pointer to characters
        GETC                 ; R0 gets character input
        LDR   R1, R3, #0     ; R1 gets first character
;
; Test character for end of file
;
TEST    ADD   R4, R1, #-4    ; Test for EOT (ASCII x04)
        BRz   OUTPUT         ; If done, prepare the output
;
; Test character for match. If a match, increment count.
;
        NOT   R1, R1
        ADD   R1, R1, R0     ; If match, R1 = xFFFF
        NOT   R1, R1         ; If match, R1 = x0000
        BRnp  GETCHAR        ; If no match, do not increment
        ADD   R2, R2, #1
;
; Get next character from file.
;
GETCHAR ADD   R3, R3, #1     ; Point to next character.
        LDR   R1, R3, #0     ; R1 gets next char to test
        BRnzp TEST
;
; Output the count.
;
OUTPUT  LD    R0, ASCII      ; Load the ASCII template
        ADD   R0, R0, R2     ; Convert binary count to ASCII
        OUT                  ; ASCII code in R0 is displayed.
        HALT                 ; Halt machine
;
; Storage for pointer and ASCII template
;
ASCII   .FILL x0030
PTR     .FILL x4000
        .END

Recommended Homework (no credit; do not turn in)

Download the LC-3 simulator package from the resources page <http://www.cs.auckland.ac.nz/compsci210s1c/resources/>. For running on Windows, read the document LC3WinGuide.pdf. (You may also run the simulator under Linux: read the document LC3_unix.pdf.)

Follow the instructions for running a programme: create the files described in the example and execute the programme.

Create a source file from the text of the programme discussed in the lecture (figures 5.16 & 7.2 in the book).

Create a “file” starting in memory at location x4000. Assemble the programme. Execute the programme, typing different characters, and make sure the programme prints the correct result. What goes wrong if the character you enter occurs more than 9 times in the file?

The Assembly Process

Convert assembly language file (.asm) into an executable file (.obj) for the LC-3 simulator.

First Pass:
• scan program file
• find all labels and calculate the corresponding addresses; the resulting list of labels and addresses is called the symbol table

Second Pass:
• convert instructions to machine language, using information from symbol table

First Pass: Constructing the Symbol Table

1. Find the .ORIG statement, which tells us the address of the first instruction.
    – Initialize location counter (LC), which keeps track of the current instruction.
2. For each non-empty line in the program:
    a) If line contains a label, add label and LC to symbol table.
    b) Increment LC.
        – NOTE: If statement is .BLKW or .STRINGZ, increment LC by the number of words allocated.
3. Stop when .END statement is reached.

NOTE: A line that contains only a comment is considered an empty line.
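The first pass can be sketched in Python as follows. This is a simplified model, not the real assembler: the names `first_pass`, `is_operation` and `OPS` are invented for illustration, numbers after .ORIG are assumed to be written as xNNNN, and .STRINGZ handling ignores escape sequences and semicolons inside strings. The sample input is the character-count program from these notes with its comments removed.

```python
import re

OPS = {"ADD", "AND", "NOT", "LD", "LDI", "LDR", "LEA", "ST", "STI", "STR",
       "JMP", "JSR", "JSRR", "RET", "RTI", "TRAP",
       "GETC", "OUT", "PUTS", "IN", "HALT"}
BRANCH = re.compile(r"^BRN?Z?P?$")


def is_operation(token):
    """True if token is an opcode, a trap alias, a branch, or a directive."""
    t = token.upper()
    return t in OPS or t.startswith(".") or BRANCH.match(t) is not None


def first_pass(source):
    """Return {label: address}, following the three steps on the slide."""
    symbols, lc = {}, None
    for raw in source.splitlines():
        line = raw.split(";", 1)[0].strip()   # comment-only lines count as empty
        if not line:
            continue
        tokens = line.replace(",", " ").split()
        if not is_operation(tokens[0]):       # step 2a: line starts with a label
            symbols[tokens[0]] = lc
            tokens = tokens[1:]
            if not tokens:
                continue
        op = tokens[0].upper()
        if op == ".ORIG":                     # step 1: initialise the LC
            lc = int(tokens[1][1:], 16)
        elif op == ".END":                    # step 3
            break
        elif op == ".BLKW":
            lc += int(tokens[1].lstrip("#"))
        elif op == ".STRINGZ":
            lc += len(re.search(r'"(.*)"', line).group(1)) + 1
        else:                                 # step 2b
            lc += 1
    return symbols


SOURCE = """
        .ORIG x3000
        AND R2, R2, #0
        LD R3, PTR
        GETC
        LDR R1, R3, #0
TEST    ADD R4, R1, #-4
        BRz OUTPUT
        NOT R1, R1
        ADD R1, R1, R0
        NOT R1, R1
        BRnp GETCHAR
        ADD R2, R2, #1
GETCHAR ADD R3, R3, #1
        LDR R1, R3, #0
        BRnzp TEST
OUTPUT  LD R0, ASCII
        ADD R0, R0, R2
        OUT
        HALT
ASCII   .FILL x0030
PTR     .FILL x4000
        .END
"""

table = first_pass(SOURCE)
for label, address in table.items():
    print(f"{label:8} x{address:04X}")
```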

Second Pass: Generating Machine Language

For each executable assembly language statement, generate the corresponding machine language instruction.
• If operand is a label, look up the address from the symbol table.

Potential problems:
• Improper number or type of arguments
    – ex: NOT R1,#7
          ADD R1,R2
          ADD R3,R3,NUMBER
• Immediate argument too large
    – ex: ADD R1,R2,#1023
• Address (associated with label) more than 256 from instruction
    – can’t use PC-relative addressing mode
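The two range problems come down to whether a value fits in a signed field: 5 bits for the ADD/AND immediate (-16 to 15) and 9 bits for the PC-relative offset of LD, ST and BR (-256 to 255, measured from the incremented PC). A minimal sketch of that check, with `fits_signed` as a made-up helper name:

```python
def fits_signed(value: int, bits: int) -> bool:
    """True if value is representable as a two's-complement number of that width."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return lowest <= value <= highest


print(fits_signed(3, 5))       # ADD R1,R1,#3 is fine
print(fits_signed(1023, 5))    # ADD R1,R2,#1023 is rejected
print(fits_signed(255, 9), fits_signed(256, 9))  # edge of the PC-relative range
```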

Practice

Using the symbol table constructed earlier, translate these statements into LC-3 machine language.

Statement         Machine Language
LD R3,PTR         0010 0110 0001 0001
ADD R4,R1,#-4     0001 1000 0111 1100
LDR R1,R3,#0      0110 0010 1100 0000
BRnp GETCHAR      0000 1010 0000 0001
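Two of these answers depend on the symbol table, and the lookup can be sketched in Python. The label addresses and the instruction addresses (x3001 for the LD, x3009 for the BRnp) are assumptions obtained by counting lines from .ORIG x3000 in the character-count listing; `pc_offset9`, `encode_ld` and `encode_br` are illustrative names only.

```python
SYMBOLS = {"TEST": 0x3004, "GETCHAR": 0x300B, "OUTPUT": 0x300E,
           "ASCII": 0x3012, "PTR": 0x3013}


def pc_offset9(label: str, address: int) -> int:
    """9-bit offset from the incremented PC (address + 1) to the label."""
    offset = SYMBOLS[label] - (address + 1)
    if not -256 <= offset <= 255:
        raise ValueError(f"{label} is out of PC-relative range")
    return offset & 0x1FF


def encode_ld(dr: int, label: str, address: int) -> str:
    """LD: opcode 0010, DR, PCoffset9."""
    return format((0b0010 << 12) | (dr << 9) | pc_offset9(label, address), "016b")


def encode_br(n: int, z: int, p: int, label: str, address: int) -> str:
    """BR: opcode 0000, n/z/p flags, PCoffset9."""
    word = (n << 11) | (z << 10) | (p << 9) | pc_offset9(label, address)
    return format(word, "016b")


print(encode_ld(3, "PTR", 0x3001))               # LD R3,PTR
print(encode_br(1, 0, 1, "GETCHAR", 0x3009))     # BRnp GETCHAR
```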

LC-3 Assembler

Using “assemble” (Unix) or LC3Edit (Windows) generates several different output files. The object file (.obj) is the one that gets loaded into the simulator.

Object File Format

LC-3 object file contains
• Starting address (location where program must be loaded), followed by…
• Machine instructions

Example
• Beginning of “count character” object file looks like this:

    0011000000000000    .ORIG x3000
    0101010010100000    AND R2, R2, #0
    0010011000010001    LD R3, PTR
    1111000000100011    TRAP x23
    ...
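How a loader would interpret those words can be sketched in Python: the first word is the starting address and each following word goes into the next memory location. The function name `load_image` and the use of a dictionary as memory are assumptions for illustration; the on-disk byte layout of a real .obj file is not modelled here.

```python
OBJECT_WORDS = [
    "0011000000000000",  # starting address
    "0101010010100000",  # AND R2, R2, #0
    "0010011000010001",  # LD R3, PTR
    "1111000000100011",  # TRAP x23
]


def load_image(words):
    """Place the instruction words at consecutive addresses from the origin."""
    values = [int(word, 2) for word in words]
    origin, body = values[0], values[1:]
    return {origin + i: value for i, value in enumerate(body)}


memory = load_image(OBJECT_WORDS)
for address, value in memory.items():
    print(f"x{address:04X}: x{value:04X}")
```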

Multiple Object Files

An object file is not necessarily a complete program.
• system-provided library routines
• code blocks written by multiple developers

For LC-3 simulator, can load multiple object files into memory, then start executing at a desired address.
• system routines, such as keyboard input, are loaded automatically
    – loaded into “system memory,” below x3000
    – user code should be loaded between x3000 and xFDFF
• each object file includes a starting address
• be careful not to load overlapping object files

Linking and Loading

Loading is the process of copying an executable image into memory.
• more sophisticated loaders are able to relocate images to fit into available memory
• must readjust branch targets, load/store addresses

Linking is the process of resolving symbols between independent object files.
• suppose we define a symbol in one module, and want to use it in another
• some notation, such as .EXTERNAL, is used to tell assembler that a symbol is defined in another module
• linker will search symbol tables of other modules to resolve symbols and complete code generation before loading

Input & Output

News from the NYTimes (June ’96)

“When a computer runs out of [RAM memory], modern operating systems automatically use the memory on the hard drive. But today’s hard drives retrieve data at speeds of about 10 milliseconds (millionths of a second). That seems fast until you consider that modern RAM can do this at 60 nanoseconds (billionths of a second), more than 150 times as fast.”

What’s wrong with this statement??
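One way to approach the question is to redo the arithmetic with consistent units. The sketch below takes the two figures in the quote at face value (10 milliseconds and 60 nanoseconds) and uses the standard definitions: a millisecond is a thousandth of a second, a nanosecond a billionth.

```python
NS_PER_MS = 1_000_000          # 1 ms = 10^-3 s = 1,000,000 ns

disk_access_ns = 10 * NS_PER_MS
ram_access_ns = 60

ratio = disk_access_ns / ram_access_ns
print(f"RAM is about {ratio:,.0f} times as fast")
```

With these numbers the ratio comes out near 167,000, three orders of magnitude away from the “more than 150 times” in the quote.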

I/O Device Examples

Device             Behavior          Partner   Data Rate (KB/sec)
Keyboard           Input             Human     0.01
Mouse              Input             Human     0.02
Laser Printer      Output            Human     1000
Graphics Display   Output            Human     30,000
Network-LAN        Input or Output   Machine   200-1,000,000
Floppy disk        Storage           Machine   50
CD-ROM (1x)        Storage           Machine   150
DVD-ROM (1x)       Storage           Machine   1352
Magnetic Disk      Storage           Machine   100,000
Flash Memory       Storage           Machine   30,000

Time Line

[Figure: time line, Time (Logarithmic Scale) from 10^-10 to 10^0, with markers for 1 second, 1 minute, 1 hour, 1 day, 1 month and 1 year; scale by 31,557,600.]

Speed Line

[Figure: speed line, Time (Logarithmic Scale) from 10^-10 to 10^0, marking: time for light to travel 30 cm; one clock period at 2 GHz; execute one instruction (best case); cache hit time; cache miss time (memory access time); read 1 byte from disk; transfer 1 char at 56K baud; time for sound to travel 30 cm; one disk revolution (6-8 ms); total disk access time.]

Synchronisation

• What happens if you try to print a file on a printer already in use?
• What happens if you try to read a character before it’s typed?
• What happens to a sequence of characters you type in before you read them?
• What happens if you send characters to a printer faster than it can accept them?

I/O Devices are Cantankerous

• Many I/O devices have a mechanical component
• They are very slow relative to electronic speeds
• They respond when they’re ready, not necessarily when it’s convenient
• They may not be willing to wait forever for their input (overrun)
• The CPU is the slave: it must synchronize

Input & Output

Lecture 17, 21Apr10:

I/O: Connecting to Outside World

So far, we’ve learned how to:
• compute with values in registers
• load data from memory to registers
• store data from registers to memory

But where does data in memory come from?

And how does data get out of the system so that humans can use it?

I/O: Connecting to the Outside World

Types of I/O devices characterized by:
• behavior: input, output, storage
    – input: keyboard, motion detector, network interface
    – output: monitor, printer, network interface
    – storage: disk, CD-ROM
• data rate: how fast can data be transferred?
    – keyboard: 100 bytes/sec
    – disk: 30 MB/s
    – network: 1 Mb/s - 1 Gb/s

I/O Controller

Control/Status Registers
• CPU tells device what to do -- write to control register
• CPU checks whether task is done -- read status register

Data Registers
• CPU transfers data to/from device

Device electronics
• performs actual operation
    – pixels to screen, bits to/from disk, characters from keyboard

[Figure: CPU connected to a graphics controller (Control/Status, Output Data, Electronics), which drives the display.]

Programming Interface

How are device registers identified?
• Memory-mapped vs. special instructions

How is timing of transfer managed?
• Asynchronous vs. synchronous

Who controls transfer?
• CPU (polling) vs. device (interrupts)

Memory-Mapped vs. I/O Instructions

Instructions
• designate opcode(s) for I/O
• register and operation encoded in instruction

Memory-mapped
• assign a memory address to each device register
• use data movement instructions (LD/ST) for control and data transfer

Transfer Timing

I/O events generally happen much more slowly than CPU cycles.

Synchronous
• data supplied at a fixed, predictable rate
• CPU reads/writes every X cycles

Asynchronous
• data rate less predictable
• CPU must synchronize with device, so that it doesn’t miss data or write too quickly

Transfer Control

Who determines when the next data transfer occurs?

Polling
• CPU keeps checking status register until new data arrives OR device ready for next data
• “Are we there yet? Are we there yet? Are we there yet?”

Interrupts
• Device sends a special signal to CPU when new data arrives OR device ready for next data
• CPU can be performing other tasks instead of polling device.
• “Wake me when we get there.”

LC-3

Memory-mapped I/O (Table A.3)

Location   I/O Register                    Function
xFE00      Keyboard Status Reg (KBSR)      Bit [15] is one when keyboard has received a new character.
xFE02      Keyboard Data Reg (KBDR)        Bits [7:0] contain the last character typed on keyboard.
xFE04      Display Status Register (DSR)   Bit [15] is one when device ready to display another char on screen.
xFE06      Display Data Register (DDR)     Character written to bits [7:0] will be displayed on screen.

Asynchronous devices
• synchronized through status registers

Polling and Interrupts
• the details of interrupts will be discussed in Chapter 10

Input from Keyboard

When a character is typed:
• its ASCII code is placed in bits [7:0] of KBDR (bits [15:8] are always zero)
• the “ready bit” (KBSR[15]) is set to one
• keyboard is disabled -- any typed characters will be ignored

When KBDR is read:
• KBSR[15] is set to zero
• keyboard is enabled

[Figure: KBSR with the ready bit in bit 15; KBDR with keyboard data in bits [7:0].]

Basic Input Routine

Polling

[Flowchart: new char? NO: check again. YES: read character.]

POLL     LDI   R0, KBSRPtr
         BRzp  POLL
         LDI   R0, KBDRPtr
         ...
KBSRPtr  .FILL xFE00
KBDRPtr  .FILL xFE02
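The polling loop can be tried out against a toy model of the keyboard registers. Everything here is a simulation sketch: the `Keyboard` class, `read_char`, and the `delay` parameter (how many polls pass before the simulated user presses a key) are invented for illustration and only mirror the ready-bit behaviour described in these notes.

```python
class Keyboard:
    """Toy model of KBSR (ready bit in bit 15) and KBDR (data in bits [7:0])."""

    def __init__(self):
        self.kbsr = 0x0000
        self.kbdr = 0x0000

    def type_char(self, ch):
        if self.kbsr & 0x8000:
            return                      # keyboard disabled until KBDR is read
        self.kbdr = ord(ch) & 0x00FF
        self.kbsr |= 0x8000             # set the ready bit

    def read_kbdr(self):
        self.kbsr &= 0x7FFF             # reading KBDR clears the ready bit
        return self.kbdr


def read_char(keyboard, keystrokes, delay=3):
    """Poll KBSR until bit 15 is set, then read KBDR (POLL ... LDI R0, KBDRPtr)."""
    polls = 0
    while not (keyboard.kbsr & 0x8000):  # BRzp POLL: bit 15 clear, keep waiting
        polls += 1
        if polls == delay:               # the simulated user finally types
            keyboard.type_char(next(keystrokes))
    return chr(keyboard.read_kbdr()), polls


kbd = Keyboard()
char, wasted_polls = read_char(kbd, iter("a"))
print(char, wasted_polls)
```

The count of wasted polls is the cost that interrupt-driven I/O is meant to avoid.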

Simple Implementation: Memory-Mapped Input

Address Control Logic determines whether MDR is loaded from Memory or from KBSR/KBDR.

Output to Monitor

When Monitor is ready to display another character:
• the “ready bit” (DSR[15]) is set to one

When data is written to Display Data Register:
• DSR[15] is set to zero
• character in DDR[7:0] is displayed
• any other character data written to DDR is ignored (while DSR[15] is zero)

[Figure: DSR with the ready bit in bit 15; DDR with output data in bits [7:0].]

Basic Output Routine

Polling

[Flowchart: screen ready? NO: check again. YES: write character.]

POLL     LDI   R1, DSRPtr
         BRzp  POLL
         STI   R0, DDRPtr
         ...
DSRPtr   .FILL xFE04
DDRPtr   .FILL xFE06

Simple Implementation: Memory-Mapped Output

Sets LD.DDR or selects DSR as input.

Keyboard Echo Routine

Usually, input character is also printed to screen.
• User gets feedback on character typed and knows it’s OK to type the next character.

[Flowchart: new char? NO: check again. YES: read character, then screen ready? NO: check again. YES: write character.]

POLL1    LDI   R0, KBSRPtr
         BRzp  POLL1
         LDI   R0, KBDRPtr
POLL2    LDI   R1, DSRPtr
         BRzp  POLL2
         STI   R0, DDRPtr
         ...
KBSRPtr  .FILL xFE00
KBDRPtr  .FILL xFE02
DSRPtr   .FILL xFE04
DDRPtr   .FILL xFE06

Interrupt-Driven I/O

External device can:
(1) Force currently executing program to stop;
(2) Have the processor satisfy the device’s needs; and
(3) Resume the stopped program as if nothing happened.

Why?
• Polling consumes a lot of cycles, especially for rare events; these cycles can be used for more computation.
• Example: Process previous input while collecting current input. (See Example 8.1 in text.)

Interrupt-Driven I/O

To implement an interrupt mechanism, we need:
• A way for the I/O device to signal the CPU that an interesting event has occurred.
• A way for the CPU to test whether the interrupt signal is set and whether its priority is higher than the current program.

Generating Signal
• Software sets “interrupt enable” bit in device register.
• When ready bit is set and IE bit is set, interrupt is signaled.

[Figure: KBSR with the ready bit (bit 15) and the interrupt enable bit (bit 14) combined to produce the interrupt signal to the processor.]

Priority

Every instruction executes at a stated level of urgency. LC-3: 8 priority levels (PL0-PL7)
• Example:
    – Payroll program runs at PL0.
    – Nuclear power correction program runs at PL6.
• It’s OK for PL6 device to interrupt PL0 program, but not the other way around.

Priority encoder selects highest-priority device, compares to current processor priority level, and generates interrupt signal if appropriate.

Testing for Interrupt Signal

CPU looks at signal between STORE and FETCH phases. If not set, continues with next instruction. If set, transfers control to interrupt service routine.

[Figure: instruction cycle F, D, EA, OP, EX, S, followed by the test “interrupt signal?”: NO continues to F; YES transfers to ISR.]

More details in Chapter 10.

Full Implementation of LC-3 Memory-Mapped I/O

Because of interrupt enable bits, status registers (KBSR/DSR) must be written, as well as read.

Subroutines & Traps

Lecture 19, 26Apr10:

System Calls

Certain operations require specialized knowledge and protection:
• specific knowledge of I/O device registers and the sequence of operations needed to use them
• I/O resources shared among multiple users/programs; a mistake could affect lots of other users!

Not every programmer knows (or wants to know) this level of detail

Provide service routines or system calls (part of operating system) to safely and conveniently perform low-level, privileged operations

System Call

1. User program invokes system call.
2. Operating system code performs operation.
3. Returns control to user program.

In LC-3, this is done through the TRAP mechanism.

LC-3 TRAP Mechanism

1. A set of service routines.
    • part of operating system -- routines start at arbitrary addresses (convention is that system code is “below” x3000)
    • up to 256 routines
2. Table of starting addresses.
    • stored at x0000 through x00FF in memory
    • called System Control Block in some architectures
3. TRAP instruction.
    • used by program to transfer control to operating system
    • 8-bit trap vector names one of the 256 service routines
4. A linkage back to the user program.
    • want execution to resume immediately after the TRAP instruction

TRAP Instruction

Trap vector
• identifies which system call to invoke
• 8-bit index into table of service routine addresses
    – in LC-3, this table is stored in memory at 0x0000 – 0x00FF
    – 8-bit trap vector is zero-extended into 16-bit memory address

Where to go
• look up starting address from table; place in PC

How to get back
• save address of next instruction (current PC) in R7

TRAP

NOTE: PC has already been incremented during instruction fetch stage.

RET (JMP R7)

How do we transfer control back to instruction following the TRAP?

We saved old PC in R7.
• JMP R7 gets us back to the user program at the right spot.
• LC-3 assembly language lets us use RET (return) in place of “JMP R7”.

Must make sure that service routine does not change R7, or we won’t know where to return.

TRAP Mechanism Operation

1. Look up starting address.
2. Transfer to service routine.
3. Return (JMP R7).
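These three steps can be modelled in a few lines of Python. This is a sketch of the control transfer only: `trap` and `ret` are illustrative function names, memory is a plain dictionary, and the user-program address x3005 is a made-up example. The table entry (x0430 stored at location x21) is the one used by the output service routine example in these notes.

```python
def trap(memory, pc, trapvect8):
    """Return (new_pc, r7) for a TRAP executed with the already-incremented PC."""
    r7 = pc                              # linkage: address of the next instruction
    table_address = trapvect8 & 0x00FF   # zero-extended 8-bit trap vector
    new_pc = memory[table_address]       # step 1: look up starting address
    return new_pc, r7                    # step 2: transfer to service routine


def ret(r7):
    """RET is JMP R7: the PC is reloaded from R7 (step 3)."""
    return r7


memory = {0x0021: 0x0430}                # trap vector table entry for x21
trap_location = 0x3005                   # hypothetical address of TRAP x21
pc, r7 = trap(memory, trap_location + 1, 0x21)
print(f"service routine at x{pc:04X}, return to x{ret(r7):04X}")
```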

Example: Using the TRAP Instruction

        .ORIG x3000
        LD    R2, TERM      ; Load negative ASCII ‘7’
        LD    R3, ASCII     ; Load ASCII difference
AGAIN   TRAP  x23           ; input character
        ADD   R1, R2, R0    ; Test for terminate
        BRz   EXIT          ; Exit if done
        ADD   R0, R0, R3    ; Change to lowercase
        TRAP  x21           ; Output to monitor...
        BRnzp AGAIN
TERM    .FILL xFFC9         ; -‘7’
ASCII   .FILL x0020         ; lowercase bit
EXIT    TRAP  x25           ; halt
        .END

Example: Output Service Routine

          .ORIG x0430          ; syscall address
          ST    R7, SaveR7     ; save R7 & R1
          ST    R1, SaveR1
          …
; ----- Write character
TryWrite  LDI   R1, CRTSR      ; get status
          BRzp  TryWrite       ; look for bit 15 on
WriteIt   STI   R0, CRTDR      ; write char
          …
; ----- Return from TRAP
Return    LD    R1, SaveR1     ; restore R1 & R7
          LD    R7, SaveR7
          RET                  ; back to user
CRTSR     .FILL xF3FC
CRTDR     .FILL xF3FF
SaveR1    .FILL 0
SaveR7    .FILL 0
          .END

The starting address x0430 is stored in the table, at location x21.

TRAP Routines and their Assembler Names

vector   symbol   routine
x20      GETC     read a single character (no echo)
x21      OUT      output a character to the monitor
x22      PUTS     write a string to the console
x23      IN       print prompt to console, read and echo character from keyboard
x25      HALT     halt the program

Saving and Restoring Registers

Must save the value of a register if:
• Its value will be destroyed by service routine, and
• We will need to use the value after that action.

Who saves?
• caller of service routine?
    – knows what it needs later, but may not know what gets altered by called routine
• called service routine?
    – knows what it alters, but does not know what will be needed later by calling routine

Example

        LEA   R3, Binary
        LD    R6, ASCII     ; char->digit template
        LD    R7, COUNT     ; initialize to 10
AGAIN   TRAP  x23           ; Get char
        ADD   R0, R0, R6    ; convert to number
        STR   R0, R3, #0    ; store number
        ADD   R3, R3, #1    ; incr pointer
        ADD   R7, R7, #-1   ; decr counter
        BRp   AGAIN         ; more?
        BRnzp NEXT
ASCII   .FILL xFFD0
COUNT   .FILL #10
Binary  .BLKW #10

What’s wrong with this routine?

What happens to R7?

Saving and Restoring Registers

Called routine -- “callee-save”
• Before start, save any registers that will be altered (unless altered value is desired by calling program!)
• Before return, restore those same registers

Calling routine -- “caller-save”
• Save registers destroyed by own instructions or by called routines (if known), if values needed later
    – save R7 before TRAP
    – save R0 before TRAP x23 (input character)
• Or avoid using those registers altogether

Values are saved by storing them in memory.

Question

Can a service routine call another service routine?

If so, is there anything special the calling service routine must do?

What about User Code?

Service routines provide three main functions:
1. Shield programmers from system-specific details.
2. Write frequently-used code just once.
3. Protect system resources from malicious/clumsy programmers.

Are there any reasons to provide the same functions for non-system (user) code?

Subroutines

A subroutine is a program fragment that:
• lives in user space
• performs a well-defined task
• is invoked (called) by another user program
• returns control to the calling program when finished

Like a service routine, but not part of the OS
• not concerned with protecting hardware resources
• no special privilege required

Reasons for subroutines:
• reuse useful (and debugged!) code without having to keep typing it in
• divide task among multiple programmers
• use vendor-supplied library of useful routines

JSR Instruction

Jumps to a location (like a branch but unconditional), and saves current PC (addr of next instruction) in R7.
• saving the return address is called “linking”
• target address is PC-relative (PC + Sext(IR[10:0]))
• bit 11 specifies addressing mode
    – if =1, PC-relative: target address = PC + Sext(IR[10:0])
    – if =0, register: target address = contents of register IR[8:6]
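The target-address rule can be sketched in Python. The helper names `sext` and `jsr_target` are made up, the register file is a plain list, and the instruction words and addresses below are hypothetical values built for the example (opcode 0100 with bit 11 set for the PC-relative form, clear for the register form).

```python
def sext(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value to a Python int."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def jsr_target(instruction: int, pc: int, registers):
    """Return (target, r7); pc is the already-incremented PC."""
    r7 = pc                                        # linking
    if instruction & 0x0800:                       # bit 11 = 1: PC-relative
        target = (pc + sext(instruction & 0x07FF, 11)) & 0xFFFF
    else:                                          # bit 11 = 0: base register
        target = registers[(instruction >> 6) & 0x7]
    return target, r7


regs = [0] * 8
regs[2] = 0x5000
print(jsr_target(0x4FFD, 0x3010, regs))   # offset -3 from PC x3010
print(jsr_target(0x4080, 0x3010, regs))   # register form using R2
```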

JSR

NOTE: PC has already been incremented during instruction fetch stage.

JSRR Instruction

Just like JSR, except Register addressing mode.
• target address is Base Register
• bit 11 specifies addressing mode

What important feature does JSRR provide that JSR does not?

JSRR

NOTE: PC has already been incremented during instruction fetch stage.

Returning from a Subroutine

RET (JMP R7) gets us back to the calling routine.
• just like TRAP

Example: Negate the value in R0

2sComp  NOT   R0, R0        ; flip bits
        ADD   R0, R0, #1    ; add one
        RET                 ; return to caller

To call from a program (within 1024 instructions):

        ; need to compute R4 = R1 - R3
        ADD   R0, R3, #0    ; copy R3 to R0
        JSR   2sComp        ; negate
        ADD   R4, R1, R0    ; add to R1
        ...

Note: Caller should save R0 if we’ll need it later!

Passing Information to/from Subroutines

Arguments
• A value passed in to a subroutine is called an argument.
• This is a value needed by the subroutine to do its job.
• Examples:
    – In 2sComp routine, R0 is the number to be negated.
    – In OUT service routine, R0 is the character to be printed.
    – In PUTS routine, R0 is address of string to be printed.

Return Values
• A value passed out of a subroutine is called a return value.
• This is the value that you called the subroutine to compute.
• Examples:
    – In 2sComp routine, negated value is returned in R0.
    – In GETC service routine, character read from the keyboard is returned in R0.

Using Subroutines

In order to use a subroutine, a programmer must know:
• its address (or at least a label that will be bound to its address)
• its function (what does it do?)
    – NOTE: The programmer does not need to know how the subroutine works, only what changes are visible in the machine’s state after the routine has run.
• its arguments (where to pass data in, if any)
• its return values (where to get computed data, if any)

Saving and Restoring Registers

Since subroutines are just like service routines, we also need to save and restore registers, if needed.

Generally use “callee-save” strategy, except for return values.
• Save anything that the subroutine will alter internally that shouldn’t be visible when the subroutine returns.
• It’s good practice to restore incoming arguments to their original values (unless overwritten by return value).

Remember: You MUST save R7 if you call any other subroutine or service routine (TRAP).
• Otherwise, you won’t be able to return to caller.

Library Routines

Vendor may provide object files containing useful subroutines
• don’t want to provide source code -- intellectual property
• assembler/linker must support EXTERNAL symbols (or starting address of routine must be supplied to user)

        ...
        .EXTERNAL SQRT
        ...
        LD    R2, SQAddr    ; load SQRT addr
        JSRR  R2
        ...
SQAddr  .FILL SQRT

Using JSRR, because we don’t know whether SQRT is within 1024 instructions.

Chapter 10 And, Finally... The Stack

Stack: An Abstract Data Type

An important abstraction that you will encounter in many applications.

We will describe three uses:

Interrupt-Driven I/O
• The rest of the story…

Evaluating arithmetic expressions
• Store intermediate results on stack instead of in registers

Data type conversion
• 2’s comp binary to ASCII strings

Stacks

A LIFO (last-in first-out) storage structure.
• The first thing you put in is the last thing you take out.
• The last thing you put in is the first thing you take out.

This means of access is what defines a stack, not the specific implementation.

Two main operations:
    PUSH: add an item to the stack
    POP: remove an item from the stack

A Physical Stack

Coin rest in the arm of an automobile

First quarter out is the last quarter in.

[Figure: coin holder in four states. Initial State: empty. After One Push: 1995. After Three More Pushes: 1996, 1998, 1982, 1995 (top first). After One Pop: 1998, 1982, 1995 (top first).]

10-295

A Hardware Implementation

Data items move between registers

[Figure: register stack, TOP first, with an Empty flag for each state]
Initial State: Empty: Yes; all registers unused
After One Push: Empty: No; TOP = #18
After Three More Pushes: Empty: No; TOP = #12, then #5, #31, #18
After Two Pops: Empty: No; TOP = #31, then #18

A Software Implementation

Data items don't move in memory, just our idea about where the TOP of the stack is.

[Figure: stack in memory; R6 marks TOP in each state]
Initial State: stack empty, R6 = x4000
After One Push: #18 stored, R6 = x3FFF
After Three More Pushes: #12, #5, #31, #18 stored (TOP first), R6 = x3FFC
After Two Pops: #12 and #5 remain in memory, but TOP is now #31, R6 = x3FFE

By convention, R6 holds the Top of Stack (TOS) pointer.

Basic Push and Pop Code

For our implementation, stack grows downward (when item added, TOS moves closer to 0)

Push
        ADD R6, R6, #-1   ; decrement stack ptr
        STR R0, R6, #0    ; store data (R0)

Pop
        LDR R0, R6, #0    ; load data from TOS
        ADD R6, R6, #1    ; increment stack ptr

Pop with Underflow Detection

If we try to pop too many items off the stack, an underflow condition occurs.
• Check for underflow by checking TOS before removing data.
• Return status code in R5 (0 for success, 1 for underflow)

POP     LD  R1, EMPTY    ; EMPTY = -x4000
        ADD R2, R6, R1   ; Compare stack pointer
        BRz FAIL         ; with x4000 (empty stack)
        LDR R0, R6, #0
        ADD R6, R6, #1
        AND R5, R5, #0   ; SUCCESS: R5 = 0
        RET
FAIL    AND R5, R5, #0   ; FAIL: R5 = 1
        ADD R5, R5, #1
        RET
EMPTY   .FILL xC000

Push with Overflow Detection

If we try to push too many items onto the stack, an overflow condition occurs.
• Check for overflow by checking TOS before adding data.
• Return status code in R5 (0 for success, 1 for overflow)

PUSH    LD  R1, MAX      ; MAX = -x3FFB
        ADD R2, R6, R1   ; Compare stack pointer
        BRz FAIL         ; with x3FFB (full stack)
        ADD R6, R6, #-1
        STR R0, R6, #0
        AND R5, R5, #0   ; SUCCESS: R5 = 0
        RET
FAIL    AND R5, R5, #0   ; FAIL: R5 = 1
        ADD R5, R5, #1
        RET
MAX     .FILL xC005
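The PUSH and POP routines can be modelled in a few lines of Python to check the boundary cases. This is only a sketch: the class name `Lc3Stack` and the dictionary used as memory are illustrative assumptions, while the two limits come from the EMPTY and MAX constants in the routines (empty when R6 = x4000, full when R6 = x3FFB, so five items fit).

```python
# Hypothetical Python model of the LC-3 PUSH/POP routines (names are illustrative).
STACK_EMPTY = 0x4000  # R6 value when the stack is empty (EMPTY = -x4000)
STACK_FULL = 0x3FFB   # R6 value when the stack is full (MAX = -x3FFB)


class Lc3Stack:
    def __init__(self):
        self.mem = {}            # sparse memory: address -> value
        self.r6 = STACK_EMPTY    # top-of-stack pointer

    def push(self, value):
        """Return the R5 status code: 0 for success, 1 for overflow."""
        if self.r6 == STACK_FULL:
            return 1
        self.r6 -= 1                 # ADD R6, R6, #-1
        self.mem[self.r6] = value    # STR R0, R6, #0
        return 0

    def pop(self):
        """Return (status, value); status 1 means underflow."""
        if self.r6 == STACK_EMPTY:
            return 1, None
        value = self.mem[self.r6]    # LDR R0, R6, #0
        self.r6 += 1                 # ADD R6, R6, #1
        return 0, value
```

Note that a pop only moves R6; the old value stays in memory, as in the software implementation slide.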


Stacks Lecture 21, 29Apr10:

Stack Implementation Details

In example, the first location (largest address) is never used
• Push: Decrement SP, then store
• Pop: Load using SP, then increment

Notice that SP always points to top element on the stack
• Unless it is empty

Alternative implementation
• Push: Store using SP, then decrement SP
• Pop: Increment SP, then load

In first scheme, SP points to the top element of the stack
• But points to invalid address when stack is empty
• That location is never used!

In second scheme, SP points to first free location in stack
• But points to invalid address when the stack is full

Interrupt-Driven I/O (Part 2)

Interrupts were introduced in Chapter 8.
1. External device signals need to be serviced.
2. Processor saves state and starts service routine.
3. When finished, processor restores state and resumes program.

Chapter 8 didn’t explain how (2) and (3) occur, because it involves a stack.

Now, we’re ready…

Interrupt is an unscripted subroutine call, triggered by an external event.


Processor State

What state is needed to completely capture the state of a running process?

Processor Status Register
• Privilege [15], Priority Level [10:8], Condition Codes [2:0]

Program Counter
• Pointer to next instruction to be executed.

Registers
• All temporary state of the process that’s not stored in memory.

Where to Save Processor State?

Can’t use registers.
• Programmer doesn’t know when interrupt might occur, so she can’t prepare by saving critical registers.
• When resuming, need to restore state exactly as it was.

Memory allocated by service routine?
• Must save state before invoking routine, so we wouldn’t know where.
• Also, interrupts may be nested – that is, an interrupt service routine might also get interrupted!

Use a stack!
• Location of stack “hard-wired”.
• Push state to save, pop to restore.

Supervisor Stack

A special region of memory used as the stack for interrupt service routines.
• Initial Supervisor Stack Pointer (SSP) stored in Saved.SSP.
• Another register for storing User Stack Pointer (USP): Saved.USP.

Want to use R6 as stack pointer.
• So that our PUSH/POP routines still work.

When switching from User mode to Supervisor mode (as result of interrupt), save R6 to Saved.USP.

Invoking the Service Routine – The Details

1. If Priv = 1 (user), Saved.USP = R6, then R6 = Saved.SSP.
2. Push PSR and PC to Supervisor Stack.
3. Set PSR[15] = 0 (supervisor mode).
4. Set PSR[10:8] = priority of interrupt being serviced.
5. Set PSR[2:0] = 0.
6. Set MAR = x01vv, where vv = 8-bit interrupt vector provided by interrupting device (e.g., keyboard = x80).
7. Load memory location (M[x01vv]) into MDR.
8. Set PC = MDR; now first instruction of ISR will be fetched.

Note: This all happens between the STORE RESULT of the last user instruction and the FETCH of the first ISR instruction.

Returning from Interrupt

Special instruction – RTI – that restores state.

1. Pop PC from supervisor stack. (PC = M[R6]; R6 = R6 + 1)
2. Pop PSR from supervisor stack. (PSR = M[R6]; R6 = R6 + 1)
3. If PSR[15] = 1, R6 = Saved.USP. (If going back to user mode, need to restore User Stack Pointer.)

RTI is a privileged instruction.
• Can only be executed in Supervisor Mode.
• If executed in User Mode, causes an exception. (More about that later.)
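The interrupt entry sequence and RTI can be traced with a small Python model. This is a sketch under stated assumptions, not the LC-3 microcode: the class `Cpu`, its field names, the initial stack-pointer values and the dictionary used as memory are all hypothetical. Only the order of the steps is taken from the two lists above.

```python
# Hypothetical model of LC-3 interrupt entry and RTI (field names are illustrative).
class Cpu:
    def __init__(self, saved_ssp=0x3000, user_sp=0xF000):
        self.mem = {}               # sparse memory: address -> value
        self.r6 = user_sp           # currently the user stack pointer
        self.pc = 0
        self.psr = 0x8002           # PSR[15] = 1 (user mode), Z condition code set
        self.saved_ssp = saved_ssp  # Saved.SSP
        self.saved_usp = 0          # Saved.USP

    def _push(self, value):
        self.r6 -= 1
        self.mem[self.r6] = value

    def _pop(self):
        value = self.mem[self.r6]
        self.r6 += 1
        return value

    def interrupt(self, vector, priority):
        if self.psr & 0x8000:                 # 1. coming from user mode
            self.saved_usp = self.r6
            self.r6 = self.saved_ssp
        self._push(self.psr)                  # 2. push PSR, then PC
        self._push(self.pc)
        self.psr &= 0x7FFF                    # 3. supervisor mode
        self.psr = (self.psr & 0xF8FF) | (priority << 8)   # 4. new priority
        self.psr &= 0xFFF8                    # 5. clear condition codes
        self.pc = self.mem[0x0100 | vector]   # 6-8. fetch ISR address from M[x01vv]

    def rti(self):
        self.pc = self._pop()                 # 1. pop PC
        self.psr = self._pop()                # 2. pop PSR
        if self.psr & 0x8000:                 # 3. returning to user mode
            self.r6 = self.saved_usp
```

Because PSR is pushed before PC, RTI has to pop PC first and PSR second, which is exactly the order given on the RTI slide.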

Example (1)

Executing ADD at location x3006 when Device B interrupts.

[Figure: Program A with ADD at x3006; PC = x3006; supervisor stack empty, marked by Saved.SSP.]

Example (2)

Saved.USP = R6. R6 = Saved.SSP. Push PSR and PC onto stack, then transfer to Device B service routine (at x6200).

[Figure: Program A with ADD at x3006; ISR for Device B from x6200 to the RTI at x6210; PC = x6200; supervisor stack, top first, with R6 at the top: x3007, PSR for A.]

Example (3)

Executing AND at x6202 when Device C interrupts.

[Figure: Program A with ADD at x3006; ISR for Device B from x6200 to the RTI at x6210, with AND at x6202; PC = x6203; supervisor stack, top first, with R6 at the top: x3007, PSR for A.]

Example (4)

Push PSR and PC onto stack, then transfer to Device C service routine (at x6300).

[Figure: Program A with ADD at x3006; ISR for Device B from x6200 to the RTI at x6210, with AND at x6202; ISR for Device C from x6300 to the RTI at x6315; PC = x6300; supervisor stack, top first, with R6 at the top: x6203, PSR for B, x3007, PSR for A.]

Example (5)

Execute RTI at x6315; pop PC and PSR from stack.

[Figure: same programs as before; PC = x6203; R6 points at x3007 again, so the stack holds x3007 and PSR for A. The popped x6203 and PSR for B are still in memory but no longer on the stack.]

Example (6)

Execute RTI at x6210; pop PC and PSR from stack. Restore R6. Continue Program A as if nothing happened.

[Figure: same programs as before; PC = x3007; supervisor stack empty again, marked by Saved.SSP. The popped values x3007, PSR for A, x6203 and PSR for B are still in memory.]


Real Processors: Alpha, MIPS & the X86

Lecture 22, 3May10:


What’s So Great About the ALPHA?

1. It’s real (Well, it once was)
2. It’s the best/fastest/cleanest (Really!)
3. “A design to last 25 years” (Uhhh…)

Ideas Same in LC-3 & MIPS & Alpha

• von Neumann computer
  – Implemented with finite-state machines
  – Performs same basic fetch/execute cycle
• Fixed-length instructions (32-bits) [see X-86]
• General-purpose registers
  – 2^n registers
  – Load/Store architecture [see X-86]
• JSR/RET
• TRAP (CALLSYS [Alpha], SYSCALL [MIPS])

History of Digital Equipment Corporation (DEC)

• Founded in 1957
• PDP-8 (1964) 12-bit computer
• PDP-11 (1970) 16-bit computer
• VAX (1976) 32-bit computer
• Alpha
  – EV4: 1992; 192MHz
  – EV5: 1995; 333MHz
  – EV6: 1998; 450MHz (eventually 1.25GHz)
  – EV7: 2003; 1.15GHz
• DEC bought by Compaq (later bought by HP): 1998
• Alpha IP sold to Intel: 2001
• Intel phased out Alpha in favour of Itanium: 2004

Beyond a Byte

• The Alpha is a 64-bit computer
• Registers (32) are 64 bits wide
• Instructions are 32 bits
• Addresses can be up to 55 bits (2^54 ≈ 18 quadrillion bytes of memory)
• Operate instructions exist for
  – Bytes (8 bits)
  – Words (16 bits)
  – Longwords (32 bits)
  – Quadwords (64 bits)
• Load/store instructions exist for different sized operands
  – lb/stb (byte)
  – lw/sw (word)
  – ll/stl (longword)
  – lq/stq (quadword)
  – Smaller operands go into least significant bits of register
• Floating point: 64 more registers; more operations

Instruction Format

The MIPS Architecture


References

Good starting point for the MIPS architecture: http://en.wikipedia.org/wiki/MIPS_architecture
• Very nice summary of architecture
• Lots of pointers to other material

Read (more) about the MIPS architecture
• MIPS Instruction reference: http://www.mrc.uidaho.edu/mrc/people/jff/digital/MIPSir.html
• Student paper summarizing MIPS Instruction Set: http://www.xs4all.nl/~vhouten/mipsel/r3000-isa.html
• Lots of MIPS documentation: http://www.langens.eu/tim/ea/mips_en.php
• Tutorial on MIPS Assembly Language: http://chortle.ccsu.edu/AssemblyTutorial/TutorialContents.html
• Patterson&Hennessy (CS 313 textbook) Appendix A: SPIM, a MIPS simulator (pdf): http://www.cs.wisc.edu/~larus/HP_AppA.pdf

The LC-3 Instructions


The Alpha Instructions


The Alpha Instructions


The MIPS Computer

[Figure: block diagram of the MIPS computer. Labels: CPU, PC, IR, CTRL, Memory, MAR, MDR, Read, Write, Input, Output, I/O, OP, ALU, RF, Hi/Lo, Flt.Pt. Regs]

Registers

• 32 general registers
  – $0 - $31; also names
  – $0 is special: when read, gives zero; writing has no effect
  – $31 sometimes implicit in instruction
• 16/32 floating-point registers
  – $fgr0-$fgr31 32-bit floating-point registers
  – Can be configured as 16 64-bit registers
• Special-purpose registers
  – Hi/Lo (multiplication/division)
  – Floating-point control/status registers

Pseudoinstructions

Some “instructions” are not implemented in the hardware, but are synthesised from two or more real instructions. These instructions are recognized by the assembler and automatically synthesised.

For purposes of this class, we will generally ignore the distinction.


Categories of Instructions

1. Arithmetic/Logical [LC-3: Operate Instructions]
   a. Arithmetic
   b. Logical
   c. Shift
   d. Compare [LC-3 equivalent?]
2. Control
   a. Branch on condition
   b. Jump
   c. Special
3. Data transfer
   a. Load
   b. Store
   c. Move (copy)
   d. Load address

1a. Arithmetic Instructions

ADD, SUB, MUL, DIV, REM
Two sources, one destination (can be common)
• Form: add D,S1,S2    D ← S1 + S2
  – D, S1 are registers.
  – S2 can be a register or an immediate, i.e., value contained in the instruction.
• Multiple operand sizes (8, 16, 32, 64 bits)
• Signed and unsigned arithmetic
  – add (signed)
  – addu (unsigned)
  – Difference: unsigned never signals overflow (the result simply wraps around)
• Overflow
  – Addition & subtraction: only one bit
  – Multiplication: none because result is twice as big
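The overflow rules can be checked numerically. The sketch below models a 32-bit add in Python; the function name `add32` and the choice to pass operands as unsigned 32-bit patterns are assumptions made for illustration. Signed overflow is detected with the usual sign test: both operands have the same sign and the result has the other one.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def add32(a, b):
    """Add two 32-bit patterns; return (wrapped result, signed overflow flag)."""
    result = (a + b) & MASK32
    # Overflow when the result's sign differs from the sign of both operands.
    overflow = ((a ^ result) & (b ^ result) & SIGN32) != 0
    return result, overflow


def to_signed32(x):
    """Reinterpret a 32-bit pattern as a two's-complement integer."""
    return x - (1 << 32) if x & SIGN32 else x
```

For example, `add32(0x7FFFFFFF, 1)` wraps to the most negative value and reports overflow, whereas `add32(0xFFFFFFFF, 1)` is simply -1 + 1 = 0 when read as signed. A 32-bit by 32-bit product always fits in 64 bits, which is why multiplication needs no overflow indication.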

1b. Logical Instructions

Instructions: AND, OR, XOR, NOR, NOT
Two sources (one for NOT), one destination
Form: and D,S1,S2    D ← S1 AND S2
• D, S1 are registers.
• S2 can be a register or an immediate, i.e., value contained in the instruction.

Multiple operand sizes (8, 16, 32, 64 bits)
Overflow: none

LC-3 Logical Operations

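Possible Functions of A, B (one column per function, numbered 0 to 15)

A B |  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
0 0 |  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1
0 1 |  0  0  1  1  0  0  1  1  0  0  1  1  0  0  1  1
1 0 |  0  0  0  0  1  1  1  1  0  0  0  0  1  1  1  1
1 1 |  0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1

LC-3 provides AND (column 8) and NOT (column 3 is NOT A, column 5 is NOT B).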

MIPS Logical Operations

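Possible Functions of A, B (one column per function, numbered 0 to 15)

A B |  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
0 0 |  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1
0 1 |  0  0  1  1  0  0  1  1  0  0  1  1  0  0  1  1
1 0 |  0  0  0  0  1  1  1  1  0  0  0  0  1  1  1  1
1 1 |  0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1

MIPS provides AND (column 8), NOT (column 3 is NOT A, column 5 is NOT B), OR (column 14), NOR (column 1) and XOR (column 6).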
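The table of possible functions can be indexed mechanically: reading a column from the bottom row up gives a 4-bit number. The Python sketch below assumes that numbering (row A, B contributes bit 2A + B); the helper names `truth_column` and `lc3_or` are illustrative. It also shows that OR can be synthesised from the two operations the LC-3 does have, using De Morgan's law.

```python
def truth_column(f):
    """Return the column number of f: bit (2*A + B) of the result is f(A, B)."""
    return sum(f(a, b) << (2 * a + b) for a in (0, 1) for b in (0, 1))


def lc3_or(a, b):
    """OR built only from AND and NOT: A OR B = NOT(NOT A AND NOT B)."""
    return 1 - ((1 - a) & (1 - b))
```

With this numbering AND is column 8, OR is 14, XOR is 6 and NOR is 1, so the additional MIPS logical instructions save instructions rather than add new capability.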


1c. Shift Operations

Form: sll D,S,AMT
AMT is a count, equivalent to AMT shifts by 1 place.
There are three types of Shift Operations
• logical (srl, sll)
• arithmetic (sra, sll)
• rotate (rr)

Shift Operations

Right Rotate Operation:
• No information lost
• For N-bit word, rotate right N positions has no effect
• Rotate right i positions is same as rotate left N – i positions
• Not implemented in MIPS (why not?)

[Figure: bits move from msb toward lsb; the bit leaving lsb re-enters at msb.]

Logical Shift Operations

Right Logical Shift Operation:
• MIPS instruction: srl
• Java equivalent: >>>

[Figure: 0 enters at msb; bits move toward lsb; the bit shifted out of lsb is discarded.]

Logical Shift Operations

Left Logical Shift Operation:
• MIPS instruction: sll
• Java equivalent: <<

[Figure: 0 enters at lsb; bits move toward msb; the bit shifted out of msb is discarded.]

Arithmetic Shift Operations

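Right Arithmetic Shift Operation
• Signed integer division by power of 2
  – Rounds down (toward negative infinity)
• MIPS instruction: sra
• Java equivalent: >>
• Same as integer division by power of 2? Not for negative values: Java's / truncates toward zero, so -7 / 2 is -3 but -7 >> 1 is -4.

[Figure: the sign bit is copied into msb; bits move toward lsb; the bit shifted out of lsb is discarded.]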

Arithmetic Shift Operations

Left Arithmetic Shift Operation
• Signed integer multiplication by power of 2
  – Overflow if MSB changes
• MIPS instruction: sll (no sla)
• Java equivalent: ‘* 2^i’
• Same as logical left shift!

[Figure: 0 enters at lsb; bits move toward msb; is the bit shifted out of msb discarded?]
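The shift variants can be compared side by side on 32-bit values. The functions below are a Python sketch named after the MIPS mnemonics; `ror` and `rol` are hypothetical helpers, since MIPS as described here has no rotate instruction. Values are passed and returned as unsigned 32-bit patterns.

```python
MASK32 = 0xFFFFFFFF


def srl(x, amt):
    """Logical right shift: zeros enter at the msb."""
    return (x & MASK32) >> amt


def sll(x, amt):
    """Left shift (logical and arithmetic are the same operation)."""
    return (x << amt) & MASK32


def sra(x, amt):
    """Arithmetic right shift: the sign bit is replicated."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32          # reinterpret the pattern as a negative integer
    return (x >> amt) & MASK32  # Python's >> on negative ints rounds down


def ror(x, amt):
    """Rotate right: bits leaving the lsb re-enter at the msb."""
    amt %= 32
    x &= MASK32
    return ((x >> amt) | (x << (32 - amt))) & MASK32


def rol(x, amt):
    """Rotate left by amt is rotate right by 32 - amt."""
    return ror(x, (32 - amt) % 32)
```

For the pattern 0xFFFFFFF9 (-7 as a signed value), `sra` by 1 gives the pattern for -4, while truncating division would give -3 and `srl` gives a large positive number.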


2a. Control Instructions

Basic instruction for choosing alternate instruction path:
• Branch on condition: bne R1,R2,L1
  – True if R1,R2 are unequal
• Possible tests
  – beq : R1 = R2 ?
  – bne : R1 ≠ R2 ?
  – bgt : R1 > R2 ?
  – blt : R1 < R2 ?
  – bge : R1 ≥ R2 ?
  – ble : R1 ≤ R2 ?
  – b : Unconditional

Other MIPS Control Instructions

2b. Jump

j        Unconditional, large/unlimited range
jal      Unconditional, but save address for return

2c. Special

syscall  Invoke operating system
break    Invoke operating system
rfe      Return from exception


LC-3: 4 Load Instructions

Really 4 addressing modes:
• LEA Rd, Label        ; Rd = PC + SEXT(PCoffset9)
• LD  Rd, Label        ; Rd = mem[PC + SEXT(PCoffset9)]
• LDI Rd, Label        ; Rd = mem[mem[PC + SEXT(PCoffset9)]]
• LDR Rd, Rb, offset6  ; Rd = mem[Rb + SEXT(offset6)]
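The four LC-3 addressing modes differ only in how many memory lookups follow the address calculation. The Python sketch below mirrors the four formulas; the function names, the dictionary used as memory and the convention that `pc` is the already incremented PC are assumptions made for illustration.

```python
def sext(value, bits):
    """Sign-extend the low `bits` bits of value to a Python integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def lea(pc, off9):
    """LEA: the address itself, no memory access."""
    return (pc + sext(off9, 9)) & 0xFFFF


def ld(mem, pc, off9):
    """LD: one memory access, PC-relative."""
    return mem[(pc + sext(off9, 9)) & 0xFFFF]


def ldi(mem, pc, off9):
    """LDI: two memory accesses, the first fetches the address."""
    return mem[mem[(pc + sext(off9, 9)) & 0xFFFF]]


def ldr(mem, base, off6):
    """LDR: one memory access, base register plus offset."""
    return mem[(base + sext(off6, 6)) & 0xFFFF]
```

MIPS keeps only the last form (base plus offset), with a 16-bit offset instead of 6 bits.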

MIPS: 1 Load Instruction (others are synthesized)

• LW Rd, offset16(Rb) ; Rd = mem[Rb + SEXT(offset16)]

Load Reg, Disp(Base)

Effective address: (Base) + Displacement
• Base specifies the content of a register
• Displacement is a 16-bit signed constant, sign-extended
• Displacement defines position relative to Base

Instruction fields: lw (6 bits) | Base (5 bits) | Dest (5 bits) | Displacement (16 bits)


Real Processors: Alpha, MIPS & the X86

Lecture 23, 5May10:

MIPS: Variations of Load: Byte, Halfword, Word, Longword

• LB Rd, offset16(Rb) ; Rd = mem[Rb + SEXT(offset16)] (byte)
• LH Rd, offset16(Rb) ; Rd = mem[Rb + SEXT(offset16)] (halfword)
• LW Rd, offset16(Rb) ; Rd = mem[Rb + SEXT(offset16)] (word)
• LL Rd, offset16(Rb) ; Rd = mem[Rb + SEXT(offset16)] (longword)

• LUI Rd, constant ; Rd = constant<<16

Insert Constant

LC-3: Can insert 5-bit constant with AND immediate

MIPS: Can insert 16-bit constant with AND or OR immediate
• But MIPS has 32/64 bits
• How to insert constant in upper bits?

Load Upper Immediate: LUI
• Places the 16-bit immediate in the upper 16 bits of the register and clears the lower 16 bits

Paired with a second instruction (such as OR immediate), LUI builds an arbitrary 32-bit constant or load address in two instructions (but not a 64-bit one)
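The two-instruction idiom is easy to verify with a small model. In this Python sketch `lui` and `ori` imitate the MIPS instructions of the same name on 32-bit values, and `load_constant32` is a hypothetical helper that splits a constant into its two 16-bit halves. OR immediate is used for the low half because it zero-extends its immediate, so the upper half set by LUI is left untouched.

```python
def lui(imm16):
    """Load Upper Immediate: immediate goes to bits 31..16, low 16 bits are cleared."""
    return (imm16 & 0xFFFF) << 16


def ori(reg, imm16):
    """OR immediate: the 16-bit immediate is zero-extended before the OR."""
    return reg | (imm16 & 0xFFFF)


def load_constant32(value):
    """Build a 32-bit constant the way a lui/ori pair would."""
    upper = (value >> 16) & 0xFFFF
    lower = value & 0xFFFF
    return ori(lui(upper), lower)
```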

3. Memory Instructions

3a. Load byte, half-word, word, longword

lb  … Load sign-extended byte
lbu … Load zero-extended byte
lh  … Load sign-extended two bytes (halfword)
…

3b. Store byte, half-word, word, longword

sb … Store byte
sh … Store halfword
sw … Store word
…

Views of Memory


MIPS/Alpha/X86 View of Memory

[Figure: three views of the same memory, each box 8 bits wide]
Byte addresses: 000, 001, 002, 003, ... 011, …
Halfword addresses: 000, 002, 004, 006, ... 022, …
Word addresses: 000, 004, 008, 00c, ... 044, …

Also true for longwords (64 bits)

Size of box: 8 bits

Byte Order

Little Endian

[Figure: the same three views of memory (halfword, byte and word addresses). Within a word the bytes are labelled 11, 10, 01, 00 and the halfwords 1, 0, so the lowest-addressed byte is the least significant.]

Byte Order

Big Endian

[Figure: the same three views of memory (halfword, byte and word addresses). Within a word the bytes are labelled 00, 01, 10, 11 and the halfwords 0, 1, so the lowest-addressed byte is the most significant.]
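The two byte orders can be seen directly by laying out one 32-bit word in memory. The Python sketch below uses the built-in `int.to_bytes` and `int.from_bytes`; the helper names and the example value are illustrative only.

```python
def word_to_memory(value, byteorder):
    """Return the 4 bytes of a 32-bit word in increasing address order."""
    return list(value.to_bytes(4, byteorder))


def memory_to_word(data, byteorder):
    """Reassemble a word from bytes given in increasing address order."""
    return int.from_bytes(bytes(data), byteorder)
```

For the word 0x11223344, the little-endian layout starts with 0x44 at the lowest address and the big-endian layout starts with 0x11. Reading the bytes back with the wrong order yields 0x44332211.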

Views of Memory

An array of longwords (little-endian)

[Figure: memory as an array of longwords at addresses 000, 008, 010, 018, ... 088, …, with an aligned doubleword marked at 008 and an unaligned doubleword marked at 01f.]

Unaligned bytes

[Figure: three 64-bit quadwords with bit positions 63, 55, 47, 39, 31, 23, 15, 7 and 0 marked at the byte boundaries.]
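Alignment is a simple property of the address. The Python sketch below tests it and lists which aligned 8-byte quadwords an access overlaps; both function names are hypothetical, and the 8-byte quadword size follows the 64-bit longwords of the preceding slides.

```python
def is_aligned(addr, size):
    """An access is aligned when its address is a multiple of its size."""
    return addr % size == 0


def quadwords_touched(addr, size=8):
    """Return the start addresses of the aligned 8-byte quadwords an access overlaps."""
    first = addr - addr % 8
    last_byte = addr + size - 1
    last = last_byte - last_byte % 8
    return list(range(first, last + 8, 8))
```

An 8-byte access at 008 stays inside one quadword, while the same access at 01f straddles the quadwords at 018 and 020, which is why unaligned data needs extra work.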

Instruction Format

R-Type: Operate (3 registers)
  0 0 0 0 0 0 | Src Reg 1 | Src Reg 2 | Dest Reg | 0 0 0 0 0 | Operation

I-Type: Operate (2 registers + Immediate)
  Op code | Src Reg | Dest Reg | Immediate

I-Type: Branch on condition
  Op code | Reg 1 | Reg 2 | Offset

J-Type: Jump
  Op code | Target

I-Type: Load/Store
  Op code | Base | Dest Reg | Offset
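The three formats can be turned into bit patterns with a few shifts. The Python sketch below assumes the standard MIPS32 field widths (6-bit op code, 5-bit register fields, 5-bit shift amount, 6-bit operation, 16-bit immediate, 26-bit target); the function names are illustrative, and the op code and function values in the usage note are the standard MIPS32 encodings for add, lw and j.

```python
def encode_r(rs, rt, rd, funct, shamt=0):
    """R-type: op code 0 | rs | rt | rd | shamt | funct."""
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(opcode, rs, rt, imm16):
    """I-type: op code | rs | rt | 16-bit immediate, offset or displacement."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm16 & 0xFFFF)


def encode_j(opcode, target26):
    """J-type: op code | 26-bit target."""
    return (opcode << 26) | (target26 & 0x3FFFFFF)
```

As a usage example, `add $3,$1,$2` is `encode_r(1, 2, 3, 0x20)` and `lw $2,-4($1)` is `encode_i(0x23, 1, 2, -4)`; the negative displacement is stored as its 16-bit two's-complement pattern.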

Interesting ideas in MIPS/Alpha not in the LC-3

• Shift instruction
• Subtract, multiply, divide/mod! Also Square root (floating point)
• Logical operations
• Operands of size other than 16 bits
• Operands of size other than register
• Alignment issues (big- vs. little-endian)
• Branch test using register value or compare
• No condition code
• “Zero” register
• Fewer addressing modes (!)
• Clean separation between instructions and data
• Virtual memory


From LC-3 to x86


From LC-3 to x86

•  Appendix B from Introduction to Computing Systems: from bits & gates to C and beyond, by Yale Patt and Sanjay Patel, 2nd Edition (2004), McGraw-Hill

•  The material in Appendix B.1, pp. 547-557 will not be included in the test, but will be covered on the final exam.


X-86 History

• 1974: Intel i8080 (8 bits)
• 1979: 8086/8088 (16 bits)
• 1982: 80286 (16 bits)
• 1985: 80386 (32 bits)
• 1989: 80486 (32 bits)
• 1992: Pentium (32 bits)
• 1995: PentiumPro (32 bits)
• 1997: Pentium II (32 bits)
• 1999: Pentium III (32 bits)
• 2001: Pentium 4 (32 bits)
• 2006: Xeon Woodcrest (64-bit)
• 2006: Dual-core Xeon (32-bit)
• 2006: Quad-core Clovertown (32-bit)
• 2008: Nehalem (64-bit, 4 cores)

Data Types

• Integer
  – 2’s complement
  – Unsigned
• BCD Integer (string of 4-bit digits stored in bytes)
• Packed BCD Integer (string of 4-bit digits)
• Floating point (IEEE standard)
• Bit string
• MMX

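[Figure: x86 data type layouts]
Integer: 8, 16 and 32 bits, with sign bit S in the top bit (bit 7, 15 or 31)
Unsigned Integer: 8, 16 and 32 bits (bits 7..0, 15..0, 31..0)
BCD Integer: one 4-bit digit per byte (digit 0, digit 1, digit 2, ... digit N)
Packed BCD: two 4-bit digits per byte (digit 0, digit 1, digit 2, ... digit N)
Floating Point: sign S, exponent and fraction; 32-bit (S at bit 31, fraction in bits 22..0), 64-bit (S at bit 63, fraction in bits 51..0) and 80-bit (S at bit 79, fraction in bits 63..0)
Bit String: runs from bit 0 at address X to the last bit, through addresses X + 1, X + 2, X + 3, X + 4, …; length of bit string
MMX Data Type: 64 bits, as four 16-bit elements (element 3 to element 0) or eight 8-bit elements (7 to 0)

Also 64-bit!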

Opcodes

• Several hundred
• Usually one byte; sometimes two bytes
• Variable-length instructions
• Many formats
• Two operands: one may come from memory
• Many, inconsistent addressing modes

Instruction Fields

• Prefix
  – Indicates some form of modification of the instruction
• Opcode
• Mode
  – Indicates addressing mode(s) to follow
• SIB (scale, index, base)
  – Optional
  – Indicates memory addressing information
• Displacement
  – Optional
  – Indicates offset for memory address
• Immediate
  – Optional
  – Contains value

Memory

Two models
• Flat address space: 2^n bytes
• Segmented (2D) space
  – Provided effective method for protection
  – Still supported, rarely used