LECTURE NOTE ON

EEE 556 MICROCOMPUTER HARDWARE AND

SOFTWARE TECHNOLOGY

Course outline

a. Computer type and design
b. Bus organisation of a computer
c. Memory/memory hierarchy
d. Register
e. Cache
f. Hard disk
g. Main memory
h. Architectures
i. Instruction execution/addressing modes
j. Assembly language programming

EEE 556 lecture note

COMPUTER TYPE AND DESIGN

Introduction

A Computer is a programmable machine. The two principal characteristics of a computer are:

i. It responds to a specific set of instructions in a well-defined manner.
ii. It can execute a pre-recorded list of instructions (a program).

Modern computers are electronic and digital. The actual machinery (wires, transistors, and circuits) is called hardware. The instructions and data are called software.

All general-purpose computers require the following hardware components:

Memory: Enables a computer to store, at least temporarily, data and programs.

Mass storage device: Allows a computer to permanently retain large amounts of data. Common mass storage devices include disk drives and tape drives.

Input device: Usually a keyboard and a mouse. The input device is the conduit through which data and instructions enter a computer.

Output device: A display screen, printer, or other device that lets you see what the computer has accomplished is the output device.

Central processing unit (CPU): The CPU is referred to as the heart of the computer system. This is the component that actually executes instructions.

In addition to these components, many others make it possible for the basic components to work together efficiently. For example, every computer requires a bus that transmits data from one part of the computer to another.

A better way to define a computer is by the functions it performs. A computer can be defined as a system consisting of a set of four interrelated components: input, storage, process and output. The input unit accepts data and instructions from the user. The storage unit stores them. The process unit actually manipulates the data according to the instructions and generates results; the results are available from the output unit.

Inside the system unit box we can find the following major hardware parts of a PC:

1. Motherboard with the processor (also called the CPU or microprocessor), RAM (Random Access Memory) chips, an EPROM (Erasable Programmable Read-Only Memory) or flash memory chip, chipset and expansion slots
2. Daughter boards, plugged into expansion slots
3. Switch mode power supply (SMPS)
4. Disk drives (floppy drives, hard drives, CD-ROM drives, etc.)
5. Speakers, cables and connections

The various input/output devices and storage devices of PCs are usually referred to as peripheral devices. Thus a PC consists of six functional units:

a. Control unit
b. Arithmetic and logic unit (ALU)
c. Main memory unit
d. Input unit
e. Output unit
f. Secondary storage unit

INPUT

All input devices must provide the system unit with data in suitable binary format that can be easily interpreted by the processor and can be stored in the main memory. The functions performed by an input unit can be summarized as follows

a. An input device accepts the sequence of instructions and data from the outside world.

b. It converts these instructions and data into a format acceptable to the processor.

c. It sends the converted instructions and data to the processor for further execution.

MAIN MEMORY

The main memory consists of ROM and RAM. The latter is treated as the CPU's workbench for user programs: whenever the user issues an execution command for a program or application, a copy of the program is loaded into the main memory from the secondary storage disk or tape. This is because the CPU cannot directly access instructions and data from a disk; it can only access the main memory and cache memory. Naturally, if main memory capacity is too limited, the performance of a system suffers. As the memory capacity of a system is increased, system performance usually improves.

The memory hierarchy

• How much?

If the capacity is there, applications will be developed to use it.

• How fast?

To achieve performance, the memory must be able to keep up with the processor.

• How expensive?

For a practical system, the cost of memory must be reasonable in relationship to other components

There is a trade-off among the three key characteristics of memory: cost, capacity, and access time.

• Faster access time – greater cost per bit

• Greater capacity – smaller cost per bit

• Greater capacity – slower access time

As one goes down the hierarchy:

a. Decreasing cost per bit
b. Increasing capacity
c. Increasing access time
d. Decreasing frequency of access of the memory by the processor

Thus smaller, more expensive, faster memories are supplemented by larger, cheaper, slower memories.

CENTRAL PROCESSING UNIT

The CPU regulates the operation of the complete machine through its two parts: the control unit and the arithmetic and logic unit.

a. Control unit: it coordinates the operation of the entire system by selecting, interpreting and monitoring the execution of program instructions. It does not perform any actual processing of data. The control unit fetches instructions one by one from the main memory, stores each temporarily in an instruction register, interprets it and accordingly generates the various control signals.

b. Arithmetic and logic unit (ALU): it is the place where the actual processing of data takes place. All arithmetic and logical operations are carried out in the ALU.

Output unit: it provides users with the desired information and the results of computation. The functions of the output unit are:
i. It accepts the results produced by the processor in binary form.
ii. It converts these coded results into human-readable form.
iii. It sends the converted results to the outside world.

External or secondary storage unit: these provide large non-volatile storage capacity to a computer system, but they are relatively slow compared to the main memory unit, e.g. hard disks, floppy diskettes, CD-Rs.

OVERVIEW OF THE COMPUTER SYSTEM

Motherboards

The motherboard provides the data transfer connection between the following:

i. CPU (main processor)
ii. RAM (main memory)
iii. AGP and PCI extension cards
iv. Disk drives and external peripherals

Chipset

The most important aspect of a motherboard is its core chipset. Chipset functions are divided into two groups:

i. Northbridge chip – adjacent to CPU
ii. Southbridge chip – adjacent to peripherals

Northbridge

- Key performance area of motherboard
- Handles mission-critical jobs
- RAM – most important
i. Deals with requests for data transfer to/from RAM for CPU, AGP and Southbridge
ii. Keeping these supplied with data is core function of Northbridge
- CPU – data transfer to and from processor
- AGP – data transfer to and from graphics card

Southbridge

- Provides support for wide variety of devices. Each may have differing bus speeds and designs.

- Examples of secondary buses
i. USB – scanners, cameras, etc.
ii. IDE – hard disk, CD/DVD drives
iii. PS/2 – keyboard, mouse

- More and more functions, e.g. audio and Ethernet, are being integrated onto the Southbridge, which eliminates the need for expansion cards.

PROCESSOR EVOLUTION

Intel 8086 (1978): The 8086 marked a significant jump in processor technology. It was the first processor to be made available in multiple clock speeds. It started at 5 MHz and scaled to 8 MHz and then 10 MHz. The 8086 processor featured a 16-bit data bus and could address 1 MB of memory. The number of transistors in this processor shot up to an all-time high of 29,000, although that is nothing by today’s standards. IBM chose the cheaper 8088 microprocessor for its PC.

BUS ORGANISATION OF A COMPUTER

The CPU has to be able to send various data values, instructions and information to all the devices and components inside the computer as well as the different peripherals and devices attached. At one time, ‘bus’ meant an electrically parallel system, with electrical conductors similar or identical to the pins on the CPU. This is no longer the case and modern systems are blurring the lines between buses and networks.

A computer bus can be divided into two types, internal and external.

The internal bus connects the different components inside the case: the CPU, system memory and all other components on the motherboard. It is also referred to as the system bus.

The external bus connects the different external devices, peripherals, expansion slots, I/O ports and drive connections to the rest of the computer. In other words, the external bus allows various devices to be added to extend the computer’s capabilities. It is generally slower than the system bus. Another name for the external bus is the expansion bus.

The bus is just a bunch of tiny wires (traces and electronic pathways). One bunch carries info around to the different components on the motherboard, and another bunch of wires connects these components to the various devices attached to the computer.

There are three major buses:

i. Address bus
ii. Data bus
iii. Control bus

These buses vary from processor to processor. However, each bus carries comparable information on all processors, e.g. the data bus may have a different implementation on the 8088 than on the 8086, but both carry data between the processor, I/O and memory.

THE ADDRESS BUS

The data bus on a processor transfers information between a particular memory location or I/O device and the CPU. The only question is, “which memory location or I/O device?” The address bus answers that question. To differentiate memory locations and I/O devices, the system designer assigns a unique memory address to each memory element and I/O device. When the software wants to access some particular memory location or I/O device, it places the corresponding address on the address bus. Circuitry associated with the memory or I/O device recognizes this address and instructs the memory or I/O device to read the data from, or place data on, the data bus. In either case, only the device whose address matches the value on the address bus responds.

THE DATA BUS

This bus defines the “size” of the processor. Every modern CPU from the Pentium on up employs a 64-bit wide data bus; some of the earlier processors used 8-bit, 16-bit, or 32-bit data buses. These buses link the CPU to the I/O ports and the memory, transferring information between a particular memory location or I/O device and the CPU. The data bus is bi-directional, which diagrams show with two-headed arrows. Any device output connected to the data bus must be three-state, so that it can be floated except when the device is being addressed and is driving the data bus.

Summary

A bus is a set of wires used for data transfer among the components of a computer system. A bus is essentially a shared highway that connects different parts of the system, including the central processing unit (CPU), disk-drive controller, memory and input/output ports, and enables them to transfer information, usually under the control of a microprocessor. One group of wires, for example, carries data; another carries the address (the location where specific information can be found); yet another carries control signals to ensure that the different parts of the system use their shared highway without conflict. Buses are characterised by the number of bits they can transfer at a single time. A computer with an 8-bit data bus, for example, transfers 8 bits of data at a time. Each bus has a clock speed measured in MHz.
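The relation between bus width, clock speed and transfer rate can be sketched as below. This is only a rough upper bound: the function name and the `transfers_per_cycle` parameter are illustrative assumptions, and real buses usually need several clock cycles per transfer.

```python
def peak_transfer_rate(bus_width_bits, clock_hz, transfers_per_cycle=1.0):
    """Return an idealised peak rate in bytes per second.

    Assumption: every clock cycle carries `transfers_per_cycle` transfers
    of `bus_width_bits` bits each; real buses are slower than this.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_hz * transfers_per_cycle


# An 8-bit bus and a 16-bit bus at the same (hypothetical) 8 MHz clock
print(peak_transfer_rate(8, 8_000_000))
print(peak_transfer_rate(16, 8_000_000))
```

Doubling the bus width doubles the idealised rate at the same clock speed, which is why both figures are quoted when a bus is described.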

LEVELS OF COMPUTER DESIGN

There are three levels of computer design:

i. Determination of the architecture
ii. Determination of the implementation/organisation
iii. Determination of hardware/realization

ARCHITECTURE OF A COMPUTER

In computer science and computer engineering, computer architecture or digital computer organisation is the conceptual design and fundamental operational structure of a computer system. It is a blueprint and functional description of requirements and design implementations for the various parts of a computer, focusing largely on the way in which the central processing unit (CPU) performs internally and accesses addresses in memory. It may also be defined as the science and art of selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals.

The main subcategories of computer architecture include:

i. Instruction set architecture, or ISA, is the abstract image of a computing system that is seen by a machine language programmer, including the instruction set, word size, memory addressing modes and processor registers.

ii. Micro-architecture, also known as computer organisation, is a lower-level, more concrete and detailed description of the system that covers how the constituent parts of the system are interconnected and how they interoperate in order to implement the ISA.

Note

- The instruction set architecture (ISA) is the interface between the software and hardware.

- It is the set of instructions that bridges the gap between high-level languages and the hardware.

For a processor to understand a command, it should be in binary and not in a high-level language. The ISA defines how instructions are encoded as binary values.

- The ISA also defines the items in the computer that are available to a programmer.

PERFORMANCE

Computer performance is often described in terms of clock speed (usually in MHz or GHz). This refers to the cycles per second of the main clock of the CPU. However, this metric is somewhat misleading, as a machine with a higher clock rate may not necessarily have higher performance. Computer performance is also influenced by the amount of cache a processor has. If the clock speed is likened to the speed of a car, then the cache is like its fuel tank: the higher the speed, the larger the cache needed to keep the processor supplied with instructions and data.

Factors that influence performance include:

a. Functional units
b. Bus speeds
c. Available memory
d. Type and order of instructions in the program being run
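Why a higher clock rate does not guarantee higher performance can be illustrated with a simple execution-time estimate. This is a sketch under stated assumptions: the "cycles per instruction" figure and the two machines below are hypothetical and are not taken from these notes.

```python
def cpu_time(instruction_count, cycles_per_instruction, clock_hz):
    """Estimated run time in seconds: total clock cycles divided by clock rate."""
    return instruction_count * cycles_per_instruction / clock_hz


# Hypothetical machines running the same one-billion-instruction program
machine_a = cpu_time(1_000_000_000, 4, 2_000_000_000)   # 2 GHz, 4 cycles per instruction
machine_b = cpu_time(1_000_000_000, 2, 1_500_000_000)   # 1.5 GHz, 2 cycles per instruction

print(machine_a, machine_b)
print("B is faster" if machine_b < machine_a else "A is faster")
```

Machine B has the lower clock rate but needs fewer cycles per instruction, so under these assumptions it finishes first.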

GENERIC ARCHITECTURE OF A MICRO PROCESSOR

The great revolution in processing power arrived with the 16-bit 8086 processor. This has a 20-bit address bus and a 16-bit data bus, whereas the 8088 has an 8-bit external data bus. The Figure below shows the pin connections of the 8086 and also the main connections to the processor.

8086 connection

The processor either communicates directly with memory or communicates with peripherals through isolated I/O ports.

Processing is a cooperative process between two entities:

i. Memory
ii. Processor (microprocessor)

The memory is external to the processor. Before an instruction can be executed, it must be fetched from the memory. The program counter (a register) contains the address of the instruction to be executed; it tells the processor where to fetch the instruction from. The program counter is then incremented so that the next instruction can be fetched from the memory through the address bus.

The instruction to be executed is divided into two parts;

i. Op-code: specifies the operation to be executed. It tells the processor what to do.

ii. Operand: contains the data, or the address of the data, on which the operation is performed.
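The fetch and execute sequence described above can be sketched with a toy machine. The three op-codes used here (LOAD, ADD, HALT) and the function name are hypothetical and chosen only for illustration; they are not 8086 instructions.

```python
def run(program):
    """Execute a list of (opcode, operand) pairs on a toy accumulator machine."""
    pc = 0    # program counter: position of the next instruction in memory
    acc = 0   # accumulator holding the running result
    while pc < len(program):
        opcode, operand = program[pc]   # fetch the instruction the PC points to
        pc += 1                         # advance the PC to the next instruction
        if opcode == "LOAD":            # the op-code tells the processor what to do
            acc = operand               # the operand supplies the data
        elif opcode == "ADD":
            acc += operand
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown op-code: {opcode}")
    return acc, pc


result, final_pc = run([("LOAD", 5), ("ADD", 3), ("HALT", 0)])
print(result, final_pc)
```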

Registers

Microprocessors use registers to perform their operations. These registers are basically special memory locations in that they are given names. The 8086/88 has 14 registers which are grouped into four categories, as illustrated below:

General-purpose registers

There are four general-purpose registers, which are AX, BX, CX and DX. Each can be used to manipulate a whole 16-bit word or as two separate 8-bit bytes. These bytes are called the lower and upper order bytes. Each of these registers can be used as two 8-bit registers; for example, AL represents an 8-bit register which is the lower half of AX and AH represents the upper half of AX. The AX register is the most general purpose of the four registers and is usually used for all types of operations. Each of the other registers has one or more implied extra functions:

• AX is the accumulator.

The accumulator works closely with the arithmetic/logic unit (ALU). It holds an operand and receives the result of many arithmetic and logical operations: the full register AX is used for 16-bit data and its lower half, AL, for 8-bit data. It is used for all input/output operations and some arithmetic operations. For example, multiply, divide and translate instructions assume the use of AX or AL.

• BX is the base register. It can be used as an address register.

• CX is the count register. It is used by instructions which require counting. Typically it is used for controlling the number of times a loop is repeated and in bit shift operations.

• DX is the data register. It is used for some input/output and also when multiplying and dividing.
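How a 16-bit register such as AX relates to its two 8-bit halves can be sketched as follows. The helper names `split_word` and `join_bytes` are illustrative only.

```python
def split_word(word):
    """Split a 16-bit value (e.g. AX) into its high and low bytes (AH, AL)."""
    word &= 0xFFFF                      # keep only 16 bits
    return (word >> 8) & 0xFF, word & 0xFF


def join_bytes(high, low):
    """Combine a high byte (AH) and a low byte (AL) into a 16-bit word (AX)."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


ah, al = split_word(0x1234)
print(format(ah, "02X"), format(al, "02X"))    # 12 34
print(format(join_bytes(ah, al), "04X"))       # 1234
```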

8086 registers

Addressing registers

The addressing registers are used in memory addressing operations, such as holding the source address and the destination address in memory. These address registers are named BP, SP, SI and DI:

• SI is the source index. This is used with extended addressing commands.

• DI is the destination index. The destination is used in some addressing modes.

• BP is the base pointer.

• SP is the stack pointer. The stack pointer is also a 16-bit register used as a memory pointer. It points to a memory location in R/W memory, called the stack. The beginning of the stack is defined by loading 16-bit address in the stack pointer.

Status registers

Status registers are used to test for various conditions in an operation, such as ‘is the result negative’, ‘is the result zero’, and so on. The two status registers have 16 bits and are called the instruction pointer (IP) and the flag register (F):

• IP is the instruction pointer. The IP register contains the address of the next instruction of the program.

• Flag register. The flag register is 16 bits wide and holds a collection of condition and control flags.

Segment registers

There are four areas of memory called segments. Each is selected by a 16-bit segment register and addressed with a 16-bit offset, so each segment can hold up to 64 KB (offsets 0000h to FFFFh). These segments are:

• Code segment (CS register) – defines the memory location where the program code (or instructions) is stored.

• Data segment (DS register) – defines where data from the program will be stored (DS stands for data segment register).

• Stack segment (SS register) – defines where the stack is stored.

• Extra segment (ES).

• Data registers are normally used for storing temporary results that will be acted upon by subsequent instructions.

• Each of the data registers (AX, BX, CX and DX) is 16 bits wide.

• General-purpose registers can be accessed as either 16 or 8 bits, e.g. AH is the upper half of AX and AL is the lower half of AX.

• The pointer and index registers (SP, BP, SI and DI) are all 16 bits wide; their low and high bytes are not separately accessible. These registers are used as memory pointers.

• Example: MOV AH, [SI] moves the byte stored in the memory location whose address is contained in register SI to register AH.

• IP is not under the direct control of the programmer.

Memory segmentation

Memory addresses are normally defined by their hexadecimal address. A 4-bit address bus can address 16 locations from 0000b to 1111b. This can be represented in hexadecimal as 0h to Fh. An 8-bit bus can address up to 256 locations from 00h to FFh. A memory location is identified with a segment and an offset address, and the standard notation is segment:offset. A segment address is a 4-digit hexadecimal address which points to the start of a 64 KB chunk of memory. The offset is also a 4-digit hexadecimal address which defines the address offset from the segment base. This is illustrated in the Figure below.

Memory addressing
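The link between the width of an address bus and the number of locations it can address (2 raised to the number of address bits) can be checked with a short sketch. The function names are illustrative only.

```python
def addressable_locations(address_bits):
    """Number of distinct addresses on a bus with `address_bits` lines."""
    return 1 << address_bits            # 2 ** address_bits


def highest_address_hex(address_bits):
    """Highest address as a hexadecimal string, padded to whole hex digits."""
    digits = (address_bits + 3) // 4    # each hex digit covers 4 bits
    return format((1 << address_bits) - 1, f"0{digits}X")


for bits in (4, 8, 16, 20):
    print(bits, addressable_locations(bits), highest_address_hex(bits))
```

For 20 address bits this gives 1,048,576 locations with FFFFF as the highest address, which is the 1 MB address space of the 8086.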

The segment:offset address is defined as the logical address; the actual physical address is calculated by shifting the segment address 4 bits to the left and adding the offset. The example given next shows that the actual address of 2F84:0532 is 2FD72h:

Segment (2F84): 0010 1111 1000 0100 0000

Offset (0532): 0000 0101 0011 0010

---------------------------------

Actual address: 0010 1111 1101 0111 0010

The physical address gives the location of data or an instruction in memory. For instance, suppose the data segment register contains 5367h and the offset is 34FCh. The physical address is computed as follows:

Physical address = segment address × 10h + offset

= DS × 10h + offset

= 5367h × 10h + 34FCh

= 53670h + 34FCh = 56B6Ch

That is, the segment register holds a 16-bit segment address. Recall that each hexadecimal digit represents 4 bits. To reach the 20-bit address width of the 8086, the segment address is shifted left by 4 bits (one hexadecimal digit, i.e. multiplied by 10h) before the 16-bit offset is added.

Eg2. Place 3852 in memory if the address of the data segment is 7280 and the offset is CBDC.

Physical address = segment address × 10h + offset

= 7280 × 10h + CBDC

= 72800 + CBDC

= 7F3DC. This is the physical address at which 3852 is placed in memory.
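The segment and offset calculation used in these examples can be checked with a minimal sketch. The function name is illustrative; the mask reflects the assumption that the result is limited to the 20 address lines of the 8086.

```python
def physical_address(segment, offset):
    """Shift the 16-bit segment left by 4 bits (multiply by 10h) and add the offset."""
    return ((segment << 4) + offset) & 0xFFFFF   # keep 20 address bits


# The two worked cases from the text: 2F84:0532 and 7280:CBDC
print(format(physical_address(0x2F84, 0x0532), "05X"))   # 2FD72
print(format(physical_address(0x7280, 0xCBDC), "05X"))   # 7F3DC
```

Working in hexadecimal makes the shift visible: the segment value gains one trailing zero digit before the offset is added.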

Every processor should have at least four resources:

i. Computational resources: ALU
ii. Internal memory resources: caches, registers
iii. Communication resources: bus
iv. Control resources: timing, control

Characteristics of the 8086

i. Segmented memory
ii. It uses little-endian organisation
iii. It has two memory banks
iv. It is byte organised
v. It forms addresses from two 16-bit values (a segment and an offset)
vi. Each segment register has 16 bits, and the offset also has 16 bits (4 hexadecimal digits)

Features of 8086

- The 8086 is a 16-bit processor. Its ALU and internal registers work with 16-bit binary words.

- The 8086 has a 16-bit data bus. It can read or write data to a memory location or port either 16 bits or 8 bits at a time.

- The 8086 has a 20-bit address bus, which means it can address up to 2^20 = 1,048,576 memory locations (1 MB).

- The frequency range of the 8086 is 5-10 MHz.

The computing power of a microprocessor is indicated by the speed at which it processes data.

Bus organisation

Depending on whether a microprocessor is organised around two buses or one bus, we have two memory architectures:

i. Harvard: having a separate memory space for data and another for instruction

ii. Von Neumann: we have data and instruction combined in one memory space.

Factors affecting the speed and performance of the CPU

i. The speed depends on whether the processor has a Harvard or a von Neumann architecture; the Harvard architecture has the higher speed.
ii. It also depends on the size of the internal registers.
iii. Also on the number of caches in the internal memory.
iv. It also depends on the number of execution units the processor has.
v. It also depends on whether the processor is pipelined or not pipelined.
vi. It depends on the number of CPUs on the chip.
vii. It depends on the manner in which processing is done, whether sequential or random.

Addressing modes

The 80x86 memory addressing modes provide flexible access to memory, allowing you to easily access variables, arrays, records, pointers, and other complex data types. Addressing modes are the various ways in which data is specified in an instruction, or the ways in which we retrieve data from the computer memory. For example, the 8085 instructions MOV B, A and MVI A, 82H copy data from a source into a destination. In these instructions the source can be a register, an input port, or an 8-bit number (00H to FFH). Similarly, a destination can be a register or an output port. The source and destination are operands.

The various formats for specifying operands are called the ADDRESSING MODES. For 8086, they are:

a. Displacement addressing modes
b. The direct addressing modes
c. Register indirect addressing modes
d. Base/index addressing modes
e. Base-index addressing modes
f. Base-index with displacement addressing modes

A. DISPLACEMENT ADDRESSING MODES

The simplest case to understand is the one in which a constant is carried in the instruction itself. When the 16-bit constant specifies the address of the target location, the mode is called displacement-only (or direct) addressing, which is treated in B below. When the constant is the data itself, the mode is called immediate addressing. E.g. MOV BX, 543Fh reads "move the data 543Fh to the base register". This type of instruction is called immediate because the data is contained in the instruction. We can represent this in the memory as shown below.

Ex. MOV AX, 20h

i. What is the addressing mode?
ii. Place the instruction in the memory.

Ans.

The data 20h is carried in the instruction itself, so this is the immediate form described above. Since the 8086 is byte organised, the 16-bit register AX is treated as two bytes:

AX (16 bits) = AH + AL (AH ~ most significant byte, AL ~ least significant byte)

AH = 00, AL = 20

Ex.2 MOV CL, 25H
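How such an instruction might be laid out in byte-organised memory can be sketched as below. The sketch assumes the standard 8086 machine encoding of a register load with immediate data (opcode B0h plus the register number for 8-bit registers, B8h plus the register number for 16-bit registers) and stores 16-bit data low byte first; the function name is illustrative.

```python
# Register numbers as used in the 8086 opcode byte
REG16 = {"AX": 0, "CX": 1, "DX": 2, "BX": 3, "SP": 4, "BP": 5, "SI": 6, "DI": 7}
REG8 = {"AL": 0, "CL": 1, "DL": 2, "BL": 3, "AH": 4, "CH": 5, "DH": 6, "BH": 7}


def encode_mov_immediate(register, value):
    """Return the bytes of MOV register, value in the order they sit in memory."""
    register = register.upper()
    if register in REG16:
        low, high = value & 0xFF, (value >> 8) & 0xFF
        return bytes([0xB8 + REG16[register], low, high])   # low byte stored first
    if register in REG8:
        return bytes([0xB0 + REG8[register], value & 0xFF])
    raise ValueError(f"unknown register: {register}")


print(encode_mov_immediate("AX", 0x0020).hex(" "))   # b8 20 00
print(encode_mov_immediate("CL", 0x25).hex(" "))     # b1 25
```

For MOV AX, 20h the byte destined for AL (20) occupies the lower address and the byte destined for AH (00) the next one, which is the little-endian ordering of the 8086.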

B. DIRECT ADDRESSING MODE

In this addressing mode the instruction carries the location of the data. The displacement in the instruction is the offset of the location where the data is stored.

E.g. MOV AL, [43A6] if the data segment contains 5367

Soln.

First compute the physical address

PA = SR × 10H + offset

= 5367 × 10H + 43A6

= 53670 + 43A6

= 57A16H

C. REGISTER INDIRECT ADDRESSING MODE

The 80x86 CPUs let you access memory indirectly through a register using the register indirect addressing modes. There are four forms of this addressing mode on the 8086, best demonstrated by the following instructions:

MOV AL, [BX]

MOV AL, [BP]

MOV AL, [SI]

MOV AL, [DI]

As with the x86 [BX] addressing mode, these four addressing modes reference the byte at the offset found in the BX, BP, SI, or DI register, respectively. The [BX], [SI], and [DI] modes use the DS segment by default. The [BP] addressing mode uses the stack segment (SS) by default.

Intel refers to [BX] and [BP] as base addressing modes and BX and BP are base registers (in fact, BP stands for base pointer). Intel refers to the [SI] and [DI] addressing modes as indexed addressing modes (SI stands for source index, DI stands for destination index). However, these addressing modes are functionally equivalent.

E.g. MOV AX, [BP]: BP is enclosed in brackets, which shows that BP does not contain the data but the offset to the data.

Therefore PA = SS × 10H + [BP]

Note BP is the offset.

Class work

Calculate the physical address if the stack segment contains 5340 and the offset contains 3ABC.

Soln.

PA = 5340 × 10H + 3ABC

= 53400 + 3ABC

= 56EBC

D. BASE INDEX ADDRESSING MODE

In these addressing modes we compute the offset from two values, i.e. the content of the base register and the content of the index register, as specified in the address field of the instruction.

EA (effective address) = [BX] + [SI], or [BX] + [DI], or [BP] + [SI], or [BP] + [DI]

E.g. MOV AX, [BX][DI]

soln.

EA = [BX] + [DI] → offset

PA = DS × 10H + [BX] + [DI]

Note: The based indexed addressing modes are simply combinations of the register indirect addressing modes. These addressing modes form the offset by adding together a base register (BX or BP) and an index register (SI or DI). The allowable forms for these addressing modes are

MOV AL, [BX][SI]

MOV AL, [BX][DI]

MOV AL, [BP][SI]

MOV AL, [BP][DI]

Suppose that BX contains 1000h and SI contains 880h. Then the instruction

MOV AL, [BX][SI] would load AL from location DS:1880h. Likewise, if BP contains 1598h and DI contains 1004h, MOV AX, [BP + DI] will load the 16 bits in AX from locations SS:259Ch and SS:259Dh.

The addressing modes that do not involve BP use the data segment by default. Those that have BP as an operand use the stack segment by default.

You substitute DI in the figure above to obtain the [BX + DI] addressing mode.

You substitute DI in the figure above for the [BP + DI] addressing mode.

E. BASED INDEXED PLUS DISPLACEMENT ADDRESSING MODE

These addressing modes are a slight modification of the base/indexed addressing modes with the addition of an eight bit or sixteen bit constant. In this addressing mode, the effective address is computed from three values, the base, index and the displacement.

E.g. MOV AX, 1008H[BP][DI]

EA = offset = 1008H + [BP] + [DI]

PA = SS × 10H + 1008H + [BP] + [DI] (BP selects the stack segment by default)

The following are some examples of these addressing modes:

MOV AL, DISP [BX][SI]

MOV AL, DISP [BX + DI]

MOV AL, [BP +SI +DISP]

MOV AL, [BP][DI][DISP]

You may substitute DI in the figure above to produce the [BX + DI + DISP] addressing mode.

You may substitute DI in the figure above to produce the [BP + DI + DISP] addressing mode.

Suppose BP contains 1000h, BX contains 2000h, SI contains 120h, and DI contains 5. Then MOV AL,10h[BX + SI] loads AL from address DS:2130; MOV CH,125h[BP + DI] loads CH from location SS:112A; and MOV BX,CS:2[BX][DI] loads BX from location CS:2007.
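The effective addresses quoted in this example can be reproduced with a short sketch. The function names are illustrative; the 16-bit mask reflects the assumption that an 8086 offset is 16 bits wide.

```python
def effective_address(base=0, index=0, displacement=0):
    """Offset formed from base register + index register + displacement (16 bits)."""
    return (base + index + displacement) & 0xFFFF


def default_segment(uses_bp):
    """Modes that involve BP default to the stack segment, the others to DS."""
    return "SS" if uses_bp else "DS"


BP, BX, SI, DI = 0x1000, 0x2000, 0x120, 0x5

# MOV AL, 10h[BX + SI]
print(default_segment(False), format(effective_address(BX, SI, 0x10), "04X"))   # DS 2130
# MOV CH, 125h[BP + DI]
print(default_segment(True), format(effective_address(BP, DI, 0x125), "04X"))   # SS 112A
# MOV BX, CS:2[BX][DI] (the CS: prefix overrides the default segment)
print("CS", format(effective_address(BX, DI, 2), "04X"))                        # CS 2007
```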

MODEL OF A CPU

Before a CPU processes data, it requests instructions from main memory. Because the main memory is external to the processor, there is a need for a cache, which has a lower access time than the main memory. An instruction requires several clock cycles to execute. A pipelined computer overlaps the stages of successive instructions so that they are worked on in parallel, and ideally one result is completed at a time even though each instruction takes several cycles.

Microprocessor program

Apart from data manipulation, the microprocessor can be used for the control of other machines, for example to control the speed of a machine. Before it can control the speed, data on the actual speed has to be collected; afterwards the speed value held in RAM is noted and the two are compared. A decision is then taken based on this comparison. If the speed is correct no further action is taken, but if the speed is incorrect the following question is asked:

- Is the speed high or low? The answer results in the control unit sending out a signal that adjusts the speed of the machine. This can be completed in a fraction of a second.

The strength of the microprocessor is determined by its ability to repeat a control sequence at a very high rate. It monitors a situation and takes action in milliseconds if the need arises. The decisions it takes may be simple or complex, depending on how the program is written.

Programming a microprocessor involves two stages:

a. Determining the interrelated stages
b. Writing the program

The stages can be determined by drawing flow charts. The common flow chart symbols are:

• Decision making: determines which of the alternative paths is to be followed.

• Input/output

• Start/end

• Connector: represents an exit to or entry from another part of the flow chart.

• Process: involves the processing of an instruction, or a result from the processing of an instruction.

Classification of instructions in the instruction set

The instructions of an instruction set are divided into the following categories:

1. Data transfer instructions. These can be:
a. General-purpose byte or word transfer, e.g. MOV, PUSH, POP, PUSHA, POPA, XCHG.
b. Simple input/output port transfer instructions, e.g. IN, OUT.
c. Special address transfer instructions, e.g. LEA (load the effective address of the operand into a specified register) and LDS (load the DS register and another specified register from memory).
d. Flag transfer instructions, e.g. LAHF (load AH with the low byte of the flag register) and SAHF (store AH into the low byte of the flag register).

2. Data transformation instructions
a. Arithmetic instructions
i. ADDITION: ADD, ADC, INC, AAA, DAA
ii. SUBTRACTION: SUB, SBB, DEC, AAS, DAS
iii. MULTIPLICATION: MUL, IMUL, AAM
iv. DIVISION: DIV, IDIV, AAD, CBW, CWD
b. Bit manipulation instructions
i. Logical instructions: OR, NOT, AND, XOR, TEST
ii. Shift instructions: SHL/SAL, SHR, SAR
iii. Rotate instructions: ROL, ROR, RCL, RCR

3. Transfer of control instructions
These can be regarded as program execution transfer instructions. We use them to tell the microprocessor to start fetching instructions from some new address rather than continuing in sequence. They can be:
i. Unconditional transfer instructions, such as JMP, CALL, RET
ii. Conditional transfer instructions, e.g. JNBE/JA, JAE/JNB
iii. Interrupt instructions, e.g. INT, IRET, INTO
iv. Iteration control instructions, e.g. LOOP, LOOPE/LOOPZ, LOOPNE/LOOPNZ, JCXZ
v. High-level language interface instructions, e.g. ENTER, LEAVE, BOUND

4. Miscellaneous instructions
These instructions do not fit into any of the classes above. They include string instructions, which move large volumes of data in strings. A string is a series of bytes or words in sequential memory locations, which often consists of ASCII character codes. Other miscellaneous instructions are:

31


i. The external hardware synchronization instruction. (HLT, WAIT, ESC, LOCK)

ii. Processor control instruction: This instruction includes the flag set/clear instruction.

iii. No-operation instruction: no action takes place except fetch and decode

Flag

A flag is a flip-flop that either indicates some condition produced by the execution of an instruction or controls certain operations of the execution unit (EU). In the 8086, the EU contains a 16-bit flag register in which only 9 bits are active, and 6 of these 9 active flags indicate some condition produced by an instruction. For example, the carry flag is set to one if the addition of two 16-bit numbers produces a carry out of the most significant bit (MSB). If the addition produces no carry out of the MSB, the carry flag is zero. The six condition flags in this group are:

i. Carry flag, CF
ii. Parity flag, PF
iii. Auxiliary carry flag, AF
iv. Zero flag, ZF
v. Sign flag, SF
vi. Overflow flag, OF

The remaining three flags in the flag register are used to control certain operations of the processor. They differ from the six condition flags in the way they are set or reset. The six condition flags are set or reset by the EU on the basis of the result of an arithmetic or logical operation, while the three control flags are deliberately set or reset with specific instructions you put in your program. The three control flags are:

i. Trap flag, TF: used for single-stepping through a program.
ii. Interrupt flag, IF: as the name implies, it is used to allow or inhibit (stop) the interruption of a program.
iii. Direction flag, DF: used with string instructions.

Flag register layout, bit 15 (left) to bit 0 (right), where U = undefined:

U U U U OF DF IF TF SF ZF U AF U PF U CF
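The way the EU sets the six condition flags after a 16-bit addition can be sketched in Python. This is only an illustrative model, not 8086 code: the function names `add16_flags` and `pack_flags` are made up for this note, and the bit positions follow the flag register layout given above.

```python
# Illustrative model of how an 8086 ADD sets the six condition flags.
# Function and variable names are hypothetical, chosen for this sketch.

# Bit position of each active flag in the 16-bit flag register.
FLAG_BITS = {"CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,
             "TF": 8, "IF": 9, "DF": 10, "OF": 11}


def add16_flags(a, b):
    """Add two 16-bit values; return (16-bit result, dict of condition flags)."""
    a &= 0xFFFF
    b &= 0xFFFF
    full = a + b
    result = full & 0xFFFF
    flags = {
        # Carry out of the most significant bit (bit 15).
        "CF": int(full > 0xFFFF),
        # Parity flag: 1 when the low byte holds an even number of 1 bits.
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),
        # Auxiliary carry: carry out of bit 3 into bit 4.
        "AF": int((a & 0xF) + (b & 0xF) > 0xF),
        # Zero flag: result is zero.
        "ZF": int(result == 0),
        # Sign flag: copy of the MSB of the result.
        "SF": (result >> 15) & 1,
        # Overflow: both operands have the same sign, the result does not.
        "OF": int(((a ^ result) & (b ^ result) & 0x8000) != 0),
    }
    return result, flags


def pack_flags(flags):
    """Place the given flag values at their bit positions in a 16-bit word."""
    word = 0
    for name, value in flags.items():
        if value:
            word |= 1 << FLAG_BITS[name]
    return word


if __name__ == "__main__":
    result, flags = add16_flags(0xFFFF, 0x0001)
    print(hex(result), flags, hex(pack_flags(flags)))
```

Adding 0xFFFF and 0x0001 gives a zero result with a carry out of the MSB, so CF and ZF are both set, which matches the carry flag example in the text.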

CACHES

A CPU cache is a cache used by the central processing unit of a computer to reduce the average time to access memory. The cache is a smaller, faster memory which stores copies of the data from the most frequently used main memory locations. As long as most memory accesses are to cached memory locations, the average latency of memory accesses will be closer to the cache latency than to the latency of main memory. When the processor needs to read from or write to a location in main memory, it first checks whether a copy of that data is in the cache. If so, the processor immediately reads from or writes to the cache, which is much faster than reading from or writing to main memory.
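The effect of the hit rate on the average latency can be illustrated with a simple model in which a hit costs the cache latency and a miss costs the main memory latency. The sketch below is only an illustration: the function name and the latencies of 10 ns and 100 ns are assumed values, not figures from this note.

```python
# Simple two-level latency model (assumed for illustration):
# a hit costs the cache latency, a miss costs the main memory latency.

def average_access_time(hit_rate, cache_latency_ns, memory_latency_ns):
    """Average latency when a fraction hit_rate of accesses hits the cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_latency_ns + (1.0 - hit_rate) * memory_latency_ns


if __name__ == "__main__":
    # Hypothetical latencies: 10 ns for the cache, 100 ns for main memory.
    for rate in (0.50, 0.90, 0.99):
        print(rate, average_access_time(rate, 10.0, 100.0))
```

As the hit rate approaches 1, the average moves towards the cache latency, which is the point made in the paragraph above.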

Design of cache

a. Direct mapped cache is a cache in which a block of memory can be placed in one and only one location of the cache. In a direct mapped cache, a given numbered line from a page in main memory is always copied to the same numbered line in the cache. In essence, to transfer from memory to cache, the data is grouped into a block and transferred into the cache as a cache line. A cache is characterised by two parameters: the cache capacity and the cache line capacity. E.g. if the cache capacity is 8 KB and the line capacity is 64 bits, determine the number of lines in the cache.

Cache capacity = 8 KB
Line capacity = 64 bits

$$\text{Lines in the cache} = \frac{\text{cache capacity}}{\text{line capacity}} = \frac{8\ \text{KB}}{64\ \text{bits}} = \frac{8 \times 1024 \times 8}{64} = 1024 = 2^{10}$$

The number of bits $x$ needed to address each of these lines is given by

$$2^{x} = 2^{10} \implies x = 10\ \text{bits}$$

Hence each of the 1024 lines is addressed using 10 bits. By the same method, a cache with the same capacity of 8 KB and a line capacity of 16 bytes (128 bits) contains $2^{9} = 512$ lines, and each line is addressed using 9 bits. Comparing the two results for the same cache capacity, the number of lines decreases as the line capacity increases. We can also deduce that the cache with 512 lines makes better use of spatial locality, since each line brings in 16 bytes that tend to be accessed sequentially. In general, the cache with the larger line capacity exploits spatial locality better, while the cache with 1024 lines exploits temporal locality better because it holds more distinct lines than the cache with 512 lines.
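The line-count calculation can be checked with a short Python sketch. The function name `cache_lines` is hypothetical, and both capacities are passed in bits so that the units agree. The second call assumes a line capacity of 16 bytes (128 bits), the value that yields 512 lines.

```python
# Number of lines in a cache and the bits needed to address them.
# Both capacities are given in bits; the function name is hypothetical.

def cache_lines(cache_capacity_bits, line_capacity_bits):
    """Return (number of lines, number of address bits for the lines)."""
    lines, remainder = divmod(cache_capacity_bits, line_capacity_bits)
    # The method in the text assumes a power-of-two number of whole lines.
    if remainder or lines <= 0 or lines & (lines - 1):
        raise ValueError("expected a power-of-two number of whole lines")
    return lines, lines.bit_length() - 1


if __name__ == "__main__":
    eight_kb_in_bits = 8 * 1024 * 8
    print(cache_lines(eight_kb_in_bits, 64))      # 64-bit lines
    print(cache_lines(eight_kb_in_bits, 16 * 8))  # 16-byte lines
```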

Set associative cache

The direct mapped approach has a difficulty: if a program uses the same numbered line from two memory pages at the same time, the lines are swapped back and forth between main memory and the cache as the program executes. This swapping is called thrashing. One scheme that helps avoid thrashing is the two-way set associative approach. In this approach, two separate caches and two cache directories are set up so that the same numbered line from two different pages can be cached at the same time. Each cache in this case is half the size of the direct mapped cache. This approach produces a greater hit rate than the direct mapped cache. An example of a system that uses this method to minimize wait states is the 25 MHz Compaq Deskpro 386/25.
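Thrashing can be demonstrated with a small simulation. The sketch below is an assumed model written for this note: the function name `count_hits`, the cache size of 8 lines and the line addresses 3 and 11 are illustrative choices, and replacement within a set is taken to be least recently used.

```python
# Hit counting for a direct mapped cache (ways=1) and a set associative
# cache (ways>1). Names, sizes and addresses are illustrative assumptions.

def count_hits(line_addresses, num_lines, ways):
    """Count hits for a cache of num_lines lines organised in `ways` ways."""
    num_sets = num_lines // ways
    # One list of tags per set, least recently used first.
    sets = [[] for _ in range(num_sets)]
    hits = 0
    for line_address in line_addresses:
        index = line_address % num_sets   # which set the line maps to
        tag = line_address // num_sets    # identifies the line within the set
        entries = sets[index]
        if tag in entries:
            hits += 1
            entries.remove(tag)           # will be re-appended as most recent
        elif len(entries) == ways:
            entries.pop(0)                # evict the least recently used tag
        entries.append(tag)
    return hits


if __name__ == "__main__":
    # Lines 3 and 11 are 8 lines apart, so in an 8-line direct mapped cache
    # they compete for the same cache line.
    pattern = [3, 11] * 5
    print("direct mapped hits:", count_hits(pattern, num_lines=8, ways=1))  # 0
    print("two-way hits:", count_hits(pattern, num_lines=8, ways=2))        # 8
```

In the direct mapped case every access evicts the line the next access needs, so no access ever hits. In the two-way case both lines stay resident after the first two misses.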

Fully associative system (four way)



In this system, a 4-bit block or line from main memory can be written in any location in the cache. It has the advantage that it can hold the same numbered line from several pages at the same time. It has the disadvantage that the upper bits of each memory address sent out by the processor must be compared with all of the tags in the directory to see if that line is present in the cache, which is time consuming. When this type of cache is full, some algorithm is used to determine which line in the cache is overwritten. The most common algorithm replaces the least recently used line with the new line.
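The least recently used replacement policy can be sketched in Python with `collections.OrderedDict`. The class name `FullyAssociativeCache` is hypothetical. Note that the sketch uses a dictionary lookup in place of the tag comparison against every directory entry that the hardware has to perform.

```python
from collections import OrderedDict

# Sketch of a fully associative cache with least recently used (LRU)
# replacement. The class name is hypothetical; a dictionary lookup stands
# in for the hardware comparison against every tag in the directory.


class FullyAssociativeCache:
    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()  # tag -> None, least recently used first

    def access(self, tag):
        """Return True on a hit. On a miss, load the line, evicting if full."""
        if tag in self.lines:
            self.lines.move_to_end(tag)      # mark as most recently used
            return True
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)   # overwrite the LRU line
        self.lines[tag] = None
        return False


if __name__ == "__main__":
    cache = FullyAssociativeCache(num_lines=2)
    print([cache.access(tag) for tag in (1, 2, 1, 3, 2)])
```

Because any line can go in any location, the cache only starts to overwrite lines once every location is in use.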

Example

A direct mapped cache is to be designed to work with 32 GB of main memory. The cache capacity is 16 KB and the cache line capacity is 16 bytes. If the cache address consists of a tag field, a line field and a byte field, design the cache.

Solution

Memory capacity = 32 GB
Cache capacity = 16 KB
Line capacity = 16 bytes
Line field = set field

Memory capacity = 32 GB = $2^{5} \times 2^{30}$ bytes, hence main memory capacity = $2^{35}$ bytes.

First, find the number of caches (cache-sized locations) in the main memory:

$$\frac{\text{main memory capacity}}{\text{cache capacity}} = \frac{2^{35}}{2^{14}} = 2^{35-14} = 2^{21}\ \text{caches}$$

21 bits are required to address each of the caches in the main memory, and this forms our tag field:

$$2^{x} = 2^{21} \implies x = 21\ \text{bits}$$

The tag field is equal to the number of bits required to address each of the caches in the main memory.

Next, calculate the number of lines in each cache:

$$\frac{\text{cache capacity}}{\text{line capacity}} = \frac{2^{14}}{2^{4}} = 2^{10}$$

Therefore 10 bits are required to address each line, and this forms the line field or set field.

Next, find the byte field:

$$\text{Line capacity} = 16\ \text{bytes} = 2^{4}\ \text{bytes}, \quad 2^{x} = 2^{4} \implies x = 4\ \text{bits}$$

The 35-bit memory address is therefore divided into a 21-bit tag field, a 10-bit line field and a 4-bit byte field.
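With the field widths found above (tag = 21 bits, line = 10 bits, byte = 4 bits), a 35-bit memory address can be split into its three fields as in the sketch below. The function name `split_address` is hypothetical, and the tag is assumed to occupy the upper bits, the line field the middle bits and the byte field the lowest bits.

```python
# Split a 35-bit address into tag, line and byte fields for the direct
# mapped cache of the example. Names are hypothetical; the tag is assumed
# to be the upper field and the byte field the lowest.

TAG_BITS, LINE_BITS, BYTE_BITS = 21, 10, 4


def split_address(address):
    """Return (tag, line, byte) for a 35-bit memory address."""
    if address < 0 or address >> (TAG_BITS + LINE_BITS + BYTE_BITS):
        raise ValueError("address does not fit in 35 bits")
    byte = address & ((1 << BYTE_BITS) - 1)
    line = (address >> BYTE_BITS) & ((1 << LINE_BITS) - 1)
    tag = address >> (BYTE_BITS + LINE_BITS)
    return tag, line, byte


if __name__ == "__main__":
    # Build an address from tag 5, line 3, byte 9 and split it again.
    address = (5 << 14) | (3 << 4) | 9
    print(split_address(address))
```

Two addresses with the same line field but different tags map to the same cache line, which is the situation that leads to thrashing.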



The direct mapped cache has the following disadvantage:

If a program uses the same numbered line from two memory locations at the same time, the two lines are swapped back and forth between main memory and the cache. This causes what we call thrashing. A set associative cache can be used to avoid thrashing during the transfer of data from main memory to the cache.

Example: if a cache is to be designed using the same parameters as above and it has 4 sets, design the cache.

Steps

Calculate the number of lines in main memory:

$$\frac{\text{memory capacity}}{\text{cache line capacity}} = \frac{2^{35}}{2^{4}} = 2^{31}$$

Calculate the number of lines in the cache:

$$\frac{\text{cache capacity}}{\text{cache line capacity}} = \frac{2^{14}}{2^{4}} = 2^{10}\ \text{lines}$$

Calculate the number of lines in each of the four sets (number of sets = 4):

$$\frac{\text{number of lines in the cache}}{\text{number of sets}} = \frac{2^{10}}{2^{2}} = 2^{8} = 256$$

It can be seen that 8 bits are required to address a cache line within a set.

Calculate the number of sets in main memory:

$$\frac{\text{number of lines in main memory}}{\text{number of lines in a set}} = \frac{2^{31}}{2^{8}} = 2^{23}\ \text{sets}$$

In this case: tag field = 23 bits, set field = 8 bits, byte field = 4 bits, giving 23 + 8 + 4 = 35 address bits.

If we increase the set associativity, the set field decreases while the tag field increases, so the design approaches a fully associative cache. In a fully associative cache, a block transferred from memory can be placed in any location in the cache.
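The trade-off between the tag field and the set field can be tabulated with a short Python sketch. The function names are hypothetical and all capacities are assumed to be powers of two, as in the examples of this section.

```python
# Tag, set and byte field widths as the associativity changes.
# Function names are hypothetical; capacities must be powers of two.

def log2_exact(value):
    """Return log2(value) for a positive power of two, else raise."""
    if value <= 0 or value & (value - 1):
        raise ValueError("expected a positive power of two")
    return value.bit_length() - 1


def field_widths(memory_bytes, cache_bytes, line_bytes, ways):
    """Return (tag bits, set bits, byte bits) for a `ways`-way cache."""
    address_bits = log2_exact(memory_bytes)
    byte_bits = log2_exact(line_bytes)
    lines_in_cache = cache_bytes // line_bytes
    set_bits = log2_exact(lines_in_cache // ways)
    tag_bits = address_bits - set_bits - byte_bits
    return tag_bits, set_bits, byte_bits


if __name__ == "__main__":
    # 32 GB memory, 16 KB cache, 16-byte lines, as in the examples above.
    for ways in (1, 2, 4, 1024):
        print(ways, field_widths(2**35, 2**14, 16, ways))
```

With 1 way the sketch reproduces the direct mapped fields, with 4 ways it reproduces the four-set design, and with 1024 ways (one way per cache line) the set field disappears, which is the fully associative case.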



