1 Nios II Processor Architecture and Programming CEG 4131 Computer Architecture III Miodrag Bolic.


1

Nios II Processor Architecture and Programming

CEG 4131 Computer Architecture III

Miodrag Bolic

2

Presentation Outline

• Basic description of Altera Stratix devices
• NIOS II processor architecture
  – Review pipelining techniques
  – Review memory access techniques
• How to design a system using the NIOS II processor

3

Stratix EP1S10 [2]

4

5

6

TriMatrix™ Memory [1]

• M512 Blocks: 512 bits per block + parity
  – Small FIFOs, shift registers, rake receiver correlators, FIR filter delay lines
• M4K Blocks: 4 Kbits per block + parity
  – Header / cell storage, channelized functions, ATM cell–packet processing, Nios program memory
• M-RAM: 512 Kbits per block + parity
  – Packet / data storage, Nios program memory, system cache, video frame buffers, echo canceller data storage
• Dedicated external memory interface
• Typical uses: look-up schemes, packet & cell buffering, cache
• More bits for larger memory buffering; more data ports for greater memory bandwidth

7

Memory Bandwidth Summary: Stratix Device Family [1]

Device   Total RAM Bits   M-RAM Blocks   M4K Blocks   M512 Blocks   Maximum Bandwidth (Mbps)
EP1S10         920,448               1           60            94                  1,245,024
EP1S20       1,669,248               2           82           194                  2,096,928
EP1S25       1,944,576               2          138           224                  2,894,400
EP1S30       3,317,184               4          171           295                  3,750,192
EP1S40       3,423,744               4          183           384                  4,384,800
EP1S60       5,215,104               6          292           574                  6,762,528
EP1S80       7,427,520               9          364           767                  8,784,720
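The Total RAM Bits column can be cross-checked from the block counts. The following is a minimal sketch in C, assuming that "+ parity" means one parity bit per eight data bits (576, 4,608 and 589,824 bits per M512, M4K and M-RAM block). The constant and function names are illustrative and do not come from Altera documentation.

```c
#include <stdint.h>

/* Bits per block including parity (assumption: 1 parity bit per 8 data bits). */
#define M512_BITS (512u * 9u / 8u)          /* 576     */
#define M4K_BITS  (4096u * 9u / 8u)         /* 4,608   */
#define MRAM_BITS (512u * 1024u * 9u / 8u)  /* 589,824 */

/* Total embedded RAM bits for a device with the given block counts. */
uint32_t total_ram_bits(uint32_t mram, uint32_t m4k, uint32_t m512)
{
    return mram * MRAM_BITS + m4k * M4K_BITS + m512 * M512_BITS;
}
```

With the EP1S10 counts (1 M-RAM, 60 M4K, 94 M512) the function returns 920,448, which agrees with the table.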

8

9

Logic Element (LE) [2]

[Figure: functional diagram of a logic element. Labels: 4-input LUT with inputs data1, data2, data3, data4 and cin; addnsub (2); sync load & clear logic; register with register control signals and register feedback; LUT chain input and output; register chain input and output; outputs to row, column & DirectLink routing and to local routing.]

Notes:
1) Functional diagram only. Please see the datasheet for more details.
2) addnsub & data1 are connected via XOR logic.
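The 4-input LUT at the core of the LE can be thought of as a 16-entry truth table indexed by data1 to data4. The following is a minimal software model in C, intended only as a sketch: the names lut4_eval, LUT4_XOR and LUT4_AND are hypothetical, and the bit ordering (data1 as the least significant index bit) is an assumption, not something taken from the datasheet.

```c
#include <stdint.h>

/* Evaluate a 4-input LUT. `mask` holds the 16 truth-table bits:
   bit i is the output when (data4 data3 data2 data1) equals i in binary. */
int lut4_eval(uint16_t mask, int data1, int data2, int data3, int data4)
{
    unsigned index = (unsigned)(data1 & 1)
                   | ((unsigned)(data2 & 1) << 1)
                   | ((unsigned)(data3 & 1) << 2)
                   | ((unsigned)(data4 & 1) << 3);
    return (mask >> index) & 1;
}

/* Example configurations (illustrative): */
#define LUT4_XOR 0x6996u  /* output is 1 when an odd number of inputs are 1 */
#define LUT4_AND 0x8000u  /* output is 1 only when all four inputs are 1    */
```

Changing the 16-bit mask changes the logic function without changing the hardware, which is what makes the LE programmable.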

10

Logic Array Blocks (LAB) [2]

• 10 LEs
• Local interconnect
• LAB-wide control signals

[Figure: a LAB containing LE1 to LE10, each drawn with a 4-wide connection to the local interconnect, plus LAB-wide control signals.]

11

Presentation Outline

• Basic description of Altera Stratix devices
• NIOS II processor architecture
  – Review pipelining techniques
  – Review memory access techniques
• How to design a system using the NIOS II processor

12

13

NIOS II Overview [3]

• Soft IP core
  – A soft-core processor is a microprocessor fully described in software, usually in an HDL, which can be synthesized in programmable hardware such as FPGAs.
• Reduced Instruction Set Computer (RISC)
• Pipeline configurations: no pipeline, 5 stages, or 6 stages
• Full 32-bit instruction set, data path, and address space
• 32 general-purpose registers
• 32 external interrupt sources
• Access to a variety of on-chip peripherals, and interfaces to off-chip memories and peripherals
• Software development environment based on the GNU C/C++ tool chain and Eclipse IDE

14

NIOS II Scalability

• Powerful multiprocessing systems can be built

15

NIOS II Processor Core [3]

16

Implementation

• The functional units of the Nios II architecture form the foundation for the Nios II instruction set.

• The Nios II architecture describes an instruction set, not a particular hardware implementation.

• Trade-offs:
  – More or less of a feature, e.g. the amount of instruction cache memory
  – Inclusion or exclusion of a feature, e.g. the JTAG debug module
  – Hardware implementation or software emulation, e.g. the divider
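To illustrate the last trade-off: a core built without a hardware divider has to perform division in software. The following is a minimal sketch of an unsigned shift-and-subtract divide in C. It is a generic textbook routine, not the routine the Nios II tool chain actually uses, and soft_udiv32 is a hypothetical name.

```c
#include <stdint.h>

/* Unsigned 32-bit divide by shift-and-subtract, one quotient bit per step.
   Returns the quotient and stores the remainder in *rem if rem is non-null.
   Division by zero returns UINT32_MAX with the dividend as remainder
   (an arbitrary convention chosen for this sketch). */
uint32_t soft_udiv32(uint32_t n, uint32_t d, uint32_t *rem)
{
    uint32_t q = 0, r = 0;

    if (d == 0) {
        if (rem) *rem = n;
        return UINT32_MAX;
    }
    for (int i = 31; i >= 0; i--) {
        uint32_t carry = r >> 31;           /* bit shifted out of r         */
        r = (r << 1) | ((n >> i) & 1u);     /* bring down next dividend bit */
        if (carry || r >= d) {              /* partial remainder >= divisor */
            r -= d;
            q |= (uint32_t)1 << i;
        }
    }
    if (rem) *rem = r;
    return q;
}
```

The loop always runs 32 iterations, which shows the cost side of the trade-off: software emulation saves logic elements but takes many more cycles than a hardware divider.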

17

Types of Processors

18

Memory Organization

19

Instruction and Data Cache

• Useful for high-latency external memories
• Cache is direct mapped
  – Low address bits select the cache line
• Uses a write-through policy
  – Data is written to external memory as well as to the cache
• Problem
  – Instruction cache has 32 bytes per cache line
  – We choose a cache size of 1024 bytes
  – Question: How many bits are needed for the tag?
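The problem above can be worked through in code. The following is a minimal sketch in C, assuming a full 32-bit byte address (the slide does not state the address width used for the tag) and power-of-two sizes. The function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* log2 of a power of two. */
static unsigned log2_pow2(uint32_t x)
{
    unsigned n = 0;
    assert(x != 0 && (x & (x - 1)) == 0);
    while (x > 1) { x >>= 1; n++; }
    return n;
}

/* Tag width of a direct-mapped cache: address bits left over after
   the byte-offset and line-index fields are removed. */
unsigned tag_bits(uint32_t cache_bytes, uint32_t line_bytes, unsigned addr_bits)
{
    unsigned offset_bits = log2_pow2(line_bytes);               /* byte within line  */
    unsigned index_bits  = log2_pow2(cache_bytes / line_bytes); /* line within cache */
    return addr_bits - index_bits - offset_bits;
}

/* Line index and tag of an address for the same cache geometry. */
uint32_t cache_index(uint32_t addr, uint32_t cache_bytes, uint32_t line_bytes)
{
    return (addr / line_bytes) % (cache_bytes / line_bytes);
}

uint32_t cache_tag(uint32_t addr, uint32_t cache_bytes)
{
    return addr / cache_bytes;
}
```

For the slide's numbers, 32 bytes per line gives 5 offset bits and 1024 / 32 = 32 lines gives 5 index bits, so tag_bits(1024, 32, 32) returns 22 under the 32-bit address assumption.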

20

Cache Performance

Memory   I-Cache   D-Cache   Normalised Performance
SDRAM    No        No         40.2%
SDRAM    No        Yes        55.2%
SDRAM    Yes       No         64.3%
SDRAM    Yes       Yes        96.4%
OnChip   No        No        100.0%
OnChip   No        Yes        98.0%
OnChip   Yes       No        110.2%
OnChip   Yes       Yes       105.6%

Performance relative to on-chip RAM with no cache, running dhry.c modified for unbuffered I/O.

21

Tightly Coupled Memory

• Fast data buffers
• Fast sections of code
• Fast interrupt handler
• Critical loop
• Constant access time; guaranteed not to have arbitration delays
• Up to 4 tightly coupled memories
• Software guidelines
  – Software accesses tightly-coupled memory addresses just like any other addresses.
  – Cache operations have no effect when targeting tightly-coupled memory.

22

Pipelining

• Static branch prediction is implemented using the branch offset direction:
  – a negative offset is predicted as taken
  – a positive offset is predicted as not taken
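The rule above can be sketched in a few lines of C. This is a software model for illustration only: the function names are hypothetical, the 16-bit signed offset type is an assumption, and a zero offset (not covered by the slide) is treated here as not taken.

```c
#include <stdbool.h>
#include <stdint.h>

/* Static prediction from the sign of the branch offset: backward branches
   (negative offset, typically loop back-edges) are predicted taken,
   forward branches are predicted not taken. */
bool predict_taken(int16_t branch_offset)
{
    return branch_offset < 0;
}

/* Count mispredictions of one branch over a recorded sequence of outcomes. */
unsigned count_mispredictions(int16_t branch_offset, const bool *taken, unsigned n)
{
    unsigned misses = 0;
    for (unsigned i = 0; i < n; i++) {
        if (taken[i] != predict_taken(branch_offset))
            misses++;
    }
    return misses;
}
```

For a loop whose backward branch is taken on every iteration except the last, this scheme mispredicts only once, on loop exit.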

23

24

Presentation Outline

• Basic description of Altera Stratix devices
• NIOS II processor architecture
  – Review pipelining techniques
  – Review memory access techniques
• How to design a system using the NIOS II processor

25

26

Hardware Abstraction Layer (HAL) [4]

• Isolates the application software from hardware modifications.

• Applications are device-independent because the HAL abstracts devices such as:
  – Character-mode devices: UART core, JTAG UART core, LCD display controller
  – Flash memory devices
  – Timer devices
  – DMA controller core
  – Ethernet MAC/PHY controller
• The HAL application program interface (API) is integrated with the ANSI C standard library.
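Because the HAL API is integrated with the ANSI C standard library, code written against FILE streams does not need to know which character-mode device is behind the stream. The following is a minimal sketch using only standard C calls. The helper names send_message and send_to_device are hypothetical, and the device path passed in (for example a UART name) depends on the names assigned in the hardware system.

```c
#include <stdio.h>

/* Write a message plus newline to any open stream.
   Returns the number of characters written, or -1 on error. */
int send_message(FILE *dev, const char *msg)
{
    int n;

    if (dev == NULL || msg == NULL)
        return -1;
    n = fprintf(dev, "%s\n", msg);
    if (n < 0 || fflush(dev) != 0)
        return -1;
    return n;
}

/* Open a device by name and send one message.
   Returns 0 on success, -1 on failure. */
int send_to_device(const char *path, const char *msg)
{
    FILE *dev = fopen(path, "w");
    int ok;

    if (dev == NULL)
        return -1;
    ok = send_message(dev, msg) >= 0;
    if (fclose(dev) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

The same send_message call works unchanged whether the stream was opened on a UART, a JTAG UART, or an ordinary file on a host machine, which is the point of the abstraction.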

27

Layers of HAL API [4]

• HAL library generation:
  1. SOPC Builder generates a hardware system
  2. Nios II IDE generates a custom HAL system library to match the hardware configuration
• Changes in the hardware configuration automatically propagate to the HAL device driver configuration
• NIOS II is programmed in C

28

Programming NIOS II Processor [4]

• Programming UART
  – Standard input and standard output routines in C

---------------------------------------------------
#include <stdio.h>
#include <string.h>

int main (void)
{
    const char* msg = "hello world";
    FILE* fp;

    /* Open the UART as a character-mode device for writing. */
    fp = fopen ("/dev/uart1", "w");
    if (fp)
    {
        /* Send the message, then release the device. */
        fprintf(fp, "%s", msg);
        fclose (fp);
    }
    return 0;
}
---------------------------------------------------

29

References

1. Altera Corp., Stratix & Stratix II Module 3: Using TriMatrix Memories, 2004.

2. Altera Corp., Stratix Module 2: Logic Structure & MultiTrack Interconnect, 2004.

3. Altera Corp., Nios II Processor Reference Handbook, 2005.

4. Altera Corp., Nios II Software Developer's Handbook, 2005.