FIFOs with BRAMs (FIFO_3) EE 560

By Gandhi Puvvada

This lecture continues the EE201L lecture on synchronous 1-clock and 2-clock FIFOs (the FIFO_1 lecture).

You do not need the FIFO_2 lecture, which deals with width and depth expansion of FIFOs, to follow this one.

However, please do go through GRAY_1, which covers Gray code counter design and Gray-to-binary code conversion.



CONTENTS

1. Statement of the problem

2. Quick review of the FIFO_1 2-clock FIFO

3. BRAMs: info from Xilinx

4. Impact of I_Reg & O_Reg on when to increment WP & RP and when to convey them to the other side

5. Consumer design: when (and how often) you can activate REN

6. FWFT

7. Implementation details


In both the FIFO_1 and FIFO_2 lectures (and slide sets), we assumed for convenience that the FIFO storage is made up of an array of registers.

We all know that, for large FIFOs, we cannot afford an array of registers. The storage has to be a memory structure.

It has to be an SSRAM (or a BRAM in the case of an FPGA).

In this lecture, we discuss what we need to do to account for (a) the IReg (Input Register) in the case of a Flow-Through BRAM and (b) the IReg (Input Register) and the OReg (Output Register) in the case of a Pipelined BRAM.


Question #1 of EE560 Final Exam of Summer 2012 is related to the current discussion. So we will review the question and the solution as part of this lecture.

First let us review the 2-clock FIFO from the FIFO_1 lecture.

FULL and EMPTY:

In a single-clock FIFO we have two options. For a FIFO that is 2**n locations deep, we can use two n-bit counters (for WP and RP) plus a flip-flop that remembers whether the FIFO was most recently running near the AF (Almost Full) condition or the AE (Almost Empty) condition. The flip-flop resolves the ambiguity of WP = RP, which can represent either FULL or EMPTY. Alternatively, we can use two (n+1)-bit counters. Then WP = RP represents the EMPTY condition only, and (WP - RP) mod 2**(n+1) = 2**n represents the FULL condition.

In a two-clock FIFO, the (n+1)-bit counters option is the only option. Otherwise we can cause deadlock.
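The (n+1)-bit pointer scheme can be checked with a small behavioral model. This is only a sketch in Python, not the lecture's HDL; the names `N`, `occupancy`, `is_full`, `is_empty`, and `mem_address` are made up for illustration, and n = 4 is assumed to match the 16-location example.

```python
N = 4                      # address bits: the FIFO is 2**N = 16 locations deep
DEPTH = 1 << N
PTR_MOD = 1 << (N + 1)     # WP and RP are (N+1)-bit counters, wrapping at 2**(N+1)


def occupancy(wp, rp):
    """Number of stored items, computed modulo 2**(N+1)."""
    return (wp - rp) % PTR_MOD


def is_empty(wp, rp):
    return wp == rp


def is_full(wp, rp):
    return occupancy(wp, rp) == DEPTH


def mem_address(ptr):
    """Only the lower N bits of a pointer address the storage."""
    return ptr & (DEPTH - 1)


# Start empty with both pointers close to the wrap-around point, then write 16 items.
wp = rp = 30
for _ in range(DEPTH):
    wp = (wp + 1) % PTR_MOD

print(is_full(wp, rp), is_empty(wp, rp), mem_address(wp) == mem_address(rp))
```

When the FIFO is full, the two pointers address the same location but differ in the extra (n+1)th bit, which is what separates FULL from EMPTY.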


From Question #1 of EE560 Final Exam of Summer 2012


16-location Register Array

Writes complete at the end of the ____________ (current/next) clock.

Reads are _________________ (synchronous/asynchronous).


Compare the three: (1) a Register Array, (2) a Flow-Through BRAM, and (3) a Pipelined BRAM.


SOLUTION

Register Array: Glitches on WENQ are OK. Writes complete at the end of the current clock. Reads are asynchronous and continuous. Reads incur only a mux delay.

Flow-Through BRAM: Glitches on WENQ are OK. Writes complete at the end of the next clock. Reads are synchronous and incur one clock delay.

Pipelined BRAM: Glitches on WENQ are OK. Writes complete at the end of the next clock. Reads are synchronous and incur two clock delays.
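The read latencies in this comparison can be reproduced with a small behavioral model. The sketch below is in Python rather than HDL and models only the read side; the class `BramReadPort` and the helper `clocks_until_data` are hypothetical names, and the model assumes the IReg captures the read address and the OReg (when present) captures the read data one edge later.

```python
class BramReadPort:
    """Read side of a BRAM. has_oreg=False: flow-through. has_oreg=True: pipelined."""

    def __init__(self, mem, has_oreg):
        self.mem = mem
        self.has_oreg = has_oreg
        self.ireg_addr = None   # IReg: read address captured at the clock edge
        self.oreg = None        # OReg: read data captured one edge later

    def _array_out(self):
        # Output of the memory array for the address currently held in IReg.
        return None if self.ireg_addr is None else self.mem[self.ireg_addr]

    def clock(self, addr):
        """One rising clock edge with `addr` on the address inputs."""
        if self.has_oreg:
            self.oreg = self._array_out()   # uses the OLD IReg contents
        self.ireg_addr = addr

    @property
    def dout(self):
        return self.oreg if self.has_oreg else self._array_out()


def clocks_until_data(port, addr, expected):
    """Count clock edges until `expected` shows up on dout."""
    clocks = 0
    while port.dout != expected:
        port.clock(addr)
        clocks += 1
    return clocks


mem = list(range(100, 116))          # 16 locations holding 100..115
register_array_latency = 0           # mem[addr] is visible within the same clock
flow_through_latency = clocks_until_data(BramReadPort(mem, False), 5, 105)
pipelined_latency = clocks_until_data(BramReadPort(mem, True), 5, 105)
print(register_array_latency, flow_through_latency, pipelined_latency)
```

The register array needs no clock edge at all for a read, which is why it is listed as zero here, while each added register on the BRAM read path costs one more edge.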


Spartan-6 FPGA Block RAM Resources User Guide: http://www.xilinx.com/support/documentation/user_guides/ug383.pdf

For simplicity (and to be portable to other FPGAs and ASICs), in our designs we assume that no such latch is available. So we use the default Write First mode of the Xilinx BRAM which keeps the latch transparent.


For simplicity and to make our design portable to other FPGAs and ASICs, we ignore the availability of this output register enable control. Moreover, we do not instantiate the BRAM. We code in such a way that a BRAM is inferred with no such control.


Flow-Through BRAM or Pipelined BRAM

Writes complete at the end of the next clock.

Since the actual writing happens inside the memory, driven by internally generated timing pulses during the next (subsequent) clock, it is proper to wait until the very end of that next clock.

So is it OK to increment the WP immediately and convey the incremented WP to the consumer immediately?


Mr. Bruin: I would increment WP at the end of that subsequent clock.

Miss Bruin: I do not think we need to delay incrementing the WP, as two clocks are lost anyway in synchronizing the WP to form WPss. So the signalling is never premature.

Mr. Trojan: Both of you are wrong.


Our design should allow the producer to write data on consecutive clocks whenever it has data ready and there is space in the FIFO. So WP should be updated on every clock (not once in two clocks). So Mr. Bruin is wrong.

The RCLK may be much faster than the WCLK. Say RCLK is running at 100 MHz and WCLK is running at 1 MHz. Then two clocks of RCLK take only 20 ns, which is a very small fraction of one WCLK period of 1 us (1000 ns). So Miss Bruin is wrong to think that there is no premature signalling.


Mr. Trojan's SOLUTION: Delay the conveyance of the WP by 1 clock of the WCLK by using a register as a delay register clocked by the WCLK.

Show the design below. Would you place the delay FF at A or B or C?

[Figure: write-side design with candidate delay-FF locations A, B, and C]
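The timing argument behind delaying the WP can be put into numbers. The following is a rough Python sketch, not part of the original exam solution: the function names are hypothetical, the synchronizer is assumed to cost two full RCLK periods (the figure used in the discussion above; the real latency depends on the phase between the two clocks), and the write is assumed to be registered at time 0.

```python
def write_done_ns(write_edge_ns, wclk_ns):
    """BRAM writes complete at the end of the next WCLK."""
    return write_edge_ns + wclk_ns


def wp_seen_by_consumer_ns(write_edge_ns, wclk_ns, rclk_ns, delay_wclks, sync_stages=2):
    """Time at which the consumer can first see the incremented WP.

    WP increments at the WCLK edge that registers the write. The optional
    delay register holds it back by delay_wclks WCLK periods, and the
    synchronizer is assumed to add sync_stages RCLK periods.
    """
    return write_edge_ns + delay_wclks * wclk_ns + sync_stages * rclk_ns


WCLK_NS = 1000.0   # 1 MHz write clock
RCLK_NS = 10.0     # 100 MHz read clock

for delay in (0, 1):
    seen = wp_seen_by_consumer_ns(0.0, WCLK_NS, RCLK_NS, delay)
    done = write_done_ns(0.0, WCLK_NS)
    print(f"delay={delay}: WP visible at {seen:.0f} ns, "
          f"write done at {done:.0f} ns, premature={seen < done}")
```

With no delay register the consumer can see the new WP long before the write has completed. With one WCLK of delay the WP becomes visible only after the write is done.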


OK, any changes in the consumer side?

Compared to the register array, reads incur a one-clock delay in the case of a Flow-Through BRAM and a two-clock delay in the case of a Pipelined BRAM.

Question regarding the consumer design:

So should we have a simplistic consumer design where, once we ascertain that the FIFO is not empty and that our downstream parts are ready for consumption, we initiate a read-enable, wait for 1 or 2 clocks, and then receive the data and consume it? Shall we design a state machine to do this?


Question regarding RP pointer incrementation and conveyance of the pointer to WCLK domain

Shall we increment the read pointer RP immediately, at the end of the clock in which we activate the REN (Read Enable), or when we actually receive the data one or two clocks later?

Shall we convey the incremented RP to the producer in the write-clock domain immediately, or one or two clocks late, since we have not yet completed the initiated consumption?

Well, perhaps that depends upon whether you increment RP immediately or after one or two clocks, doesn't it?


SOLUTION to the question regarding the consumer design:

A simplistic consumer design using a state machine would be very inefficient as you would take 2 or 3 clocks to consume one data item. In the lab you are provided with an inefficient consumer design to illustrate this.

A good consumer design should allow consumption on every clock. There is no question about it, as otherwise we defeat the very purpose of the FIFO. So there shall be no state machine, or there shall be a Mealy-type state machine where at times you stay in one state consuming one item per clock on consecutive clocks.


Question: How is it possible to consume on every clock if it takes more than one clock to read an item?

Answer: Consider a small shop selling DVD players.

Let us say the DVD players are imported from Japan and sold here in Los Angeles. It takes a week for our orders to be delivered, so we always place orders in advance and stock the DVD players. So how do we manage to place orders in advance in such a way that we will not have difficulty storing them when they are delivered?

We place an order based on the space available in our storage minus the orders in the pipe. If there is room to store 10 more DVD players, and only three DVD players are in the pipe, we can place an order for 7 DVD players.


Going by the DVD player shop analogy, we need a small store beside the BRAM-based FIFO to hold the items whose reads have been initiated. Say a small register-based FIFO of 4 locations is maintained in the consumer itself.

The consumer can initiate the reading of only one data item per clock by activating REN (Read Enable). So, if the BRAM FIFO (the big FIFO) is not empty, and if the register-based FIFO (the small FIFO) has extra space after accounting for the orders in the pipe, then we can activate the REN.
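The REN condition just described can be written as a one-line predicate. This is a minimal Python sketch with hypothetical names (`can_activate_ren`, `SMALL_DEPTH`); whether the slot being freed by the downstream consumer in the same clock also counts as free space is a design choice the text does not settle, so it is left out here.

```python
def can_activate_ren(big_fifo_empty, small_fifo_free, reads_in_pipe):
    """REN may be activated only if the big FIFO has data and the small FIFO
    still has room once every read already in the pipe has been delivered."""
    return (not big_fifo_empty) and (small_fifo_free - reads_in_pipe) > 0


SMALL_DEPTH = 4   # the 4-location register-based FIFO from the text

# Small FIFO holds 1 item and 2 reads are in the pipe: 3 free - 2 = 1 slot left.
print(can_activate_ren(False, SMALL_DEPTH - 1, 2))
# Small FIFO holds 2 items and 2 reads are in the pipe: no slot left.
print(can_activate_ren(False, SMALL_DEPTH - 2, 2))
# Plenty of room in the small FIFO, but the big FIFO is empty.
print(can_activate_ren(True, SMALL_DEPTH, 0))
```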


OK, how do we keep track of the orders in the pipe?

Corresponding to each of the delay-causing registers (IReg alone, or IReg and OReg), keep a VALID flip-flop. Let us call them Valid_1 (for IReg) and Valid_2 (for OReg, if present). Send a one into Valid_1 whenever REN is activated.

At each clock edge, you need to do the following:

-- for a Flow-Through FIFO with IReg only
Valid_1 <= REN_Internal;

-- for a pipelined FIFO with IReg and OReg
Valid_1 <= REN_Internal;
Valid_2 <= Valid_1;
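The behavior of these VALID flip-flops can be traced with a small Python model. It is only a sketch: the class `ValidPipe` and its properties `data_valid` and `reads_in_pipe` are hypothetical names, and the list shift stands in for the non-blocking assignments shown above.

```python
class ValidPipe:
    """VALID flip-flops shadowing IReg (Valid_1) and, if present, OReg (Valid_2)."""

    def __init__(self, has_oreg):
        self.valid = [0, 0] if has_oreg else [0]   # [Valid_1] or [Valid_1, Valid_2]

    def clock(self, ren_internal):
        # Same effect as:  Valid_2 <= Valid_1;  Valid_1 <= REN_Internal;
        self.valid = [int(ren_internal)] + self.valid[:-1]

    @property
    def data_valid(self):
        """1 when the BRAM output carries an item requested earlier."""
        return self.valid[-1]

    @property
    def reads_in_pipe(self):
        """Reads initiated but not yet delivered to the small FIFO."""
        return sum(self.valid)


ren_sequence = [1, 1, 0, 1, 0, 0]

flow_through = ValidPipe(has_oreg=False)
pipelined = ValidPipe(has_oreg=True)
ft_trace, pl_trace = [], []
for ren in ren_sequence:
    flow_through.clock(ren)
    pipelined.clock(ren)
    ft_trace.append(flow_through.data_valid)
    pl_trace.append(pipelined.data_valid)

print(ft_trace)   # REN delayed by one clock edge
print(pl_trace)   # REN delayed by two clock edges
```

The last VALID bit tells the consumer when to load the BRAM output into the small FIFO, and the sum of the VALID bits is the number of orders in the pipe.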


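[Figure: flow-through BRAM with RENQ feeding VALID_1 alongside IReg; pipelined BRAM with RENQ feeding VALID_1 and then VALID_2 alongside IReg and OReg]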


Whatever you do in the consumer design, make sure that the order of consumption is the original order of production. That is the essence of the First In First Out (FIFO)!

FWFT = First Word Fall Through

[Figure: RENQ feeding VALID_1 and VALID_2 alongside IReg and OReg]


Can we do FWFT (First Word Fall Through) only for the small FIFO or both the big and small FIFOs?

How about the single-clock FIFO and the two-clock FIFO?

What is meant by "Snake Path" (a term coined by our


Question regarding RP pointer incrementation and conveyance of the pointer to WCLK domain

Answer: First, we need to increment the RP immediately, since we hope to make the next read request on the very next clock.

Then we should refrain from conveying this newly incremented RP to the WCLK domain immediately, as we have not yet removed the data item from the memory array. If we do not delay conveying the RP, and the FIFO is running full and the WCLK is much faster, then we run the risk of overwriting the oldest item (the item being read) before it is read.

So, conclusion: delay the RP by one clock of RCLK before conveying it to the WCLK domain.
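The two pointer copies in this answer can be traced with a short Python model. This is a sketch under stated assumptions, not the exam solution: `ReadPointer`, `rp_delayed`, and `PTR_BITS` are hypothetical names, a 16-deep FIFO with 5-bit pointers is assumed, and synchronization into the WCLK domain is not modeled.

```python
PTR_BITS = 5                        # (n+1)-bit pointer for a 16-deep FIFO (n = 4)
PTR_MASK = (1 << PTR_BITS) - 1
ADDR_MASK = PTR_MASK >> 1           # lower n bits address the BRAM


class ReadPointer:
    def __init__(self):
        self.rp = 0           # addresses the BRAM for the next read
        self.rp_delayed = 0   # the copy handed on towards the WCLK domain

    def clock(self, ren):
        """One RCLK edge. Returns the address of the read initiated, or None."""
        issued = (self.rp & ADDR_MASK) if ren else None
        self.rp_delayed = self.rp            # loaded from the OLD rp: lags by one RCLK
        if ren:
            self.rp = (self.rp + 1) & PTR_MASK
        return issued


p = ReadPointer()
trace = []
for ren in (1, 1, 1, 0):
    addr = p.clock(ren)
    trace.append((addr, p.rp, p.rp_delayed))
print(trace)   # (address read, RP, delayed RP) after each RCLK edge
```

RP moves on every clock in which REN is active, so reads can be issued back to back, while the copy that the producer will eventually see trails it by one RCLK.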

[Figure: flow-through BRAM (RENQ, VALID_1, IReg) beside pipelined BRAM (RENQ, VALID_1, VALID_2, IReg, OReg)]

Question: For the pipelined BRAM, is the delay in conveying the RP the same (one clock), or is it a two-clock delay?

SOLUTION

The same: a 1-clock delay.

Treat the OReg as part of the consumer. Its contents cannot possibly be overwritten by the producer.

Show your design. Would you place the delay FF at A or B or C?

[Figure: read-side design with candidate delay-FF locations A, B, and C, followed by the SOLUTION figure]


Question #1 of EE560 Final Exam of Summer 2012

Partly discussed in the GRAY_1 lecture


From the GRAY_1 lecture

Compare the following two designs with the one on the previous page, and state whether each is right or wrong. If a design is right, state whether it is faster or slower, and cheaper or more expensive.

Right / Wrong    Faster / Slower    Cheaper / Expensive

Right / Wrong    Faster / Slower    Cheaper / Expensive

The Gray code counter lags by 1 clock. After reset:
BIN:  0 1 2 3 4 5 6
GRAY: 0 0 1 2 3 4 5


Here, for our BRAM-based FIFO, we need a binary counter as well as a Gray Code counter.

We need to delay the Gray code by 1 clock, either explicitly or indirectly.

So the following does both the generation and the delaying of the Gray code. Shown below is WP_Gray_delayed. RP_Gray_delayed can be produced in a similar fashion.
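One way to obtain a Gray code that lags the binary pointer by one clock is to register the binary-to-Gray conversion of the current binary count. The Python sketch below models that idea; it is an assumption about the structure and may not match the circuit in the original figure, and the names `bin_to_gray` and `WritePointer` are made up for illustration.

```python
def bin_to_gray(b):
    """Standard binary-to-Gray conversion."""
    return b ^ (b >> 1)


class WritePointer:
    """Binary WP plus a registered Gray copy that lags it by one WCLK."""

    def __init__(self, ptr_bits):
        self.mask = (1 << ptr_bits) - 1
        self.wp_bin = 0
        self.wp_gray_delayed = 0

    def clock(self, wen):
        # Both registers update on the same edge, so the Gray register is
        # loaded from the OLD binary value and therefore lags by one clock.
        self.wp_gray_delayed = bin_to_gray(self.wp_bin)
        if wen:
            self.wp_bin = (self.wp_bin + 1) & self.mask


wp = WritePointer(ptr_bits=5)
bin_trace, gray_trace = [wp.wp_bin], [wp.wp_gray_delayed]
for _ in range(6):
    wp.clock(wen=True)
    bin_trace.append(wp.wp_bin)
    gray_trace.append(wp.wp_gray_delayed)

print(bin_trace)    # binary pointer after reset and after each clock
print(gray_trace)   # Gray codes of 0, 0, 1, 2, 3, 4, 5
```

The Gray trace holds the Gray codes of 0, 0, 1, 2, 3, 4, 5, which is the one-clock lag described in the GRAY_1 material: the single register both generates the Gray code and supplies the required delay.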


Q#1.5.2.2 (and solution) of EE560 Final Su 2012


Q#1.5.2.3 (and solution) of EE560 Final Su 2012


Q#1.5.2.4 of EE560 Final Su 2012


Q#1.5.2.4 solution of EE560 Final Su 2012
