Memory Interfaces & Controllers - Sandeep Kulkarni, Lattice
Memories: interfaces & controllers
Sandeep Kulkarni, Area Technical Manager
Memory Types
Digital Memory
• Volatile
– SRAM: Does not require refresh, access is easier. Special types based on access methods. Used for faster access and low power.
– DRAM: Dynamic RAM, requires periodic refreshing. Uses a transistor and a capacitor to store charge. Is compact and denser.
• Non-Volatile
– EEPROM: Byte erasable, limited write cycles, faster read, serial/parallel.
– FLASH: NOR & NAND type, block erase, lower cost, denser, serial/parallel.
SRAM sub-types & applications
• Async: up to 32Mb, fast (8ns)
• QDRII: up to 333MHz, concurrent R/W, burst support, DDR data
• FIFO: Sync/Async, 250MHz
• DPRAM/MPM: random access, up to 200MHz
• CAM: associative, returns address based on data search
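The difference between an address-based RAM read and the associative lookup of a CAM can be sketched in a few lines of Python. This is a behavioural illustration only: the class name `SimpleCam` and its methods are hypothetical and do not model any particular device, which would compare all entries in parallel rather than in a loop.

```python
class SimpleCam:
    """Behavioural model of a content-addressable memory (hypothetical)."""

    def __init__(self, depth):
        self.entries = [None] * depth  # None marks an empty location

    def write(self, address, data):
        """Store data at an address, exactly like an ordinary RAM write."""
        self.entries[address] = data

    def search(self, data):
        """Return the lowest address holding `data`, or None on a miss."""
        for address, stored in enumerate(self.entries):
            if stored is not None and stored == data:
                return address
        return None


cam = SimpleCam(depth=8)
cam.write(3, 0xBEEF)
cam.write(5, 0xCAFE)
hit_address = cam.search(0xCAFE)   # data goes in, an address comes out
miss_address = cam.search(0x1234)  # value was never stored
```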
SDRAM memory subtypes
• SDR: up to 133MHz, LVCMOS, used in embedded systems
• DDR: up to 200MHz, SSTL18, source synchronous
• DDR2: up to 400MHz, SSTL18, differential strobe
• DDR3: up to 800MHz, SSTL15, fly-by architecture
• RLDRAM/2: reduced latency, 533MHz, high bandwidth, high density, SRAM-like random access
• LPDDR/2: low power, up to 400MHz
FPGA On Chip RAM
• FPGA has primarily 2 types of on-chip RAM
– Block RAM
» SRAM memory block of size 9K/18K/36K
» Supports multiple modes of operation: ROM/RAM/DPRAM/FIFO etc.
» Parameterisable aspect ratios, cascadable
» Fast, up to 600MHz
– Distributed RAM
» LUT configured as memory: 4-input LUT = 16x1
» Localized, very fast & efficient
» Supports multiple modes of operation: ROM/RAM/DPRAM/FIFO
» Cascadable, used for shallow/small memory requirements
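The "4-input LUT = 16x1" relationship can be illustrated with a short Python sketch: the four LUT inputs act as an address into the 16 configuration bits. The helper names `lut_read` and `luts_needed` are hypothetical, and the LUT count covers storage only, ignoring the extra logic needed for address decoding and output multiplexing when LUTs are cascaded.

```python
import math

LUT_INPUTS = 4
LUT_DEPTH = 2 ** LUT_INPUTS  # 16 one-bit locations per LUT


def lut_read(init_bits, address):
    """Read one bit from a 16x1 LUT memory; init_bits is a 16-bit integer."""
    if not 0 <= address < LUT_DEPTH:
        raise ValueError("address out of range for a 4-input LUT")
    return (init_bits >> address) & 1


def luts_needed(depth, width):
    """Storage LUTs for a depth x width memory (decode/mux logic not counted)."""
    return math.ceil(depth / LUT_DEPTH) * width


bit = lut_read(0b0000_0000_0010_0000, address=5)  # only location 5 holds a 1
count = luts_needed(depth=64, width=8)            # a shallow 64x8 memory
```

The count grows quickly with depth, which is why distributed RAM suits shallow memories and block RAM suits larger ones.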
On chip flash - FlashBAK Technology
[Diagram: Flash, FPGA EBR, FPGA logic, JTAG / SPI port]
• Make infinite reads & writes to EBR at speeds of up to 350MHz
• Write to Flash during programming
• Write from Flash to EBRs during configuration
• Write from EBRs to Flash on user command
• Use FlashBAK to store:
– Error codes, POST results, serial numbers and uP code
• Erase and reprogram Flash in <3 seconds
• sysMEM EBR 166 to 885Kbits
• Unlimited random read and write capability through EBR
• Other types are SerialTag, UFM etc.
Memory in Typical Networking Application
Memory Organization – DDR2
Source: Micron
Read Cycle – DDR2
DDR2 Access
[Diagram panels: Read from memory, Write to memory]
• Source-synchronous data (DQ) from memory is edge aligned w.r.t. the strobe (DQS).
• Data writes to memory have to be centre aligned.
• Tight timing budget. The data valid window at 266MHz is ~1ns. Precise timing control is crucial.
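The ~1ns figure can be put in context with a little arithmetic. At 266MHz the clock period is about 3.76ns, and because data moves on both edges each bit lasts half of that, about 1.88ns. The sketch below then subtracts a total skew and jitter allowance; the 0.88ns allowance is an assumed number chosen only to show how a ~1ns window could arise, not a value taken from the slides or a datasheet.

```python
def bit_time_ns(clock_mhz):
    """Duration of one DDR data bit: half of the clock period, in ns."""
    period_ns = 1000.0 / clock_mhz
    return period_ns / 2.0


def valid_window_ns(clock_mhz, uncertainty_ns):
    """Data valid window left after subtracting total skew and jitter."""
    return bit_time_ns(clock_mhz) - uncertainty_ns


ui = bit_time_ns(266.0)                # one bit time at 266MHz
window = valid_window_ns(266.0, 0.88)  # 0.88ns allowance is an assumption
```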
DDR2 IO implementation
• To capture read data properly, data strobe alignment has to be performed in the FPGA I/Os. The alignment should be compensated for PVT and work over a wide range of frequencies. Multiple techniques exist to accomplish this.
DQSDLL+DQSBUF Method
• Dedicated circuitry in the IOB takes care of the data strobe alignment
[Diagram: READ path with DQSI and SCLK signals]
• DQSDLL provides a digital delay code for a PVT-compensated 90 degree shift
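A 90 degree shift is a delay of one quarter of the clock period. The following Python sketch shows why a delay code has to track PVT: if the delay of each tap in a delay line changes, the number of taps needed for the same quarter period changes with it. The tap delays used here (50ps and 25ps) and the function names are assumptions for illustration and do not describe the actual DQSDLL circuit.

```python
def quarter_period_ps(clock_mhz):
    """Delay that corresponds to a 90 degree phase shift, in picoseconds."""
    period_ps = 1_000_000.0 / clock_mhz
    return period_ps / 4.0


def delay_code(clock_mhz, tap_delay_ps):
    """Number of delay taps that best approximates a 90 degree shift."""
    return round(quarter_period_ps(clock_mhz) / tap_delay_ps)


shift = quarter_period_ps(200.0)                         # 200MHz clock
code_slow_corner = delay_code(200.0, tap_delay_ps=50.0)  # slower taps, fewer needed
code_fast_corner = delay_code(200.0, tap_delay_ps=25.0)  # faster taps, more needed
```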
DDR Registers in IOB
• The IOB contains DDR registers to perform
– DDR to SDR
– Half clock transfer
– Synchronization & clock transfer
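The DDR to SDR step can be pictured as pairing the sample captured on the rising edge with the one captured on the following falling edge, so that the fabric sees a word twice as wide once per clock. The sketch below is a purely behavioural Python model with a hypothetical function name; it says nothing about how the IOB registers are actually clocked.

```python
def ddr_to_sdr(ddr_samples):
    """Pair consecutive DDR samples into (rising, falling) tuples, one per clock."""
    if len(ddr_samples) % 2 != 0:
        raise ValueError("expected an even number of DDR samples")
    rising = ddr_samples[0::2]   # samples captured on rising strobe edges
    falling = ddr_samples[1::2]  # samples captured on falling strobe edges
    return list(zip(rising, falling))


burst = [0xA0, 0xA1, 0xA2, 0xA3]  # a burst of four as seen on the DQ pins
sdr_words = ddr_to_sdr(burst)     # two double-width words for the fabric
```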
IOB DDR Data Transfer timing diagram
Abstraction
• Memory Controllers offer abstraction and ease of use to designer
• Can be parameterized to support many types of memories, data widths, speeds etc.
• Takes care of initializing the memory
• Tracks the Read/Write and controls Refresh
• Takes care of the memory timing requirements
• Offers a complete data/command/address interface to the user for integration in the design
• Command queuing and command burst improve bus utilization and throughput
• Intelligent bank management to optimize performance
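One way to picture bank management is a table of the row currently open in each bank. An access to the open row can proceed at once, while an access to a different row first needs the old row closed. The Python sketch below is a minimal model under that assumption; the class `BankTracker` and the labels "hit", "closed" and "conflict" are hypothetical and are not taken from any particular controller IP.

```python
class BankTracker:
    """Tracks the open row per bank and classifies each access (hypothetical)."""

    def __init__(self, num_banks):
        self.open_row = [None] * num_banks  # None means the bank is precharged

    def access(self, bank, row):
        """Classify an access, then leave the requested row open."""
        current = self.open_row[bank]
        if current == row:
            return "hit"        # row already open, column access only
        self.open_row[bank] = row
        if current is None:
            return "closed"     # bank idle, needs an ACTIVATE first
        return "conflict"       # wrong row open, needs PRECHARGE then ACTIVATE


tracker = BankTracker(num_banks=4)
requests = [(0, 10), (0, 10), (0, 11), (1, 10)]  # (bank, row) pairs
results = [tracker.access(bank, row) for bank, row in requests]
```

A controller that reorders or groups requests to raise the share of hits spends fewer cycles on PRECHARGE and ACTIVATE commands.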
Typical DDR Memory Controller Block Diagram
Memory Controller User Interface
• Local interface signal groups simplify operation
– Initialization & Auto Refresh
– Command & Addr
– Data Write
– Data Read
• Example command interface
USER Commands & Data R/W
Data Write on User Interface
USER Commands
READ Data on User Interface
DDR Memory controller implementation
1. Core generation (using IPexpress)
2. Simulation (eval scripts)
3. Implementation (synthesis & PAR)
4. Result evaluation (utilization, static timing)
5. Pinout validation (PCB layout)
6. Backend design
Comparison of DDR Memory Standards
DDR3 Advantages
• Lower Power
– 1.5V
• Higher Speed
– 400MHz ~ 800MHz
• Master Reset
– Initialization
• More Performance
– 2x DDR2
• Larger Densities
– 8Gb/32GB
DDR3 Power Advantage
• Supply voltage reduced from 1.8V to 1.5V
– More than 15% power saving
• Slower core speed
– DDR2-800: DDR2 (400MHz) / Core (200MHz)
– DDR3-800: DDR3 (400MHz) / Core (100MHz)
• Lower I/O buffer power
– 34 ohm driver vs. 18 ohm driver (DDR2)
• ~25 to 30% lower power than the same performance DDR2
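The core speeds and the voltage saving quoted above can be checked with simple arithmetic. The core clock is the data rate divided by the prefetch depth (4n for DDR2, 8n for DDR3). For the supply voltage, the sketch shows both a linear and a quadratic scaling of power with voltage; treating power as proportional to voltage or to voltage squared is a simplifying assumption, and the function name is hypothetical.

```python
def core_clock_mhz(data_rate_mtps, prefetch):
    """Internal array clock: data rate (MT/s) divided by prefetch depth."""
    return data_rate_mtps / prefetch


ddr2_core = core_clock_mhz(800, prefetch=4)  # DDR2-800 uses a 4n prefetch
ddr3_core = core_clock_mhz(800, prefetch=8)  # DDR3-800 uses an 8n prefetch

voltage_ratio = 1.5 / 1.8
linear_saving = 1.0 - voltage_ratio          # if power scaled with V
quadratic_saving = 1.0 - voltage_ratio ** 2  # if power scaled with V squared
```

The linear estimate of roughly 17% is consistent with the "more than 15%" bullet, and the quadratic estimate of roughly 31% sits close to the upper end of the "~25 to 30%" figure.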
DDR3 8n-Prefetch Architecture
DDR3 High Speed Signaling
• Fly-by routing
• Write and Read Leveling
• ZQ Calibration through ZQ resistor
• Dynamic ODT for improved WRITE signaling
APPENDIX
Market Trends-Technology transition
Source: iSuppli
Market trends-Price per bit
Source: Microsoft
Key Memory Timing parameters
• CAS Latency: CL
– The time between sending a column address to the memory and the beginning of the data in response. This is the time it takes to read the first bit of memory from a DRAM with the correct row already open.
• ACTIVATE-to-READ or WRITE delay: tRCD
– The number of clock cycles required between opening a row of memory and accessing columns within it. The time to read the first bit of memory from a DRAM without an active row is tRCD + CL.
• PRECHARGE period: tRP
– The number of clock cycles required between issuing the precharge command and opening the next row. The time to read the first bit of memory from a DRAM with the wrong row open is tRP + tRCD + CL.
• ACTIVATE-to-PRECHARGE delay: tRAS
– The number of clock cycles required between a bank active command and issuing the precharge command. This is the time needed to internally refresh the row, and overlaps with tRCD. Typically approximately equal to the sum of the previous three numbers.
• Others: tRC, tRRD, tRFC, tRTP, tWTR etc.
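The three first-bit latencies described above (CL, tRCD + CL, and tRP + tRCD + CL) can be worked through with example numbers. The timing values of 5-5-5 cycles and the 400MHz clock below are assumptions chosen for illustration, not figures from the slides; real values come from the memory datasheet. The function name and state labels are likewise hypothetical.

```python
def first_data_latency(state, cl, trcd, trp):
    """Cycles until the first read data, by row state at the time of the request."""
    if state == "hit":       # correct row already open
        return cl
    if state == "closed":    # no active row in the bank
        return trcd + cl
    if state == "conflict":  # wrong row open
        return trp + trcd + cl
    raise ValueError(f"unknown row state: {state}")


CL, TRCD, TRP = 5, 5, 5  # assumed example timings, in clock cycles
CLOCK_MHZ = 400.0        # assumed memory clock

cycles = {
    state: first_data_latency(state, CL, TRCD, TRP)
    for state in ("hit", "closed", "conflict")
}
conflict_ns = cycles["conflict"] * 1000.0 / CLOCK_MHZ  # cycles to nanoseconds
```

With these numbers an access that lands on the wrong open row takes three times as long to return its first bit as one that hits the open row.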