SoC for Communication & Multi-media Systems
Prof. Seongjoo Lee ([email protected]@sejong.ac.kr)
Sejong University
Contents
• Introduction to IT technologies
• Overview of SoC (System VLSI) design
• Development of SoC for communication systems
• Development of SoC for multi-media systems
• Concluding remarks
World of Information Technology (1)
World of Information Technology (2)
[Figure labels: Wireless Monitoring; Continuous Glucose Monitoring]
Information Technology Products
Computer: PC, Work Station, Main Computer, S/W, VR
Mobile Communication: PDA, PSTN, Cellular / PCS, WLAN, WPAN
Multimedia: VOD, DMB, DTV, HDTV, CD, DVD, GAME
Industrial / Medical: Graphic Medical equip., Remote Sensing, Automotive electronic, Medical electronic
Converged System
• Personal
• Handheld/Portable
• Intelligent
• Convergence
Major Research Fields in IT
[Figure: research fields surrounding System-on-Chip (SoC) & System]
• RF Tech
• Communication Protocols & Algorithms
• Control & Industrial Systems
• Multimedia DSP Algorithms
• Computer & Embedded Systems
• H/W, S/W: Semi-Device, Electronic Circuits, Data Structure, Embedded S/W
Overview of SoC (System VLSI) Design
Trends in the World Electronics and Semiconductor Markets
[Figure labels: electronic products; semiconductors; semiconductor equipment/materials; annual growth of 3%, 8%, 10%, and 9%]
• Since the invention of the transistor in 1948, the semiconductor industry has grown rapidly, at 13% per year over the past 50 years.
• The share of semiconductors in the IT industry keeps growing: 21% (2005) → 25% (2010).
World Semiconductor Market
(2006 world semiconductor market: 2626; unit: 100 million USD)

Type          Major items      World sales    Korea sales   Korea market share
System IC     CPU, MCU         1833 (70%)     48            2.6%
Memory IC     DRAM, SRAM       608 (23%)      246           40% (world No. 1)
Discrete IC   TTL, Power IC    184 (7%)
Total                          2626           294

* Source: KSIA, Gartner 2007
• System ICs account for 70% of world semiconductor sales, but Korea holds only 2.6% of that market.
• Memory ICs account for 23% of world semiconductor sales, and Korea holds 40% of that market (No. 1 in the world).
System Semiconductors: Selected as One of the Top 10 Future Technologies
System Implementation Approaches
1) All S/W on a general-purpose, very high-speed microprocessor
2) H/W engine + S/W (partly) on a CPU (H/W & S/W partitioning)
Approach 1: very high-speed microprocessor running S/W
   Adv: easy to implement & low cost
   Disadv: very hard to achieve real-time processing / high power
Approach 2: ASIC with embedded CPU running S/W (partly) + H/W engine (block); EX) AP
   Adv: real-time processing & reasonable modification
   Disadv: high cost for implementation/change
SoC Concept
• SoC (System-on-Chip): a complex IC that integrates the major functions of a complete end-product into a single chip (set)
[Figure: SoC building blocks: Modem/Multimedia Core, CPU/DSP Core, CIS/MEMS IP, Analog/RF IP, Digital IP, Embedded Memory, Embedded S/W]
* Cf) ARM µP / AMBA bus
SoC Components
System on Chip (or MCM)
• System Engineering
   Spec. & Design Methodology
   System Algorithm → Standardization
   H/W & S/W Partitioning
• H/W
   Digital: Data path & Control Block opt.; CPU (& DSP) Core; Memory & Bus, etc.
   Analog/RF: Analog Block (ADC, Amp, filter); RF CMOS block (LNA, mixer, PA..)
   IO: MEMS (Sensor, Actuator…); Optoelectronic Components
• S/W
   System S/W
   Embedded Application S/W
→ Signal Integrity & Interference → packaging
SoC Design Process
[Figure: design flow]
SoC spec. (with Verification & Test plan)
→ System algorithm (architecture) Design & Simulation
→ H/W & S/W partitioning (Co-Simulation)
→ H/W Design: Digital (HDL), Analog/RF, with Pre-verified IP and DFT module → Netlist
→ Embedded S/W Design (C) → Embedded S/W
→ (Emulation)
→ SoC Integration & Verification in real environment
Major Steps for Algorithm Design
1. Decision of performance target
   Decide a performance target based on the user requirements and functional spec.
2. Evaluation of existing algorithms
   Evaluate the performance of some existing algorithms against the performance target using a behavioral-level modeling tool (C/C++, SPW, Matlab, …)
3. Problem analysis of performance degradation
   Analyze the cause of the performance degradation (if any)
4. Development of new algorithm
   Develop a new algorithm that solves the degradation problem and confirm that it meets the performance target
Example of Floating-Point Simulation
• 802.11a WLAN System (SystemView)
[Figure: Transmitter → Channel Model → Receiver]
Procedure for optimal H/W design (1)
1. Define a detailed design spec. with performance target
   Define a detailed design spec. of the overall architecture and each functional block
   Define performance metrics such as required SNR (signal-to-noise ratio), BER/PER, max. processing delay, max. frequency, etc.
   Ex) IEEE 802.11a (I)FFT block
   − SQNR: min. 40 dB, clock freq.: 20 MHz, processing delay: max. 3.2 µs
2. Decide S/W & H/W partitioning and the number of bits for H/W to meet the required performance target
   S/W vs H/W, floating-point vs fixed-point, # of bits for fixed-point
   Ex) IEEE 802.11a (I)FFT block
   − Choose the fixed-point method based on trade-off analysis between performance and area/power consumption of an FPU (vs. FxPU)
   − Decide an optimal number of bits for fixed-point: 13 bits (SQNR > 40 dB)
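The word-length decision in step 2 can be sketched as a small search: quantize a reference signal with an increasing number of bits and stop at the first width that meets the SQNR target. This is only a minimal sketch under stated assumptions. The helper names (`quantize`, `sqnr_db`, `min_frac_bits`) and the sine test signal are hypothetical, and the sketch quantizes input samples only. A real design would run the whole fixed-point (I)FFT datapath, so the width found here is not expected to match the 13 bits quoted for the FFT block.

```python
import math

def quantize(x, frac_bits):
    """Round x to the nearest multiple of 2**-frac_bits (fixed-point grid)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale

def sqnr_db(reference, quantized):
    """SQNR in dB: 10*log10(signal power / quantization-error power)."""
    signal = sum(r * r for r in reference)
    noise = sum((r - q) ** 2 for r, q in zip(reference, quantized))
    return float("inf") if noise == 0 else 10.0 * math.log10(signal / noise)

def min_frac_bits(reference, target_db, max_bits=24):
    """Smallest number of fractional bits whose SQNR meets target_db."""
    for bits in range(1, max_bits + 1):
        if sqnr_db(reference, [quantize(r, bits) for r in reference]) >= target_db:
            return bits
    return None

# Hypothetical test signal: 64 samples of a sine wave inside (-1, 1).
samples = [0.9 * math.sin(2 * math.pi * 3 * n / 64 + 0.1) for n in range(64)]
bits = min_frac_bits(samples, 40.0)
print(bits, sqnr_db(samples, [quantize(s, bits) for s in samples]))
```

The same loop structure applies when `quantize` is replaced by a bit-accurate model of the block under design.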
Procedure for optimal H/W design (2)
3. Decide an optimal H/W architecture for the required spec.
   Develop or choose the best architecture to satisfy the required spec.
   Consider trade-off analysis between performance and area/power consumption
   Ex) IEEE 802.11a (I)FFT block
   − Single butterfly structure, pipeline structure, systolic array structure
   − Choose pipeline structure based on trade-off analysis
4. Design using HDL and verify the real-time operation
   Design using HDL and synthesize
   − Try to minimize the H/W complexity while satisfying the target operating frequency and timing
   Simulate at gate level using SDF (Standard Delay Format) info.
   − Verify the max. delay and normal operation
Development of modem SoC for communication system
Trend of Wireless Comm. Systems
[Figure: mobility (Fixed, Walk, Vehicle; Indoor, Outdoor) versus data rate (0.1, 1, 10, 100, 1000 Mbps)]
• IMT-2000/W-CDMA, WiBro (802.16e), B3G (4G)
• WLAN: (802.11), (802.11b), (802.11a), (802.11g), NG-WLAN (802.11n)
• WPAN: ZigBee (802.15.4), Bluetooth (802.15.1), UWB (802.15.3a), NG-UWB (802.15.3c)
Block Diagram of digital comm. system
• Digital Communication System (DCS)
Transmitter: Information Source → Source Encoding → Channel Encoding → Multiplexing/Multiple access → Modulator
Channel (wire, air, optical): ISI, electrical noise & interferences
Receiver: Demodulator → Demux/Multiple access → Channel Decoding → Source Decoding → Information Recovered
Synchronization
Design of WLAN (IEEE 802.11a) Modem SoC
Network Protocol Architecture
Application Layer
Presentation Layer
Session Layer
Transport Layer
Network Layer
Data Link Layer
   LLC (Logical Link Control)
   MAC (Medium Access Control)
Physical Layer
   PLCP Sublayer (Physical Layer Convergence Procedure)
   PMD Sublayer (Physical Medium Dependent)
[Figure label: Standard]
IEEE 802.11a WLAN PHY (1)
• Uses OFDM technique
• Three 100 MHz U-NII (Unlicensed) frequency bands
   5.15 ~ 5.25 GHz (max. power 40 mW)
   5.25 ~ 5.35 GHz (max. power 200 mW)
   5.725 ~ 5.825 GHz (max. power 800 mW)
• Signal bandwidth: 20 MHz (12 channels)
• Data rates: 6, 9, 12, 18, 24, 36, 48, 54 Mbps
• Modulation: BPSK, QPSK, 16QAM, 64QAM
• FEC: ½, ⅔, ¾ convolutional code (K=7)
• Number of subcarriers: 52 (N=64)
• OFDM symbol duration: 4.0 µs
• Guard interval: 0.8 µs (TGI)
• PER requirement: ≤ 10% for 1000 packets
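The listed data rates follow from the other PHY parameters. The sketch below assumes that 48 of the 52 subcarriers carry data (the other 4 are pilots) and uses the standard pairing of modulation and code rate in 802.11a. The names `MODES` and `rate_mbps` are illustrative only.

```python
from fractions import Fraction

DATA_SUBCARRIERS = 48          # 52 occupied subcarriers minus 4 pilots
FFT_SIZE = 64                  # N
SAMPLE_RATE_MHZ = 20           # one complex sample per 20 MHz clock
GUARD_US = Fraction(8, 10)     # guard interval, 0.8 us

useful_us = Fraction(FFT_SIZE, SAMPLE_RATE_MHZ)   # 64 / 20 MHz = 3.2 us
symbol_us = useful_us + GUARD_US                  # 3.2 us + 0.8 us = 4.0 us

# (modulation, coded bits per subcarrier, code rate)
MODES = [
    ("BPSK", 1, Fraction(1, 2)), ("BPSK", 1, Fraction(3, 4)),
    ("QPSK", 2, Fraction(1, 2)), ("QPSK", 2, Fraction(3, 4)),
    ("16QAM", 4, Fraction(1, 2)), ("16QAM", 4, Fraction(3, 4)),
    ("64QAM", 6, Fraction(2, 3)), ("64QAM", 6, Fraction(3, 4)),
]

def rate_mbps(bits_per_subcarrier, code_rate):
    """Data bits per OFDM symbol divided by the symbol duration in us."""
    return DATA_SUBCARRIERS * bits_per_subcarrier * code_rate / symbol_us

rates = [int(rate_mbps(b, r)) for _, b, r in MODES]
print(rates)
```

The 3.2 µs useful symbol time is also the maximum (I)FFT processing delay quoted in the H/W design procedure.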
802.11a PHY modulation scheme
• Modulation scheme (standard)
Convolutional Encoder (1/2, 2/3, 3/4) → Interleaver → Modulator (BPSK, QPSK, 16-QAM, 64-QAM) → Add Pilot Subcarrier (4 symbols) → IFFT (64 points) → Add Guard Interval (16 symbols) → Symbol wave shaping → I/Q mod → HPA → Air Channel
Other block in the figure: S/P
H/W Architecture for 802.11a
• Block diagram of 802.11a WLAN Modem SoC
   Data rate: 6~54 Mbps @ 20 MHz bandwidth
   Clock speed: 20/80 MHz
TX path: MAC Interface → Scrambler → Conv. Encoder → Puncturer & Interleaver → Symbol Mapper → IFFT & Insert CP → RF/IF Interface (with Preamble Gen.)
RX path: RF/IF Interface → Time/Freq. Synch. → FFT & Remove CP → Channel Estimation → Equalizer → Phase Tracker → Symbol Demapper → Depunc. & Deint. → Viterbi Decoder → Descram. → MAC Interface
Clock domains: clk_80MHz, clk_20MHz
Top Block Diagram of Sync.
[Figure: rxi/rxq (10 bits each) and thres feed Signal Detect, Freq. Sync., Symbol Sync., and Phase Rotate (output to FFT), all sequenced by the Sync. Controller]
Sync. Controller signals:
   det_start, det_done
   coarse_start, coarse_done
   fine_start, fine_done
   symbol_start, symbol_done
   freq_offset, freq_rotate, symbol_max, sample_num
   sync_start (from main_ctrl), sync_done (to main_ctrl)
   agc_start (to AGC)
   sample_en (to FFT)
- clock freq.: 20 MHz
- ADC output resolution: dual 10 bits
- Phase Rotate output resolution: dual 10 bits
Signal Detection Block
• Block diagram of signal detect block
   The Schmidl algorithm is implemented in the Signal Detect block
[Figure: rxi/rxq (5 bits each) → Z^-16 delay and Complex Multiplier → sum → re²+im² → Compare with thres; power branch ( )² → sum → Averaging; control signals det_start (from sync. ctrl) and det_done (to sync. ctrl)]
Detection metric: M_n = \frac{|C_n|^2}{(P_n)^2}
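A delay-and-correlate detector in the spirit of the Schmidl algorithm can be sketched in a few lines. The 16-sample delay matches the Z^-16 element in the block diagram. The window length, the threshold of 0.5, the function name `detect`, and the synthetic preamble are assumptions made for illustration, not values taken from the design. The comparison is written as |C|² > thres · P² so that no divider is needed.

```python
import cmath
import math

def detect(samples, delay=16, window=16, threshold=0.5):
    """Return the first index n with |C_n|^2 > threshold * P_n^2, or None."""
    for n in range(len(samples) - delay - window + 1):
        # C_n: correlation between the window and its copy `delay` samples later.
        c = sum(samples[n + k].conjugate() * samples[n + k + delay]
                for k in range(window))
        # P_n: energy of the delayed window.
        p = sum(abs(samples[n + k + delay]) ** 2 for k in range(window))
        if abs(c) ** 2 > threshold * p * p:
            return n
    return None

# Hypothetical noise-free input: 40 idle samples, then a 16-sample
# unit-magnitude sequence repeated four times (a periodic preamble).
short = [cmath.exp(1j * math.pi * k * k / 16) for k in range(16)]
rx = [0j] * 40 + short * 4
first_hit = detect(rx)
print(first_hit)
```

With this input the detector fires once 12 of the 16 window samples overlap the preamble, which is at index 36.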
IFFT / FFT Processor (1)
• Comparison of pipelined FFT

Structure   Multipliers   (N=64)   Adders     (N=64)   Memory size   (N=64)   Control
R2MDC       2(log4N-1)    4        4log4N     12       3N/2-2        94       Simple
R2SDF       2(log4N-1)    4        4log4N     12       N-1           63       Simple
R4MDC       3(log4N-1)    6        8log4N     24       5N/2-4        156      Simple
R4SDF       log4N-1       2        8log4N     24       N-1           63       Medium
R2³SDF      log8N-1       1        4log4N     12       N-1           63       Simple

• Using the 64-point R2³SDF structure
   Single path, simple control
   Fast and low-complexity structure
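The N=64 columns of the comparison can be reproduced directly from the formulas. The sketch below only evaluates those formulas. The helper `ilog` and the dictionary layout are illustrative choices, and "R2^3SDF" is used as an ASCII spelling of the radix-2³ SDF structure.

```python
def ilog(n, base):
    """Exact integer logarithm; raises if n is not a power of base."""
    exp = 0
    while n > 1:
        if n % base:
            raise ValueError(f"{n} is not a power of {base}")
        n //= base
        exp += 1
    return exp

# name -> function of N returning (multipliers, adders, memory size)
COSTS = {
    "R2MDC": lambda n: (2 * (ilog(n, 4) - 1), 4 * ilog(n, 4), 3 * n // 2 - 2),
    "R2SDF": lambda n: (2 * (ilog(n, 4) - 1), 4 * ilog(n, 4), n - 1),
    "R4MDC": lambda n: (3 * (ilog(n, 4) - 1), 8 * ilog(n, 4), 5 * n // 2 - 4),
    "R4SDF": lambda n: (ilog(n, 4) - 1, 8 * ilog(n, 4), n - 1),
    "R2^3SDF": lambda n: (ilog(n, 8) - 1, 4 * ilog(n, 4), n - 1),
}

table = {name: cost(64) for name, cost in COSTS.items()}
for name, (mults, adds, mem) in table.items():
    print(f"{name:8s} multipliers={mults} adders={adds} memory={mem}")
```

For N=64 the radix-2³ SDF structure needs the fewest multipliers while keeping the adder count and memory of R2SDF, which is consistent with the choice stated on the slide.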
IFFT / FFT Processor (2)
• IFFT/FFT H/W Architecture
   13-bit fixed-point precision
   Criterion: SQNR of more than 40 dB
   Simulation results: SQNR = 40.55 dB (real), 41.23 dB (imaginary)
   SQNR = 10 \log_{10} \frac{|\text{Input data}|^2}{|\text{Input data} - \text{FFT results}|^2}
[Figure: six cascaded BF2 butterfly stages with feedback delays of 32, 16, 8, 4, 2, 1 and twiddle multipliers W4, W8, W64, W8, W4; 26-bit data path: [25:13] real, [12:0] imag]
H/W Implementation
• Hardware implementation result
   Using 0.18 µm CMOS standard cell library

Block Name                     Logic gates
Synchronizer                   69 K
Channel Estimator              31 K
Equalizer                      47 K
Phase Tracker                  14 K
Deinterleaver & Depuncturer    117 K
Viterbi Decoder                189 K
Interleaver & Puncturer        28 K
Mapper & Demapper              11.5 K
FFT / IFFT                     45 K
TOTAL                          550 K
FPGA implementation
• PLCP processor
   Conv. encoder, Viterbi decoder, (De)puncturer, (De)interleaver
• PMD processor
   Symbol (De)mapper, (I)FFT, Estimator & Equalizer
[Figure: FPGA board with PCI controller, PLCP processor, and PMD processor]
IEEE 802.11n WLAN Modem
Applications
• WLAN
   Support 200~600 Mbps data rate within 100 m range
   High-speed wireless multimedia service using laptop computer, home theater, and handheld device (PDA, MP3P, PMP, etc.)
   Mobile internet and VoIP with WLAN-equipped cellular
PHY Spec.
Feature                               Description
RF Frequency                          2.4 GHz / 5.24 GHz
# of Spatial Streams & TX antennas    1~4
Channelization bandwidth              20 MHz / 40 MHz
# of Occupied Subcarriers             (52, 56) @ 20 MHz / (104, 114) @ 40 MHz
# of Data Subcarriers                 (48, 52) / 108
# of Pilot Subcarriers                4 / 6
# of FEC encoder                      1~2
Modulation Order                      BPSK, QPSK, 16-QAM, 64-QAM
Code Rate                             1/2, 2/3, 3/4, 5/6
Guard Interval                        800 ns / 400 ns
Convolutional Coding                  R=1/2, K=7, (g1 = 133 octal, g2 = 171 octal)
Peak data rate                        144.40 Mbps (Mandatory) / 600 Mbps (Optional)
MIMO scheme                           Direct Map, SDM, STBC
PPDU Format                           Legacy, HT (Mixed Mode, Green Field Mode)
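The rate-1/2, K=7 convolutional code with generators 133 and 171 (octal) can be sketched as a shift register with two parity taps. The bit ordering inside `state` (newest bit in the MSB) and the output order (133 output first, then 171) are assumptions of this sketch. Puncturing to the higher code rates is not shown.

```python
G1, G2 = 0o133, 0o171   # generator polynomials in octal
K = 7                   # constraint length

def parity(x):
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") & 1

def conv_encode(bits):
    """Rate-1/2 encoder: emits two coded bits (A, B) per input bit."""
    state = 0  # 7-bit window; the newest input bit sits in the MSB
    out = []
    for b in bits:
        state = (state >> 1) | (b << (K - 1))
        out.append(parity(state & G1))   # output A, generator 133
        out.append(parity(state & G2))   # output B, generator 171
    return out

# The impulse response reads back the generator taps bit by bit.
impulse = conv_encode([1, 0, 0, 0, 0, 0, 0])
print(impulse[0::2], impulse[1::2])
```

Feeding a single 1 followed by zeros is a quick self-check: output A spells 1011011 (133 octal) and output B spells 1111001 (171 octal).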
MAC Spec.
Feature                    Description
Frame Aggregation          MSDU aggregation (max. 64 MSDUs and 7,935 bytes)
                           MPDU aggregation (max. 64 MPDUs and 65,535 bytes)
Efficiency Improvements    Reverse direction data flow
                           Block Ack with A-MPDU
Link Management            Link adaptation with MCS feedback information
                           Transmit/receive antenna selection using sounding PPDU
Power Saving               Power-Save Multi-poll
Coexistence                L-SIG TXOP Protection
                           Channel selection (20 MHz, 40 MHz)
                           Phased coexistence operation
PHY H/W Block diagram
[Figure: 802.11n PHY with parallel branches]
Modulation: MAC Interface → FEC Encoder → Stream mapper → Interleaver → Mapper → MIMO OFDM Encoder → IFFT → Clipping/Filtering → RF/IF Interface
Demodulation: RF/IF Interface → Clipping/Filtering → FFT → Channel Estimator → MIMO OFDM Decoder → Demapper → Deinterleaver → Stream demapper → Viterbi Decoder → MAC Interface
MAC Functional Block Diagram
[Figure: MAC functional block diagram]
Major Research Topics (1)
• MIMO symbol detection algorithms
   High-performance symbol detection algorithm (Near-ML) for STBC-OFDM and SDM-OFDM schemes
   Hardware-optimized detection architecture
• High-performance algorithms
   Time/frequency synchronization
   Phase offset tracking & compensation
   Channel estimation
Major Research Topics (2)
• Low-complexity & high-throughput LDPC encoder/decoder
   To achieve error correction performance close to Shannon's limit
   Low-complexity & high-throughput architecture
• High-performance MAC processor
   Multiple frame processing for frame aggregation
   Cyclic frame buffer architecture to handle long frames
   H/W & S/W optimum partitioning
Chip Implementation
Block                          Logic Gate Count
MAC                            176,242
TX controller & FEC            5,417
Interleaver & Deinterleaver    141,902
LDPC Encoder                   216,600
Viterbi                        190,759
RX controller                  14,772
Total                          745,692

Technology           Dongbu 0.18 µm CMOS process
Package              208-pin LQFP
Core Size            5 x 5 mm²
Clock speed          40 MHz
Supply Voltages      1.8 V core, 3.3 V I/O
Power Consumption    TX: 62 mW, RX: 284 mW, LDPC encoder: 70 mW
Test Environment
• Chip test: Chip + FPGA + Test Board
   The image in the TX monitor is processed by the chip & FPGA test board.
   The demodulated image is displayed in the RX monitor.
[Figure: TX side with DVI interface; MPW chip; 11n test vector and PMD; USB interface; RX side with DVI interface]
ZigBee (IEEE 802.15.4) Modem
Applications
• ZigBee system (IEEE 802.15.4 standard)
   Application in WPAN & WSN systems
   Low-rate data transmission (250 kbps ~ 1 Mbps) with low power in 10 m range
   Home security, home automation
   Health care equipment system
   Environment, water quality, and bridge safety surveillance systems
   Smart toys, PC peripherals
[Figure: burglar surveillance example: network camera with ZigBee, lighting with ZigBee (light ON), RG alarm with ZigBee, auto-image saving, real-time image delivery over the cellular network (CDMA/WCDMA) to a mobile (SMS) and to the security company]
PHY / MAC Spec.
Feature                     Description
RF Frequency                2.4 GHz (worldwide)
                            868 MHz (Europe) / 915 MHz (North America)
Channelization bandwidth    2.4 GHz: 2 MHz
                            868/915 MHz: 300 kHz / 600 kHz
Modulation                  2.4 GHz: O-QPSK
                            868/915 MHz: BPSK
Data rate                   2.4 GHz: 250 kb/s
                            868/915 MHz: 20/40 kb/s
Coverage                    10 m
Mobility                    No
Medium Access Control       CSMA/CA
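For the 2.4 GHz PHY, the 250 kb/s data rate and the 2 MHz channelization are linked by the spreading step: 4 data bits form one symbol, and each symbol is sent as a 32-chip sequence. The sketch below works through that arithmetic. The chip sequences themselves are not reproduced, and the nibble order in `octet_to_symbols` (low nibble first) is stated here as an assumption about the standard's bit-to-symbol mapping.

```python
BIT_RATE = 250_000        # b/s, 2.4 GHz PHY
BITS_PER_SYMBOL = 4       # bit-to-symbol stage
CHIPS_PER_SYMBOL = 32     # symbol-to-chip stage (assumed from IEEE 802.15.4)

symbol_rate = BIT_RATE // BITS_PER_SYMBOL        # symbols per second
chip_rate = symbol_rate * CHIPS_PER_SYMBOL       # chips per second

def octet_to_symbols(octet):
    """Split one octet into two 4-bit symbols, low nibble first."""
    return [octet & 0x0F, (octet >> 4) & 0x0F]

print(symbol_rate, chip_rate, octet_to_symbols(0xA7))
```

The resulting 2 Mchip/s chip rate lines up with the 2 MHz channelization bandwidth in the table.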
PHY H/W Block diagram
[Figure: ZigBee PHY]
Modulation: MAC Interface → Bit To Symbol → Symbol To Chip → O-QPSK Modulation → RF/IF Interface
   Signals: ppdu_in, ppdu_en, stb_en, symbol_en, symbol_out (4 bits), chip_en, chip_out, pulse_en, pulse_re, pulse_im
Demodulation: RF/IF Interface → Sample Synchronization → Non-coherent Demodulation → Symbol To Bit → MAC Interface
   Signals: sample_re, sample_im, sync_en, demod_en, chip_en, chip_out, symbol_en, symbol_out (4 bits), ppdu_out
MAC Functional Block Diagram
[Figure: MAC functional block diagram]
Chip Implementation
Process              0.35 µm CMOS
Gate Count           Baseband Core           65.5k
                     Baseband Controller     4.7k
                     8051 Microcontroller    15k
                     etc                     3.3k
                     Total                   91.2k
Die Size             4500 µm x 4500 µm
Package              CQF208
Operating Voltage    3.3 V
Clock Frequency      System Main             22.1184 MHz
                     Baseband (ADC/DAC)      4 MHz
Ave. Power (est.)    42 mW
Voltage              3.3 V
[Die photo labels: 8051 Processor Core, MAC S/W, Baseband PHY Modem Core, Peripherals]
Test Environment
• Chip test: Chip + FPGA + Test Board
   Communication with PC through UART
   With RF module
   − Forming a WPAN and data transmission
   Without RF module
   − Data transmission
[Figure: LR-WPAN with Coordinator and Device]
Development of SoC for multi-media systems
Digital Broadcasting Systems
[Figure: roadmap of broadcasting systems by Quality (Audio, CIF, VGA, SD, HD, UD, 3D) and Interactivity, 2000 to 2010]
• Analog TV, Analog CATV, FM/AM
• DAB: USA: IBOC/IBAC; Europe: Eureka-147; Japan: ISDB-T
• SDTV: Europe: DVB-T
• HDTV: USA/Korea: ATSC; Japan: ISDB
• Satellite DTV: Europe/Korea: DVB-S
• DCATV: USA/Korea: DOCSIS; Europe: DVB-C
• Uni-direction DMB: Korea: T-DMB/S-DMB; USA: FLO (Qualcomm); Europe: DVB-H; Japan: ISDB-TSB; China: DMB-T
• Bi-direction DMB
• Giga DCATV (1 Gbps)
• UDTV (2000x2000)
• 3DTV (Multi-View), Realistic TV, Holographic TV
• SmarTV (Super-intelligent Multimedia Anytime-anywhere Realistic TV)
Image Signal Processor for CMOS Image Sensor
ISP for CIS (CMOS Image Sensor)
Scene → Lens → CFA → Sensor → (Bayer Image) → Image Signal Processor → (RGB Color Image) → Channel or Storage
Image Signal Processor
• Image processing for CIS image
   Pre-processing unit to enhance the Bayer image
   − Fixed Pattern Noise (FPN) Removal, Vignetting Reduction, Auto Focus
   Color Interpolation
   − Bayer to RGB
   Image enhancement of the RGB image
   − Auto Exposure, Anti-Color Rolling
<Image Signal Processor>
Image Data Format
• Input data: Bayer Image
   A photo detector converts incident radiant power into a photocurrent that is proportional to the radiant power
   One color filter per sensor unit, for sufficient photo energy consumption
• Output data: RGB Image
   3 primary colors of the image
   8/10/12-bit color
Fixed Pattern Noise Removal (FPNR)
• Detect & remove sensor defect pixels (hot/dead pixels)
[Figure: FPN Detection (FPN, Cluster FPN) → FPN Correction]
* By (kind of) Median Filter
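A median-based defect correction can be sketched as follows. This is not the filter used in the design, only an illustration of the idea: in a Bayer image the nearest pixels of the same colour sit two positions away, so each pixel is compared with the median of those neighbours and replaced when it deviates by more than a threshold. The function name `remove_defects` and the threshold of 64 are hypothetical.

```python
from statistics import median_low

def remove_defects(bayer, threshold=64):
    """Replace pixels that differ strongly from their same-colour neighbours."""
    h, w = len(bayer), len(bayer[0])
    out = [row[:] for row in bayer]
    for y in range(h):
        for x in range(w):
            # Same-colour neighbours in a Bayer mosaic are two pixels away.
            same_color = [bayer[y + dy][x + dx]
                          for dy in (-2, 0, 2) for dx in (-2, 0, 2)
                          if (dy or dx) and 0 <= y + dy < h and 0 <= x + dx < w]
            m = median_low(same_color)
            if abs(bayer[y][x] - m) > threshold:   # hot or dead pixel
                out[y][x] = m
    return out

# Flat 6x6 test frame with one hot pixel.
frame = [[100] * 6 for _ in range(6)]
frame[2][3] = 255
cleaned = remove_defects(frame)
print(cleaned[2][3])
```

Because the median ignores a single outlier, the good pixels around the defect are left untouched.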
Vignetting Reduction (VR)
• Correct the image distortion generated by the lens edge
   Use an anti-vignetting function to remove the vignetting phenomenon
Auto Focus (AF)
• Provide a focusing index gradient for lens focus
[Figure: Focusing Index; images before AF and after AF]
Color Interpolation (CI)
• Color Interpolation
   Convert Bayer image to RGB image
[Figure: 4x4 Bayer pixels f(i,j), row i and column j from 0 to 3, before and after color interpolation, with the color filter pattern below]
   G B G B
   R G R G
   G B G B
   R G R G
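A simple way to see what color interpolation does is a bilinear-style demosaic: each pixel keeps its own measured color and fills in the two missing colors by averaging the nearest samples of those colors. The sketch assumes the G B / R G tile layout shown in the figure and is not the interpolation algorithm of the actual ISP. The names `bayer_color` and `demosaic` are illustrative.

```python
def bayer_color(y, x):
    """Filter color at row y, column x for the G B / R G tile."""
    if (y + x) % 2 == 0:
        return "G"
    return "B" if y % 2 == 0 else "R"

def demosaic(bayer):
    """Return an image of (R, G, B) tuples from a Bayer mosaic (at least 2x2)."""
    h, w = len(bayer), len(bayer[0])
    rgb = []
    for y in range(h):
        row = []
        for x in range(w):
            pixel = []
            for channel in "RGB":
                if bayer_color(y, x) == channel:
                    pixel.append(bayer[y][x])        # measured sample
                    continue
                # Average the samples of this channel in the 3x3 neighbourhood.
                near = [bayer[yy][xx]
                        for yy in range(max(0, y - 1), min(h, y + 2))
                        for xx in range(max(0, x - 1), min(w, x + 2))
                        if bayer_color(yy, xx) == channel]
                pixel.append(sum(near) / len(near))
            row.append(tuple(pixel))
        rgb.append(row)
    return rgb

# A uniformly colored scene should be reconstructed exactly.
SCENE = {"R": 200, "G": 120, "B": 40}
mosaic = [[SCENE[bayer_color(y, x)] for x in range(4)] for y in range(4)]
image = demosaic(mosaic)
print(image[0][0], image[1][0])
```

In hardware the 3x3 neighbourhood is what drives the line-memory requirement: two previous image lines must be buffered.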
Auto Exposure (AE)
• Contrast enhancement
   Make picture contrast better for human eyes
   Perform local contrast enhancement and global contrast enhancement (by histogram analysis)
[Figure: images before and after]
Anti-Color Rolling (ACR)
• Color correction for each frame using mean adjustment
• Auto White Balance (AWB) in video sequence
[Figure: frames #1, #2, … #150 before correction and after correction]
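Mean adjustment can be illustrated with a per-channel gain that pulls each frame's channel means to a common target, so that the color balance no longer drifts from frame to frame. This is a sketch of the general idea only. How the target means are chosen, and the function name `match_channel_means`, are assumptions and not details given on the slide.

```python
def match_channel_means(frame, target_means):
    """Scale R, G, B so that the frame's channel means equal target_means.

    frame is a flat list of (r, g, b) pixels; results are clipped to 255.
    """
    n = len(frame)
    means = [sum(p[c] for p in frame) / n for c in range(3)]
    gains = [t / m if m else 1.0 for t, m in zip(target_means, means)]
    return [tuple(min(255.0, p[c] * gains[c]) for c in range(3)) for p in frame]

# Two-pixel example frame with a color cast, corrected to a neutral target.
frame = [(100, 100, 50), (140, 100, 70)]
corrected = match_channel_means(frame, (120, 120, 120))
new_means = [sum(p[c] for p in corrected) / len(corrected) for c in range(3)]
print(new_means)
```

Applying the same target to every frame of a sequence is what removes the rolling color variation.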
ACR Result Image
[Figure: input and output frames #150, #250, #350, #450, with Target Map]
H/W Design Issues
• Development of each algorithm for low-complexity H/W
• Optimum H/W architecture
• Reduction of line memory & multipliers
• Low-power design
FPGA Verification (1)
• H/W verification with FPGA
   Verify with Xilinx Virtex-4 LX200
   HyVision interface board is used
FPGA Verification (2)
• FPGA board for H/W verification
H.264/AVC Video Encoder/Decoder
Video Compression Standards
[Figure: timeline of standards, 1990 to 2010]
• ITU-T VCEG: H.261 (1990), H.263 (1995/1996), H.263+ (1997/1998), H.263++ (2000)
   Video phone: PSTN, B-ISDN; low quality: 64 kbps ~ 1.5 Mbps
• ISO/IEC MPEG: MPEG-1 (1993), MPEG-4 v1 (1998/1999), MPEG-4 v2 (1999/2000), MPEG-4 v3 (2001), MPEG-7 (2001), MPEG-21 (2003)
   MPEG-1: Video CD, Internet; VHS quality: < 1.5 Mbps
   MPEG-7: multimedia content search & filtering
   MPEG-21: digital item distribution over network
• MPEG-2 (H.262) (1994/1995), H.264 (MPEG-4 Part 10) (2002), H.NGVC & HVC (2010)
   Digital broadcasting; high quality: 1.5 ~ 100 Mbps
   Broadcasting, video conference; various quality: 64 kbps ~ 240 Mbps
• JPEG (1992), JPEG2000 (2000), Motion JPEG2000 (2002)
   Digital camera, Internet, digital cinema, digital editing
Compression of Image Data
• Full HD image
   (1920 x 1080) resolution x 3 colors (R, G, B)/element x 8 bit/element x 60 frames/sec = 2.96 Gb/sec for real-time operation
• Compression Data

                       Un-compressed   MPEG-2 (40~50x compression)   H.264 (2x vs. MPEG-2)   H.NGVC (2x vs. H.264)
SD (640x480)           450 Mb/s        ~ 9 Mb/s                      ~ 4.5 Mb/s              ~ 2.3 Mb/s
Full HD (1920x1080)    2.96 Gb/s       ~ 60 Mb/s                     ~ 30 Mb/s               ~ 15 Mb/s
4k UHD (3840x2160)     11.8 Gb/s       ~ 240 Mb/s                    ~ 120 Mb/s              ~ 60 Mb/s
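The uncompressed figures can be checked with one multiplication per format. The sketch below assumes 3 color components of 8 bits each at 60 frames/sec, as stated for Full HD. The straight product comes out slightly different from the rounded values quoted in the table (about 2.99 Gb/s rather than 2.96 Gb/s for Full HD), so the table values are best read as approximations.

```python
def raw_bit_rate(width, height, components=3, bits=8, fps=60):
    """Uncompressed video bit rate in bits per second."""
    return width * height * components * bits * fps

FORMATS = {"SD": (640, 480), "Full HD": (1920, 1080), "4k UHD": (3840, 2160)}
raw = {name: raw_bit_rate(w, h) for name, (w, h) in FORMATS.items()}

for name, rate in raw.items():
    # Dividing by 50 models the upper end of the 40~50x MPEG-2 compression.
    print(f"{name:8s} raw {rate / 1e9:6.2f} Gb/s, at 50x {rate / 50 / 1e6:6.1f} Mb/s")
```

Each later codec generation in the table halves the compressed rate again, which is why the last column is roughly one quarter of the MPEG-2 column.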
Comparison of image quality
[Figure: H.264 vs. MPEG4]
< Spec: S-DMB, QVGA, 384 kbps >
Basic Coding Structure of H.264
[Figure: H.264 encoder block diagram]
Input Video Signal → Split into Macroblocks (16x16 pixels) → Transform/Scal./Quant. → Entropy Coding
Embedded decoder: Scaling & Inv. Transform → De-blocking Filter → Output Video Signal
Prediction (Intra/Inter): Intra-frame Prediction, Motion Compensation, Motion Estimation
Coder Control outputs: Control Data, Quant. Transf. coeffs, Motion Data
Basic Operations of H.264
• Convert RGB to YUV (Luminance & Chrominance)
   YUV format: color data is divided into luminance and chrominance information
• Prediction
   Intra-frame Prediction
   ME/MC (Inter-frame Prediction)
• Transform
   DCT (Discrete Cosine Transform)
• Quantization
• Deblocking Filter
• Entropy Coding
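The first operation, RGB to YUV conversion, can be sketched with the commonly used BT.601 luma weights. The slide does not say which coefficient set the encoder front end uses, so the constants and the function name `rgb_to_yuv` below are assumptions for illustration.

```python
def rgb_to_yuv(r, g, b):
    """Split a color into luminance (Y) and two chrominance values (U, V)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # BT.601 luma weights (assumed)
    u = 0.492 * (b - y)                     # blue color difference
    v = 0.877 * (r - y)                     # red color difference
    return y, u, v

gray = rgb_to_yuv(128, 128, 128)
red = rgb_to_yuv(255, 0, 0)
print(gray, red)
```

A gray pixel has zero chrominance, which shows why the split is useful for compression: most of the perceptually important detail ends up in Y.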
Intra-frame Prediction
• Exploit spatial redundancy within a single frame
[Figure: < Original Image > with < Current block > B1; < Recon. Image > with Upper and Left blocks; < Prediction block > B'1: predicted block from the reconstructed image]
ME/MC (Inter-frame Prediction)
• Exploit temporal redundancy between two consecutive frames
• Motion Estimation (ME)
   Find a motion vector
• Motion Compensation (MC)
   Generate reference frames using results of motion estimation
[Figure: video clip; reference frame, current frame, motion vector]
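Motion estimation is often introduced as full-search block matching: for one block of the current frame, try every displacement inside a search window and keep the one with the smallest sum of absolute differences (SAD). The sketch below follows that textbook scheme. The block size, the search range, and the function names are illustrative, and real encoders use faster search strategies and sub-pixel refinement.

```python
def sad(cur, ref, by, bx, ry, rx, size):
    """Sum of absolute differences between a current and a reference block."""
    return sum(abs(cur[by + i][bx + j] - ref[ry + i][rx + j])
               for i in range(size) for j in range(size))

def motion_search(cur, ref, by, bx, size=4, search=2):
    """Full search; returns (best SAD, dy, dx) for the block at (by, bx)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = by + dy, bx + dx
            if 0 <= ry <= h - size and 0 <= rx <= w - size:
                cost = sad(cur, ref, by, bx, ry, rx, size)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best

# Reference frame with distinct pixel values; the current block at (2, 2)
# is a copy of the reference block located one row down and one column left.
ref = [[16 * y + x for x in range(8)] for y in range(8)]
cur = [row[:] for row in ref]
for i in range(4):
    for j in range(4):
        cur[2 + i][2 + j] = ref[3 + i][1 + j]
result = motion_search(cur, ref, 2, 2)
print(result)
```

The nested loops over displacement and block pixels are the part that a hardware engine parallelizes.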
Transform & Quantization
• Compact the energy into as few coefficients as possible using DCT (Discrete Cosine Transform)
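Energy compaction is easy to observe on a small block. The sketch below implements a plain orthonormal 2-D DCT-II in floating point. It is meant only to show the effect: H.264 itself uses a small integer transform that approximates the DCT, and quantization is not shown. The function names are illustrative.

```python
import math

def dct_1d(v):
    """Orthonormal DCT-II of a sequence."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def dct_2d(block):
    """Apply the 1-D DCT to every row, then to every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

flat = [[10] * 4 for _ in range(4)]                       # constant block
ramp = [[4 * i + j for j in range(4)] for i in range(4)]  # smooth gradient
flat_coeffs = dct_2d(flat)
ramp_coeffs = dct_2d(ramp)
print(round(flat_coeffs[0][0], 6))
```

For the constant block all energy lands in the single DC coefficient. For smooth image content most of the energy lands in a few low-frequency coefficients, and the remaining ones can be quantized coarsely.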
Deblocking Filter
• Maintain the sharpness of real edges
• Smooth the unpleasant block boundaries
[Figure: < Before filtering > and < After filtering >]
Entropy Coding
• Generate output encoding data
   1) CAVLC (Context-based Adaptive Variable Length Coding)
      Utilizes multiple variable-length codeword tables for transform coefficient encoding, with a single table used for non-coefficient data.
   2) CABAC (Context-based Adaptive Binary Arithmetic Coding)
      Multiplication-free, low-complexity method using only shifts and table look-ups, providing a reduction in bit rate
FPGA verification for H.264
• H.264/AVC Baseline profile Encoder/Decoder
   Real-time verification using Xilinx Virtex-5 LX330 FPGA
Concluding remarks
• System-on-chip is one of the top priorities in the IT research field
• An SoC designer should be able to plan SoC development based on trends in user demands & systems
• The SoC architect is the most important person, the one who understands the total system and its interfaces
   (also H/W & S/W partitioning and co-design issues)