Post on 11-Jun-2020
Eng. Julian S. Bruno
REAL TIME DIGITAL SIGNAL PROCESSING
Seminario de Electrónica: Sistemas Embebidos (Electronics Seminar: Embedded Systems), FI-UBA 2010
Introduction
Why Digital? A brief comparison with analog.
FI-UBA 2010 Eng. Julian S. Bruno
Advantages
Flexibility. Easily modifiable and upgradeable.
Reproducibility. Does not depend on component tolerances. Exactly reproduced from one unit to another.
Reliability. No age or environmental drift.
Complexity. Allows sophisticated applications in only one chip.
The BIG picture
[Figure: Data -> FAST real-time algorithms -> Results]
Real Time DSP System
[Figure: block diagram of a real-time DSP system; surviving labels: Fc, Fs, N, LF, MIPS/MFLOPS, FAST]
Sampling signals: A very important first step.
Real Time DSP System
Sampling low-pass signals (CT)
The sampling theorem indicates that a continuous signal can be properly sampled only if it does not contain frequency components above one-half of the sampling rate.
Nyquist sampling theorem: $f_S \geq 2 f_N$, where $f_N$ is the highest frequency component of the signal.
Aliasing and frequency ambiguity
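The frequency ambiguity can be illustrated numerically. The following is a minimal sketch, assuming a real sinusoidal tone and ideal sampling; the function name `alias_frequency` is hypothetical and not taken from the slides.

```python
def alias_frequency(f_hz, fs_hz):
    """Apparent frequency (between 0 and fs/2) of a real tone at f_hz sampled at fs_hz."""
    folded = f_hz % fs_hz               # the sampled spectrum repeats every fs
    return min(folded, fs_hz - folded)  # and is mirrored about fs/2


if __name__ == "__main__":
    fs = 8000.0
    for tone in (1000.0, 3000.0, 5000.0, 7000.0, 9000.0):
        print(tone, "Hz is seen as", alias_frequency(tone, fs), "Hz")
```

Tones above fs/2 become indistinguishable from tones below fs/2 once sampled, which is why an anti-aliasing filter is placed before the ADC.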
Sampling band-pass signals
Also called: IF sampling, harmonic sampling, sub-Nyquist sampling, undersampling.
$$\frac{2f_c - B}{m} \geq f_s \geq \frac{2f_c + B}{m+1}$$
for any positive integer m, where $f_s \geq 2B$ is accomplished.
Sampling band-pass signals
m    (2Fc-B)/m    (2Fc+B)/(m+1)    Optimum Fs
1    35.0 MHz     22.5 MHz         22.5 MHz
2    17.5 MHz     15.0 MHz         17.5 MHz
3    11.66 MHz    11.25 MHz        11.25 MHz
4    8.75 MHz     9.0 MHz          -
5    7.0 MHz      7.5 MHz          -
Optimum Fs is defined here as the frequency where spectral replications do not butt up against each other except at zero Hz.
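The table values can be reproduced with a short calculation. This is a minimal sketch; the values fc = 20 MHz and B = 5 MHz are not stated on the slide but follow from the table itself (2Fc - B = 35 MHz and 2Fc + B = 45 MHz). The function name `bandpass_sampling_ranges` is hypothetical.

```python
def bandpass_sampling_ranges(fc_hz, b_hz, max_m=10):
    """Return {m: (fs_min, fs_max)} for each positive integer m with a valid range."""
    ranges = {}
    for m in range(1, max_m + 1):
        fs_max = (2 * fc_hz - b_hz) / m        # upper bound: (2fc - B) / m
        fs_min = (2 * fc_hz + b_hz) / (m + 1)  # lower bound: (2fc + B) / (m + 1)
        # A range is usable only if it is non-empty and respects fs >= 2B.
        if fs_min <= fs_max and fs_min >= 2 * b_hz:
            ranges[m] = (fs_min, fs_max)
    return ranges


if __name__ == "__main__":
    for m, (lo, hi) in bandpass_sampling_ranges(20e6, 5e6).items():
        print(f"m={m}: {lo / 1e6:.2f} MHz <= fs <= {hi / 1e6:.2f} MHz")
```

For m = 4 and m = 5 the lower bound exceeds the upper bound, so no sampling rate exists, which matches the "-" entries in the table.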
Signal reconstruction
Real Time DSP System
Signal reconstruction
Analog signal X can be reconstructed from its samples by using the following formula:
$$X(t) = \sum_{k} X(kT_S)\,\mathrm{sinc}\!\left(\frac{t - kT_S}{T_S}\right)$$
The reconstruction is based on the interpolation of shifted sinc functions.
It is very difficult to generate sinc functions by electronic circuitry.
An approximation of a sinc function is a pulse. A sample-and-hold circuit performs this approximation.
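The interpolation formula can be tried numerically. The sketch below assumes a finite list of samples, so the sum is truncated and the result is only approximate between samples; the names `sinc` and `reconstruct` are hypothetical.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, ts, t):
    """Evaluate sum_k X(k*Ts) * sinc((t - k*Ts) / Ts) over the available samples."""
    return sum(x_k * sinc((t - k * ts) / ts) for k, x_k in enumerate(samples))


if __name__ == "__main__":
    samples = [0.0, 1.0, 4.0, 9.0]
    print(reconstruct(samples, 1.0, 2.0))  # at a sampling instant
    print(reconstruct(samples, 1.0, 1.5))  # between two samples
```

At a sampling instant every shifted sinc except one is (numerically almost) zero, so the formula returns the sample itself.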
Reconstruction Errors
Sample and hold circuits
The gain in the desired central band is not constant
There are high-frequency replicas of the signal spectrum
Reconstruction Solutions
The gain in the desired central band is not constant
    It is possible to compensate for this non-ideality by using an inverse filter as part of the DSP component
There are high-frequency replicas of the signal spectrum
    which can be removed by using a lowpass filter
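The non-constant gain can be quantified. This is a minimal sketch, assuming an ideal sample-and-hold (zero-order hold) whose hold time equals the sampling period, so that its magnitude response is |sinc(f/fs)|; the function names are hypothetical.

```python
import math


def zoh_gain(f_hz, fs_hz):
    """Magnitude response of an ideal zero-order hold: |sin(pi*f/fs) / (pi*f/fs)|."""
    x = f_hz / fs_hz
    return 1.0 if x == 0 else abs(math.sin(math.pi * x) / (math.pi * x))


def inverse_filter_gain(f_hz, fs_hz):
    """Gain an inverse filter would need at f_hz (only meaningful below fs)."""
    return 1.0 / zoh_gain(f_hz, fs_hz)


if __name__ == "__main__":
    fs = 48000.0
    for f in (0.0, 6000.0, 12000.0, 24000.0):
        droop_db = 20 * math.log10(zoh_gain(f, fs))
        print(f"{f:8.0f} Hz: {droop_db:6.2f} dB")
```

Under this model the gain falls to 2/pi (about -3.9 dB) at fs/2, which is the droop the inverse filter has to compensate.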
Reconstruction Errors Example
Real Time constraints
Signal Path
Real Time DSP System
Real time constraints
Algorithm time (tA) MUST fit within one sampling period (tS), that is, between two consecutive samples.
Thus tA limits the maximum frequency at which a system can work.
The definition of real time is VERY application dependent (it depends on how fast the system evolves).
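The constraint tA <= tS can be checked with simple arithmetic. The sketch below assumes sample-by-sample processing and ignores I/O overhead; the function names are hypothetical.

```python
def max_sampling_rate(t_a_s):
    """Highest sampling rate (Hz) that still leaves t_a_s seconds per sample."""
    return 1.0 / t_a_s


def meets_deadline(t_a_s, fs_hz):
    """True if the algorithm time fits within one sampling period tS = 1/fs."""
    return t_a_s <= 1.0 / fs_hz


if __name__ == "__main__":
    t_a = 10e-6  # assumed algorithm time: 10 microseconds per sample
    print("max fs:", max_sampling_rate(t_a), "Hz")
    print("48 kHz audio OK?", meets_deadline(t_a, 48000.0))
```

Combined with the sampling theorem, an algorithm time tA also bounds the highest signal frequency the system can handle to at most 1/(2 tA).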
Real time constraints
Block Processing Mode
4 memory buffers of length N are required for the double-buffering method.
2 memory buffers (in and out) are needed for internal processing by the processor.
A delay of 2NTs is incurred in block processing.
More complicated programming is needed to manage the switching between buffers.
The ADC and DAC can be configured to transfer data samples into the internal memory of the processor using the serial ports and the DMA.
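The 2N-sample delay of double buffering can be seen in a small simulation. This is only a sketch under simplifying assumptions: the DMA transfers are modelled as list accesses, the processing of a block is treated as finishing before the buffers swap, and all names are hypothetical.

```python
def block_process(samples, n, process):
    """Simulate ping-pong (double) buffering with two input and two output buffers."""
    in_buf = [[0.0] * n for _ in range(2)]
    out_buf = [[0.0] * n for _ in range(2)]
    output = []
    for i, x in enumerate(samples):
        block, k = divmod(i, n)
        active = block % 2                 # buffer pair owned by ADC/DAC for this block
        output.append(out_buf[active][k])  # DAC plays a previously processed sample
        in_buf[active][k] = x              # ADC stores the new sample
        if k == n - 1:
            # Block complete: the processor turns it into an output block
            # while the other buffer pair is used for I/O.
            out_buf[active] = list(process(in_buf[active]))
    return output


if __name__ == "__main__":
    data = [float(v) for v in range(1, 13)]
    print(block_process(data, 3, lambda blk: list(blk)))
```

With N = 3 and a pass-through `process`, the first input sample appears at the output six samples later, which is the 2NTs delay stated above.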
DSP hardware
Real Time DSP System
What can we do with a DSP?
Almost any linear and nonlinear system (PID controller).
Digital filters (FIR-IIR).
Adaptive systems (LMS algorithm).
Modulators and demodulators.
Any mathematically intensive algorithm (FFT-DCT-WT).
Linear systems implementation
x(n) and h(n) are arrays of numbers. If we want to compute y(n), we have to multiply and sum the last M samples, M being the length of h(n). This is repeated for every new sample received from the ADC.
As you can see, any linear system uses multiplications, accumulations (sums), and loops intensively.
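The multiply-and-sum loop described above can be written out directly. This is a minimal sketch of a direct-form FIR filter, assuming samples before n = 0 are zero; the function name `fir_filter` is hypothetical.

```python
def fir_filter(x, h):
    """Compute y(n) = sum_{k=0}^{M-1} h(k) * x(n - k) for every n."""
    m = len(h)
    y = []
    for n in range(len(x)):          # one pass per new input sample
        acc = 0.0                    # accumulator
        for k in range(m):           # M multiply-accumulate (MAC) operations
            if n - k >= 0:           # samples before the start are taken as zero
                acc += h[k] * x[n - k]
        y.append(acc)
    return y


if __name__ == "__main__":
    # Two-tap moving average applied to a short ramp.
    print(fir_filter([1.0, 2.0, 3.0, 4.0], [0.5, 0.5]))
```

Each output sample costs M multiplications and M additions inside a loop, which is the workload DSP hardware is built to speed up.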
Fast Fourier Transform (FFT)
Summary of desirable features of a DSP
Fast in mathematical operations and combinations of them (especially multiply and sum).
Flexible addressing modes (bit reversal, circular buffers, zero-overhead loops).
DSP-specific instruction set (arithmetic shifting, saturating arithmetic, rounding, normalization).
Minimum-overhead peripherals (especially communication devices).
DSP instructions for specific applications (Video, Control, Audio).
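Saturating arithmetic, one of the features listed above, can be contrasted with ordinary wrap-around arithmetic. The sketch below assumes 16-bit signed words and models both behaviours in software; the function names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def sat_add16(a, b):
    """16-bit signed addition that clips to the representable range."""
    return max(INT16_MIN, min(INT16_MAX, a + b))


def wrap_add16(a, b):
    """16-bit signed addition with two's-complement wrap-around."""
    return (a + b + 32768) % 65536 - 32768


if __name__ == "__main__":
    print("saturating:", sat_add16(30000, 10000))
    print("wrapping:  ", wrap_add16(30000, 10000))
```

On overflow the saturating result stays at full scale, while the wrapped result changes sign, which in an audio or control signal would show up as a large glitch.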
So, those are DSP math features
Multiply and Accumulate (MAC) units.
ALUs (fixed and floating point).
Barrel shifters.
Depending on the DSP application, more than one unit is present in modern DSPs, allowing parallelism.
Harvard (modified) architecture provides multiple operations per cycle.
Architectural Features for Efficient Programming
Specialized addressing modes
Hardware loop constructs
Cacheable memories
Multiple operations per cycle
Interlocked pipeline
Other important features
Specialized Addressing Modes
Circular buffering
    B0 = 0x00; L0 = 44;    // Base and length
    I0 = 0x00; M0 = 16;    // Index and increment
    R0 = [I0++M0];         // R0=1 & I0=0x10
    R0 = [I0++M0];         // R0=5 & I0=0x20
    R0 = [I0++M0];         // R0=9 & I0=0x04
    R0 = [I0++M0];         // R0=2 & I0=0x14
    R0 = [I0++M0];         // R0=6 & I0=0x24
Bit-Reversal
    B0 = 0x00; L0 = 0;     // Base and length
    I0 = 0; M0 = 1;        // Index and increment
    I2 = 256; P0 = 8;
    LOOP(start, end) LC0 = P0;
    start:  // I0 automatically incremented in B-R progression
        R0 = [I0] || I0 += M0 (BREV);
    end:    // I2 points to bit-reversed buffer
        [I2++] = R0;
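The two addressing modes can be mimicked in software to see the index sequences they generate. This is a sketch of the resulting address order only, not a model of how the hardware address generators compute it; the function names are hypothetical.

```python
def circular_next(index, step, base, length):
    """Advance an index by step, wrapping inside [base, base + length)."""
    return base + (index - base + step) % length


def bit_reverse(value, bits):
    """Reverse the lowest `bits` bits of value (FFT bit-reversed ordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


if __name__ == "__main__":
    # Same parameters as the circular-buffer listing: base 0, length 44 bytes, step 16.
    idx = 0
    for _ in range(5):
        idx = circular_next(idx, 16, 0, 44)
        print(hex(idx))
    # Index order for an 8-point bit-reversed access.
    print([bit_reverse(i, 3) for i in range(8)])
```

Without hardware support, the modulo and the bit-shuffling loop would have to run in software on every memory access, which is the overhead these addressing modes remove.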
Hardware Loop Constructs
Looping is a critical feature in communications processing algorithms.
There are two key looping-related features that can improve performance on a wide variety of algorithms:
    "zero-overhead hardware loop"
    "hardware loop buffers"
Cacheable memories
Without caches, today's high-speed processors would effectively run at much slower speeds because larger applications would only fit in slower external memory.
Programmers would be forced to manually move key code in and out of internal SRAM.
By adding data and instruction caches to the architecture, external memory becomes much more manageable.
Multiple operations per cycle
In addition to performing multiple ALU/MAC operations each core processor cycle, additional data loads and stores can also be completed in the same cycle.
The memory is typically partitioned into sub-banks that can be dual-accessed by the core and optionally by a DMA controller.
There are two multi-issue architectures: VLIW and superscalar.
Interlocked pipeline
In order to increase throughput, DSPs are designed to be pipelined.
When assembly programming is required, the pipeline can make programming more challenging.
The processor automatically handles stalls and bubbles.
Other important features
RISC-like registers and instruction set
Multiple data/program buses.
DMA controller for handling peripherals
In traditional fixed-point DSPs, word sizes are usually fixed. However, there is an advantage to having data registers that can be treated as:
    One 64-bit word
    Two 32-bit words
    Four 16-bit words
    Eight 8-bit words
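Treating one register as several narrower words can be illustrated with plain integer operations. The sketch below assumes unsigned lanes with wrap-around inside each lane and models a 64-bit register as a Python integer; the function names are hypothetical.

```python
def split_lanes(word64, lane_bits):
    """Split a 64-bit value into 64 // lane_bits lanes, least significant first."""
    mask = (1 << lane_bits) - 1
    return [(word64 >> shift) & mask for shift in range(0, 64, lane_bits)]


def join_lanes(lanes, lane_bits):
    """Pack lanes (least significant first) back into one 64-bit value."""
    mask = (1 << lane_bits) - 1
    word = 0
    for i, lane in enumerate(lanes):
        word |= (lane & mask) << (i * lane_bits)
    return word


def packed_add(a, b, lane_bits):
    """Add two packed words lane by lane, with no carry between lanes."""
    sums = [x + y for x, y in zip(split_lanes(a, lane_bits), split_lanes(b, lane_bits))]
    return join_lanes(sums, lane_bits)


if __name__ == "__main__":
    a = 0x0001000200030004
    b = 0x001000200030FFFF
    print(hex(packed_add(a, b, 16)))  # four independent 16-bit additions
```

A processor with packed data registers performs the four (or eight) lane additions in a single instruction, which is the advantage the slide points to.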
DSP classification
Fixed or floating-point arithmetic.
Millions of multiply-accumulate operations per second, MMACS.
Millions of floating-point operations per second, MFLOPS.
Application-specific features (video, audio, control, communications).
Memory
Why DSP hardware?
Special-purpose (custom) chips such as application-specific integrated circuits (ASIC).
Field-programmable gate arrays (FPGA).
General-purpose microprocessors or microcontrollers (μP/μC).
General-purpose digital signal processors (DSP processors).
DSP processors with application-specific hardware (HW) accelerators.
TI Processors
C6000™ High Performance DSPs
    Ideal for imaging, broadband infrastructure and performance audio applications.
C6000™ Performance Value DSPs
    Ideal for broadband infrastructure and performance audio applications. Lower cost.
C6000™ Floating-point DSPs
    Ideal for professional audio products, biometrics, medical, industrial, digital imaging, speech recognition, conference phones and voice-over-packet.
C5000™ Power-Efficient DSPs
    Optimized for power- and cost-efficient embedded signal processing solutions.
C2000™ 32-bit Real-time MCUs
    Optimized core can run multiple complex control algorithms at speeds necessary for demanding control applications.
C5000™ DSP Platform Roadmap
C6000™ DSP Platform Roadmap
TI's ARM Processor-Based Products
Sitara™ ARM Microprocessors
    Cortex™-A8 and ARM9™-based embedded microprocessors
    Clock Speed: 300 MHz to 1.5 GHz
    3D Graphics Accelerator and Power technology (OMAP)
Stellaris® MCU ARM Cortex-M3
    Clock Speed: Up to 100 MHz
    Up to 125 MIPS (at 100 MHz)
    Advanced integration: Serial interfaces, motion control, system, analog
OMAP™ Applications Processors
    ARM9-based devices. LP, general-purpose, multimedia and graphics processing
    ARM® Cortex™-A8 core
    C64x+™ DSP and Video Accelerators
    3-D Graphics Acceleration
DaVinci™ Digital Media Processors
    Optimized for digital video systems
    ARM9 only, ARM9 + DSP and DSP only.
TI's ARM Processor-Based Products
ADI Processors
TigerSHARC® Processors
    32-bit fixed-point as well as floating-point
    Clock Speed: 250 MHz to 600 MHz
    4.8 GMACs of 16-bit performance / 3.6 GFLOPs
    24 Mbits of on-chip memory
    5 Gbytes of I/O bandwidth
SHARC® Processors
    32-bit floating-point
    Clock Speed: 150 MHz to 400 MHz / 2.4 GFLOPs.
    Accelerator Architecture: FIR, IIR, FFT.
Blackfin® Processors
    16/32-bit fixed point
    Clock Speed: 200 MHz to 756 MHz / 1.5 GMACs
    Very low power consumption: 0.23 mW/MHz
    RTOS supported.
    Multicore 600 MHz / 2.4 GMACs.
ADSP-21xx Processors
    16/32-bit fixed point
    Clock Speed: 75 MHz to 160 MHz
    Analog Devices brought the first programmable processor to market in 1986
ADSP-21xx Processors
ADSP-2191 BLOCK DIAGRAM
Blackfin Processors
ADSP-BF536/ADSP-BF537 BLOCK DIAGRAM
SHARC Processors
ADSP-2146x BLOCK DIAGRAM
TigerSHARC Processor
ADSP-TS201S BLOCK DIAGRAM
Markets and Applications
Recommended bibliography
RG Lyons, Understanding Digital Signal Processing, 2nd ed. Prentice Hall, 2004.
    Ch2: Periodic Sampling
SW Smith, The Scientist and Engineer's Guide to DSP. California Tech. Pub., 1997.
    Ch1: The Breadth and Depth of DSP
    Ch3: ADC and DAC
SM Kuo, BH Lee, Real-Time Digital Signal Processing, 2nd ed. John Wiley and Sons, 2006.
    Ch1: Introduction to Real-Time Digital Signal Processing
NOTE: Many images used in this presentation were extracted from the recommended bibliography.
Questions?
Thank you!