TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas...
Transcript of TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas...
![Page 1: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/1.jpg)
TMS320C6000 DSP Optimization Workshop
Chapter 10
Advanced Memory Management
Copyright © 2005 Texas Instruments. All rights reserved. Technical Training
Organization
T TO
![Page 2: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/2.jpg)
Outline
Using Memory Efficiently
• Keep it on-chip
• Use multiple sections
• Use local variables (stack)
• Using dynamic memory (heap, BUF)
• Overlay memory (load vs. run)
• Use cache
Summary
![Page 3: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/3.jpg)
Keep it On-Chip
[Diagram: CPU, Program Cache, Data Cache, Internal SRAM, EMIF; sections shown: .text, .bss]
Using Memory Efficiently
1. If Possible …
• Put all code / data on-chip
• Best performance
• Easiest to implement
What if it doesn’t all fit?
![Page 4: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/4.jpg)
How to use Internal Memory Efficiently
1. Keep it on-chip
2. Use multiple sections
3. Use local variables (stack)
4. Using dynamic memory (heap, BUF)
5. Overlay memory (load vs. run)
6. Use cache
![Page 5: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/5.jpg)
Use Multiple Sections
[Diagram: CPU, Program Cache, Data Cache, Internal SRAM, EMIF, External Memory; sections shown: .text, .bss, .far, critical, myVar]
Using Memory Efficiently
2. Use Multiple Sections
• Keep .bss (global vars) and critical code on-chip
• Put non-critical code and data off-chip
![Page 6: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/6.jpg)
Making Custom Code Sections
Create custom code section using:
#pragma CODE_SECTION(dotp, "critical");
int dotp(a, x)
Use the compiler’s -mo option:
• -mo creates a subsection for each function
• Subsections are specified with ":"
#pragma CODE_SECTION(dotp, ".text:_dotp");
To make a data section ...
![Page 7: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/7.jpg)
Making Custom Data Sections
Make custom named data section
#pragma DATA_SECTION (x, "myVar");
#pragma DATA_SECTION (y, "myVar");
int x[32];
short y;
A special data section ...
![Page 8: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/8.jpg)
Special Data Section: “.far”
• .far is a pre-defined section name
• Three cycle read (pointer must be set before read)
• Add variable to .far using:
1. Use DATA_SECTION pragma:
#pragma DATA_SECTION(m, ".far");
short m;
2. Far compiler option:
-ml
3. Far keyword:
far short m;
How do we link our own sections?
![Page 9: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/9.jpg)
Linking Custom Sections
[Diagram: “Build” of app.cdb produces appcfg.cmd; the Linker takes appcfg.cmd and myLink.cmd and produces myApp.out]
myLink.cmd
SECTIONS
{
    myVar:       > SDRAM
    critical:    > IRAM
    .text:_dotp: > IRAM
}
How do I know which CMD file is executed first?
![Page 10: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/10.jpg)
Specifying Link Order
What if I forget to specify a section in SECTIONS?
![Page 11: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/11.jpg)
Check for Unspecified Sections
In summary …
![Page 12: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/12.jpg)
Use Multiple Sections
[Diagram: CPU, Program Cache, Data Cache, Internal SRAM, EMIF, External Memory; sections shown: .text, .bss, .far, critical, myVar]
Using Memory Efficiently
2. Use Multiple Sections
• Keep .bss (global vars) and critical code on-chip
• Put non-critical code and data off-chip
• Create new sections with:
  #pragma CODE_SECTION
  #pragma DATA_SECTION
• You must make your own linker command file
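The two pragmas can be put together in one source file. The following is a minimal sketch, not the workshop's own lab code: it reuses the section names "critical" and "myVar" from the earlier slides, while the prototype of dotp() (including its count parameter) is an assumption made for illustration. Compilers other than the TI tools simply ignore the pragmas, so the code also builds on a host.

```c
#include <assert.h>
#include <stddef.h>

/* TI-specific pragmas. They must appear before the symbols are declared.
 * Compilers that do not know them ignore them (possibly with a warning). */
#pragma CODE_SECTION(dotp, "critical")
#pragma DATA_SECTION(x, "myVar")
#pragma DATA_SECTION(y, "myVar")

int x[32];   /* placed in the custom data section "myVar" */
short y;     /* also placed in "myVar" */

/* Dot product of two 16-bit vectors, placed in the "critical" code section.
 * The count parameter is an assumption added for this sketch. */
int dotp(const short *a, const short *b, size_t count)
{
    int sum = 0;
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < count; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

The sections only end up on-chip or off-chip once the linker command file maps "critical" and "myVar" to a memory range, as in myLink.cmd.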
![Page 13: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/13.jpg)
Using Memory Efficiently
1. Keep it on-chip
2. Use multiple sections
3. Use local variables (stack)
4. Using dynamic memory (heap, BUF)
5. Overlay memory (load vs. run)
6. Use cache
![Page 14: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/14.jpg)
Dynamic Memory
[Diagram: CPU, Program Cache, Data Cache, Internal SRAM, EMIF, External Memory; shown in memory: Stack]
Using Memory Efficiently
3. Local Variables
• If stack is located on-chip, all functions can “share” it
What is a stack?
![Page 15: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/15.jpg)
What is the Stack
[Diagram: memory map from 0 to 0xFFFFFFFF with “Top of Stack” marked]
A block of memory where the compiler stores:
• Local variables
• Intermediate results
• Function arguments
• Return addresses
Details of the C6000 stack ...
![Page 16: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/16.jpg)
Stack and Stack Pointer
[Diagram: memory map from 0 (lower) to 0xFFFFFFFF (higher); SP (B15) marks the Top of Stack; arrow labelled “stack grows”]
Details:
1. SP points to first empty location
2. SP is double-word aligned before each fcn
3. Created by Compiler’s init routine (boot.c)
4. Length defined by -stack Linker option
5. Stack length is not validated at runtime
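Because the stack length is not validated at runtime, an overflow goes unnoticed unless the application checks for it. One common technique, which is not part of this slide deck, is to fill the stack region with a known pattern at startup and later count how much of the pattern survives. The sketch below assumes a stack that grows from high addresses toward low addresses and runs on an ordinary array; the function names and the fill value are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu  /* arbitrary marker value (assumption) */

/* Fill the whole stack region with the marker before it is used. */
void stack_paint(uint32_t *base, size_t words)
{
    size_t i;

    assert(base != NULL);
    for (i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/* The stack grows from the high end of the region toward base, so words
 * at the low end that still hold the marker have never been written.
 * Returns how many words remain untouched. */
size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t unused = 0;

    assert(base != NULL);
    while (unused < words && base[unused] == STACK_FILL) {
        unused++;
    }
    return unused;
}
```

If stack_unused_words() ever returns 0, the region has been used completely and the -stack size is probably too small.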
![Page 17: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/17.jpg)
Dynamic Memory
[Diagram: CPU, Program Cache, Data Cache, Internal SRAM, EMIF, External Memory; shown in memory: Stack, Heap]
Using Memory Efficiently
3. Local Variables
• If stack is located on-chip, all functions can use it
4. Use the Heap
• Common memory reuse within C language
• A Heap (i.e. system memory): allocate, then free chunks of memory from a common system block
For example …
![Page 18: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/18.jpg)
Dynamic Example (Heap)
“Normal” (static) C Coding
#define SIZE 32
int x[SIZE]; /*allocate*/
int a[SIZE];
x={…}; /*initialize*/
a={…};
filter(…); /*execute*/
“Dynamic” C Coding
#define SIZE 32
x=malloc(SIZE*sizeof(int)); /*create*/
a=malloc(SIZE*sizeof(int));
x={…}; /*initialize*/
a={…};
filter(…); /*execute*/
free(a); /*delete*/
free(x);
• High-performance DSP users have traditionally used static embedded systems
• As DSPs and compilers have improved, dynamic systems often allow enhanced flexibility (more threads) at lower cost
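The create / execute / delete pattern can be written out as compilable C. This is a minimal sketch: the body of filter() and the data placed in the buffers are placeholders invented for illustration, since the slide elides them.

```c
#include <assert.h>
#include <stdlib.h>

#define SIZE 32

/* Hypothetical stand-in for the slide's filter(): sums x[i] * a[i]. */
static int filter(const int *x, const int *a, size_t n)
{
    int acc = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        acc += x[i] * a[i];
    }
    return acc;
}

/* Create, execute, delete. Returns 0 on success, -1 if allocation failed. */
int run_dynamic_filter(int *result)
{
    int *x;
    int *a;
    size_t i;

    assert(result != NULL);

    x = malloc(SIZE * sizeof *x);      /* create */
    a = malloc(SIZE * sizeof *a);
    if (x == NULL || a == NULL) {
        free(a);                       /* free(NULL) is a no-op */
        free(x);
        return -1;
    }

    for (i = 0; i < SIZE; i++) {       /* initialize (placeholder data) */
        x[i] = (int)i;
        a[i] = 1;
    }

    *result = filter(x, a, SIZE);      /* execute */

    free(a);                           /* delete */
    free(x);
    return 0;
}
```

Unlike the static version, the dynamic version has to handle allocation failure, which is the price paid for being able to reuse the memory afterwards.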
![Page 19: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/19.jpg)
Dynamic Memory
[Diagram: CPU, Program Cache, Data Cache, Internal SRAM, EMIF, External Memory; shown in memory: Stack, Heap]
Using Memory Efficiently
3. Local Variables
• If stack is located on-chip, all functions can use it
4. Use the Heap
• Common memory reuse within C language
• A Heap (i.e. system memory) can be allocated, then freed
What if I need two heaps? Say, a big image array off-chip and a fast scratch memory heap on-chip?
![Page 20: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/20.jpg)
Multiple Heaps
[Diagram: CPU, Program Cache, Data Cache, Internal SRAM, EMIF, External Memory; shown in memory: Stack, Heap, Heap2]
DSP/BIOS enables multiple heaps to be created
![Page 21: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/21.jpg)
Multiple Heaps with DSP/BIOS
• DSP/BIOS enables multiple heaps to be created
• Check the box & set the size when creating a MEM object
![Page 22: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/22.jpg)
Multiple Heaps with DSP/BIOS
• DSP/BIOS enables multiple heaps to be created
• Check the box & set the size when creating a MEM object
• By default, the heap has the same name as the MEM obj. You can change it here
How can you allocate from multiple heaps?
![Page 23: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/23.jpg)
MEM_alloc()
Standard C syntax
#define SIZE 32
x=malloc(SIZE);
a=malloc(SIZE);
x={…};
a={…};
filter(…);
free(a);
free(x);
Using MEM functions
#define SIZE 32
x = MEM_alloc(IRAM, SIZE, ALIGN);
a = MEM_alloc(SDRAM, SIZE, ALIGN);
x = {…};
a = {…};
filter(…);
MEM_free(SDRAM,a,SIZE);
MEM_free(IRAM,x,SIZE);
You can pick a specific heap
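The idea of choosing a heap by segment can be tried on a host without DSP/BIOS. The sketch below is not the MEM module: seg_alloc(), seg_reset(), the segment IDs and the pool sizes are all hypothetical, and the allocator is a simple bump allocator that can only release a whole segment at once. It only illustrates how a segment argument and an alignment argument select where a block comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment IDs standing in for IRAM and SDRAM. */
enum { SEG_IRAM = 0, SEG_SDRAM = 1, SEG_COUNT = 2 };

static unsigned char iram_pool[256];    /* small, fast "on-chip" heap  */
static unsigned char sdram_pool[4096];  /* large, slow "off-chip" heap */

typedef struct {
    unsigned char *base;  /* start of the pool      */
    size_t size;          /* total bytes available  */
    size_t used;          /* bytes handed out so far */
} Arena;

static Arena arenas[SEG_COUNT] = {
    { iram_pool,  sizeof iram_pool,  0 },
    { sdram_pool, sizeof sdram_pool, 0 },
};

/* Allocate size bytes from the chosen segment. align must be a power of
 * two. Returns NULL when the segment cannot satisfy the request. */
void *seg_alloc(int seg, size_t size, size_t align)
{
    Arena *a;
    uintptr_t start;
    uintptr_t aligned;
    size_t offset;

    assert(seg >= 0 && seg < SEG_COUNT);
    assert(align != 0 && (align & (align - 1)) == 0);

    a = &arenas[seg];
    start = (uintptr_t)(a->base + a->used);
    aligned = (start + (align - 1)) & ~(uintptr_t)(align - 1);
    offset = (size_t)(aligned - (uintptr_t)a->base);

    if (offset > a->size || size > a->size - offset) {
        return NULL;
    }
    a->used = offset + size;
    return a->base + offset;
}

/* Release everything in one segment (a bump allocator cannot free
 * individual blocks). */
void seg_reset(int seg)
{
    assert(seg >= 0 && seg < SEG_COUNT);
    arenas[seg].used = 0;
}
```

A request that is too large for the small pool fails instead of silently spilling into the large one, which mirrors the reason for naming the heap explicitly.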
![Page 24: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/24.jpg)
BUF Concepts
• Buffer pools contain a specified number of equal size buffers
• Any number of pools can be created
• Buffers are allocated from a pool and freed back when no longer needed
• Buffers can be shared between applications
• Buffer pool APIs are faster and smaller than malloc-type operations
• In addition, BUF_alloc and BUF_free are deterministic (unlike malloc)
• BUF APIs have no reentrancy or fragmentation issues
[Diagram: BUF_create builds a POOL of BUF buffers; a SWI or TSK takes a buffer with BUF_alloc and returns it with BUF_free; BUF_delete removes the pool]
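Why a pool of equal size buffers is deterministic and free of fragmentation can be seen in a small free-list sketch. This is not the BUF module: the Pool type, the pool_* functions and the two size constants are hypothetical, and the code is single-threaded, so it does not attempt the interrupt protection a real RTOS pool would need.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BUFS     4   /* number of buffers in the pool (assumed) */
#define POOL_BUF_SIZE 64  /* bytes per buffer (assumed) */

/* While a buffer is free, its storage holds the free-list link. */
typedef union PoolBuf {
    union PoolBuf *next;
    unsigned char data[POOL_BUF_SIZE];
} PoolBuf;

typedef struct {
    PoolBuf storage[POOL_BUFS];
    PoolBuf *free_list;
} Pool;

/* Thread every buffer onto the free list. */
void pool_init(Pool *p)
{
    size_t i;

    assert(p != NULL);
    p->free_list = NULL;
    for (i = 0; i < POOL_BUFS; i++) {
        p->storage[i].next = p->free_list;
        p->free_list = &p->storage[i];
    }
}

/* Constant time: pop the head of the free list. NULL when the pool is empty. */
void *pool_alloc(Pool *p)
{
    PoolBuf *b;

    assert(p != NULL);
    b = p->free_list;
    if (b != NULL) {
        p->free_list = b->next;
    }
    return b;
}

/* Constant time: push the buffer back onto the free list. */
void pool_free(Pool *p, void *buf)
{
    PoolBuf *b = buf;

    assert(p != NULL && buf != NULL);
    b->next = p->free_list;
    p->free_list = b;
}
```

Every allocation and release touches only the head of the list, so the cost does not depend on how many buffers are in use, and because all buffers have the same size, the pool cannot fragment.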
![Page 25: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/25.jpg)
GCONF Creation of Buffer Pool
Creating a BUF
1. right click on BUF mgr
2. select “insert BUF”
3. right click on new BUF
4. select “rename”
5. type BUF name
6. right click on new BUF
7. select “properties”
8. indicate desired
• Memory segment
• Number of buffers
• Size of buffers
• Alignment of buffers
• Gray boxes indicate effective pool and buffer sizes
![Page 26: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/26.jpg)
Using Memory Efficiently
1. Keep it on-chip
2. Use multiple sections
3. Use local variables (stack)
4. Using dynamic memory (heap, BUF)
5. Overlay memory (load vs. run)
6. Use cache
![Page 27: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/27.jpg)
Use Memory Overlays
[Diagram: CPU, Program Cache, Data Cache, Internal SRAM, EMIF, External Memory; shown in memory: algo1, algo2]
Using Memory Efficiently
5. Use Memory Overlays
• Reuse the same memory locations for multiple algorithms (and/or data)
• You must copy the sections yourself
First, we need to make custom sections ...
![Page 28: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/28.jpg)
Create Sections to Overlay
myCode.C
#pragma CODE_SECTION(fir, ".FIR");
int fir(short *a, …)
#pragma CODE_SECTION(iir, "myIIR");
int iir(short *a, …)
• How can we get them to run from the same location?
• Where will they be originally loaded into memory?
The key is in the linker command file …
![Page 29: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/29.jpg)
Load vs. Run Addresses
In your own linker cmd file:
• load: where the fxn resides at reset
• run: tells linker its runtime location
SECTIONS
{
    .FIR:  > IRAM /*load & run*/
    myIIR: load=IRAM, run=IRAM
}
[Diagram: Internal SRAM and External Memory, with .FIR and myIIR shown]
Simply directing a section into a MEM obj indicates that it both loads and runs from the same location:
.FIR:> IRAM
Alternatively, you could use:
.FIR: load=IRAM, run=IRAM
What if we wanted them to be loaded off-chip but run from on-chip memory?
![Page 30: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/30.jpg)
Load vs. Run Addresses
Simply specify different addresses for load and run
You must make sure they get copied (using memcpy() or the DMA)
• load: where the fxn resides at reset
• run: tells linker its runtime location
SECTIONS
{
    .FIR:  load=SDRAM, run=IRAM
    myIIR: load=SDRAM, run=IRAM
}
[Diagram: load addresses in External Memory, run addresses in Internal SRAM, for .FIR and myIIR]
Back to our original problem, what if we want them to run from the same address?
![Page 31: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/31.jpg)
Combining Run Addresses with UNION
SECTIONS
{
    .FIR:  load=SDRAM, run=IRAM
    myIIR: load=SDRAM, run=IRAM
}
Above, we only force different load/run
Below, we also force them to share (union) run locations
SECTIONS
{
    UNION run = IRAM
    {
        .FIR : load = EPROM
        myIIR: load = EPROM
    }
}
[Diagram: load addresses in External Memory, run addresses in Internal SRAM]
How can we make the overlay procedure easier?
![Page 32: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/32.jpg)
Using Copy Tables
SECTIONS
{
    UNION run = IRAM
    {
        .FIR : load = EPROM, table(_fir_copy_table)
        myIIR: load = EPROM, table(_iir_copy_table)
    }
}
![Page 33: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/33.jpg)
Using Copy Tables
typedef struct copy_record
{
    unsigned int load_addr;
    unsigned int run_addr;
    unsigned int size;
} COPY_RECORD;
typedef struct copy_table
{
    unsigned short rec_size;
    unsigned short num_recs;
    COPY_RECORD recs[2];
} COPY_TABLE;
[Diagram: fir_copy_table (marked “31”) with one copy record: fir load addr, fir run addr, fir size; iir_copy_table (marked “31”) with one copy record: iir load addr, iir run addr, iir size]
How do we use a Copy Table?
![Page 34: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/34.jpg)
Using Copy Tables
#include <cpy_tbl.h>
extern far COPY_TABLE fir_copy_table;
extern far COPY_TABLE iir_copy_table;
extern void fir(void);
extern void iir(void);
main()
{
    copy_in(&fir_copy_table);
    fir();
    ...
    copy_in(&iir_copy_table);
    iir();
    ...
}
• copy_in() provides a simple wrapper around memcpy().
• Better yet, use the DMA hardware to copy the sections; specifically, the DAT_copy() function.
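What a copy table does at runtime can be simulated on a host. The sketch below is not cpy_tbl.h: the Sim* types and sim_copy_in() are hypothetical stand-ins that use pointers instead of unsigned int addresses, and two small byte arrays play the role of the .FIR and myIIR sections. Both "sections" share one run region, like the UNION in the linker command file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Host-side stand-ins for COPY_RECORD and COPY_TABLE (assumed layout). */
typedef struct {
    const void *load_addr;  /* where the section is stored          */
    void *run_addr;         /* where it must be before it is used   */
    size_t size;            /* number of bytes to copy              */
} SimCopyRecord;

typedef struct {
    unsigned short num_recs;
    SimCopyRecord recs[1];
} SimCopyTable;

/* Copy every record from its load address to its run address. */
void sim_copy_in(const SimCopyTable *table)
{
    unsigned short i;

    assert(table != NULL);
    for (i = 0; i < table->num_recs; i++) {
        const SimCopyRecord *r = &table->recs[i];
        memcpy(r->run_addr, r->load_addr, r->size);
    }
}

/* Two "sections" stored separately (load) that share one run region. */
static const unsigned char fir_image[4] = { 'F', 'I', 'R', 0 };
static const unsigned char iir_image[4] = { 'I', 'I', 'R', 0 };
unsigned char overlay_region[4];  /* plays the role of the UNION in IRAM */

const SimCopyTable sim_fir_table = {
    1, { { fir_image, overlay_region, sizeof fir_image } }
};
const SimCopyTable sim_iir_table = {
    1, { { iir_image, overlay_region, sizeof iir_image } }
};
```

Calling sim_copy_in(&sim_fir_table) and then sim_copy_in(&sim_iir_table) overwrites the first image with the second, which is exactly why each algorithm must be copied in again before it runs.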
What could be even easier than using Copy Tables?
![Page 35: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/35.jpg)
Use Cache
[Diagram: CPU, Program Cache, Data Cache, Internal Cache, EMIF, External Memory; sections shown: .bss, .text]
Using Memory Efficiently
6. Use Cache
• Works for Code and Data
• Keeps local (temporary) scratch copy of info on-chip
• Commonly used, since once enabled it’s automatic
• Discussed further in Chapter 14
![Page 36: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/36.jpg)
Summary: Using Memory Efficiently
You may want to work through your memory allocations in the following order:
1. Keep it all on-chip
2. Use Cache (more in Ch 15)
3. Use local variables (stack on-chip)
4. Using dynamic memory (heap, BUF)
5. Make your own sections (pragmas)
6. Overlay memory (load vs. run)
While this tradeoff is highly application dependent, this is a good place to start
![Page 37: TMS320C6000 DSP Optimization Workshop Chapter 10 Advanced Memory Management Copyright © 2005 Texas Instruments. All rights reserved. Technical Training.](https://reader031.fdocuments.us/reader031/viewer/2022013122/56649f125503460f94c25a18/html5/thumbnails/37.jpg)