TigerSHARC CLU Closer look at the XCORRS M. Smith, University of Calgary, Canada...
Date posted: 21-Dec-2015
Overview
Recap GPS correlation
Look at XCORRS instruction in detail
  This was part of the take-home quiz for 5005
Additional information on the web
  Xcorrs.asm: assembly code discussed in class
  Xmain.cpp: demonstrates the use of the xcorrs.asm code
  XcorrsTest.cpp: demonstrates testing of all the functions being used
  Additional correlation presentations (not XCORRS) from Analog Devices developers
In 2005, we pointed out many errors in the TigerSHARC XCORRS explanation. If my figures are not the same as in the manual, then they fixed the manual errors.
GPS Positioning Concepts
(1)
For now make 2 assumptions:
  We know the distance to each satellite
  We know where each satellite is
With this information from 2 satellites, you know you are on a “plane of intersection”.
3 satellites are required for a 3-D position in this “ideal” scenario
4 satellites are required to account for local receiver clock drift
Determining Time
Use the PRN code to determine time
Use time to determine distance to the satellite
  distance = speed of light * time
(1)
Signal sent by satellite
Signal received by you
You know the signal sent
Perform correlations till you get a match
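The distance formula above can be sketched in a few lines. This is only an illustration of the arithmetic: the function name and the example delay of 70 ms are assumptions chosen for the sketch, not values from the lecture.

```python
# Minimal sketch of "distance = speed of light * time".
# SPEED_OF_LIGHT_M_PER_S is the defined SI value; the 0.07 s delay below is
# a hypothetical example, not a measurement from the lecture.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_delay(delay_s):
    """Return the distance in metres travelled by the signal in delay_s seconds."""
    return SPEED_OF_LIGHT_M_PER_S * delay_s


example_delay_s = 0.07  # assumed travel time, for illustration only
example_distance_m = distance_from_delay(example_delay_s)
print(f"{example_distance_m / 1000:.0f} km")
```

The delay itself is what the correlation step has to supply, which is why the rest of the lecture concentrates on finding the matching shift quickly.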
The practice
Suppose we have the vector of in-phase and out-of-phase data gathered over an antenna from a satellite, for example. Gain issues make it x16:
-16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, etc.
Question: if the original data from the satellite had this form
-1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, ...
how much is the satellite data delayed?
FOR THIS EXAMPLE: 0, 3, 6, 9, 12, etc.
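The answer above can be checked with a brute-force correlation over every shift. This is a sketch under stated assumptions: it uses a circular shift over 12 samples and multiplies by the complex conjugate of the PRN value, which is the usual textbook definition of complex correlation and is not taken from the slides. The helper name `correlate_at_shift` is hypothetical.

```python
def correlate_at_shift(received, prn, shift):
    """Correlate received data against the PRN pattern at one circular shift.

    Assumption: conjugate of the PRN value is used, as in the textbook
    definition of complex correlation.
    """
    total = 0j
    for i, p in enumerate(prn):
        total += received[(i + shift) % len(received)] * p.conjugate()
    return total


# Pattern from the slide: -1-j, 1+j, 1+j repeated; received data is x16.
prn = [-1 - 1j, 1 + 1j, 1 + 1j] * 4
received = [16 * v for v in prn]

# Real part of the correlation for every possible shift.
scores = [correlate_at_shift(received, prn, s).real for s in range(len(prn))]
peak = max(scores)
best_shifts = [s for s, v in enumerate(scores) if v == peak]
print(best_shifts)
```

Because the example pattern repeats every 3 samples, every third shift matches equally well, so the delay is only known modulo 3 for this toy data.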
Tackle the issue with FIR
First, modify the correlation function to handle complex values
  Ignore that issue at the moment
  Each tap goes from 1 add + 1 multiplication + 2 memory fetches to 3 adds + 4 multiplications + 4 memory fetches
Imagine 1024 data points + 1024 PRN values
  Need to do 1024 FIRs, each of 1024 taps
  We know how to optimize to do 2 taps every cycle (one in X and one in Y)
  Time is 1024 * 512 cycles, about 1 ms at 500 MHz
XCORRS can do 8 * 16 taps each cycle in each compute block, 148 times faster
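The cycle estimate for the FIR approach can be reproduced directly. This is a sketch of the arithmetic only; the variable names are hypothetical, and the XCORRS line simply applies the slide's 8 * 16 taps-per-cycle figure to a single compute block without modelling set-up or memory traffic.

```python
# Figures quoted on the slide.
num_filters = 1024        # one FIR per shift
taps_per_filter = 1024    # one tap per PRN value
clock_hz = 500e6          # 500 MHz core clock

# Optimized FIR: 2 taps per cycle (one in X, one in Y).
fir_taps_per_cycle = 2
fir_cycles = num_filters * taps_per_filter // fir_taps_per_cycle
fir_time_ms = fir_cycles / clock_hz * 1e3

# XCORRS, one compute block, using the slide's 8 * 16 taps per cycle.
xcorrs_taps_per_cycle = 8 * 16
xcorrs_cycles_one_block = num_filters * taps_per_filter // xcorrs_taps_per_cycle

print(fir_cycles, round(fir_time_ms, 3), xcorrs_cycles_one_block)
```

The FIR figure works out to 524288 cycles, a little over 1 ms, which is where the "1 ms at 500 MHz" on the slide comes from.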
Where does the CLU fit in?
XCORRS definition
THEORY: Mathematical definition
Uses registers
  TR -- accumulate
  D -- 8 data?
  C -- 1 coefficient?
And something called CUT, essentially a window operation
  fcut = 0 -- don’t use
2005 Lab. 4: Satellite data
Quad fetch brings in 8 complex values, 8 bits each
Pattern here is -1 + 0j, 1 + 0j, 1 + 0j, -1 + 0j, 1 + 0j, 1 + 0j, ...
PRN code – 2 bit complex number
Seems strange to have two dummy bits
But actually makes sense
PRN: -1 - j, 1 + j, 1 + j, -1 - j, 1 + j, 1 + j, ...
+1, -1 are associated with the PSK; more in another lecture
Problem: BINARY means 1 and 0, so how do we represent 1 and -1?
-1 values are stored as 1’s, +1 values are stored as 0’s (DAMY)
PRN
0x3 value goes in as C15 and C16
0011 -- C15 = -1 - j, C16 = +1 + j
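The 2-bit encoding can be sketched as a small decoder. The rule that a stored 1 means -1 and a stored 0 means +1 comes from the slide; which bit of the pair is real and which is imaginary, and the choice to unpack the lowest pair first, are assumptions made for this illustration. The function names are hypothetical.

```python
def decode_prn_pair(code):
    """Decode one 2-bit PRN code into a complex value.

    Slide rule: a stored 1 means -1, a stored 0 means +1.
    Assumption: upper bit is the real sign, lower bit is the imaginary sign.
    """
    real = -1 if (code >> 1) & 1 else 1
    imag = -1 if code & 1 else 1
    return complex(real, imag)


def unpack_prn(word, count):
    """Unpack `count` 2-bit codes from `word`, lowest pair first (assumed order)."""
    return [decode_prn_pair((word >> (2 * k)) & 0x3) for k in range(count)]


# The slide's example: 0x3 = binary 0011 gives C15 = -1 - j and C16 = +1 + j.
c15, c16 = unpack_prn(0x3, 2)
print(c15, c16)
```

For the values used in the lecture the two bits of a pair are always equal, so the real/imaginary bit-order assumption does not affect this example.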
Loading the THR registers
Standard XCORRS instruction
Lower 46 bits of THR1:0
R7:4
TR0, TR1, TR2 ……. TR15
TR15:0 = XCORRS(R7:4, THR3:0)
Doing 8 complex taps of 16 correlations at each cycle
TR0 += D7 * C22 + D6 * C21 + … (8 taps)
TR1 += D7 * C21 + D6 * C20 + … (8 taps)
...
TR15 += D7 * C7 + D6 * C6 + … (8 taps)
64 taps each cycle, on both x and y compute blocks, if set up properly
128 taps each cycle; these are “complex taps”, compared to 2 real taps / cycle after Lab. 3
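The index pattern in the TR equations above (TR0 pairs D7 with C22, TR15 pairs D7 with C7) can be written as TRk += sum over i of Di * C(i + 15 - k). The sketch below is a software model of that pattern only; it is inferred from the slide's equations and makes no claim about other details of the hardware instruction, such as bit widths or saturation. The function name `xcorrs_model` is hypothetical.

```python
def xcorrs_model(tr, d, c):
    """Accumulate 16 correlations of 8 taps each, following the slide's equations.

    tr: list of 16 accumulators (TR0..TR15), updated in place
    d:  list of 8 data values (D0..D7)
    c:  list of 23 coefficients (C0..C22)
    """
    for k in range(16):          # one accumulator per shift
        for i in range(8):       # 8 taps per accumulator
            tr[k] += d[i] * c[i + 15 - k]
    return tr


# Probe the index pattern: only D7 is non-zero and Cj carries its own index j,
# so each TRk ends up holding the index of the coefficient paired with D7.
tr = [0j] * 16
d = [0j] * 7 + [1 + 0j]
c = [complex(j, 0) for j in range(23)]
xcorrs_model(tr, d, c)
print(tr[0].real, tr[1].real, tr[15].real)
```

The 23 coefficients C0..C22 at 2 bits each account for the 46 bits of THR1:0 mentioned earlier.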
TR15:0 = XCORRS(R7:4, THR3:0) (CUT -7)
Because of offsets, sometimes we must only use “some of the taps”
TR0 += D7 * C22 + D6 * C21 + … (8 taps)
TR1 += D7 * C21 + D6 * C20 + … (8 taps)
...
TR14 += D7 * C8 + D6 * C7 (2 taps)
TR15 += D7 * C7 (1 tap)
TR15:0 = XCORRS(R7:4, THR3:0) (CUT -15)
TR0 += D7 * C22 + D6 * C21 + … (8 taps)
TR1 += D7 * C21 + D6 * C20 + … (7 taps)
...
TR7 += D7 * C15 (1 tap)
TR8 += 0 (0 taps)
...
TR15 += 0 (0 taps)
TR15:0 = XCORRS(R7:4, THR3:0) (CUT +7?)
TR0 += 0 (0 taps)
TR1 += D0 * C14 (1 tap)
...
TR7 += D6 * C14 + D5 * C13 + … (7 taps)
TR8 += D7 * C14 + D6 * C13 + … (8 taps)
...
TR15 += D7 * C7 + D6 * C6 + … (8 taps)
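The tap counts listed on the CUT slides are consistent with a window over the coefficient index: taps whose coefficient falls outside the window are dropped. The sketch below models that idea with two hypothetical parameters, `keep_lo` and `keep_hi`. How the instruction's actual CUT value maps onto such a window is not stated here, so the parameter values are chosen only to reproduce the tap counts shown on the slides.

```python
def xcorrs_windowed(tr, d, c, keep_lo=0, keep_hi=22):
    """Model of XCORRS with a coefficient window (hypothetical parameters).

    Only taps whose coefficient index j satisfies keep_lo <= j <= keep_hi
    are accumulated. Returns the number of taps used for each TRk.
    """
    taps_used = [0] * 16
    for k in range(16):
        for i in range(8):
            j = i + 15 - k               # coefficient paired with D[i] in TR[k]
            if keep_lo <= j <= keep_hi:
                tr[k] += d[i] * c[j]
                taps_used[k] += 1
    return taps_used


d = [1 + 0j] * 8
c = [1 + 1j] * 23

# Tap counts on the "CUT -7" slide: TR0..TR8 use 8 taps, TR14 uses 2, TR15 uses 1.
taps_cut_m7 = xcorrs_windowed([0j] * 16, d, c, keep_lo=7)
# Tap counts on the "CUT -15" slide: TR0 uses 8, TR7 uses 1, TR8..TR15 use 0.
taps_cut_m15 = xcorrs_windowed([0j] * 16, d, c, keep_lo=15)
# Tap counts on the positive-cut slide: TR0 uses 0, TR1 uses 1, TR8..TR15 use 8.
taps_cut_pos = xcorrs_windowed([0j] * 16, d, c, keep_hi=14)
print(taps_cut_m7, taps_cut_m15, taps_cut_pos)
```

Seen this way, the window with `keep_lo=15` and the window with `keep_hi=14` are complementary: between them every tap of the uncut instruction is used exactly once.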
TR15:0 = XCORRS(R7:4, THR3:0) (CUT -15)
TR0 += D7 * C22 + D6 * C21 + … (8 taps)
TR1 += D7 * C21 + D6 * C20 + … (7 taps)
...
TR7 += D7 * C15 (1 tap)
TR8 += 0 (0 taps)
...
TR15 += 0 (0 taps)
TR15:0 = XCORRS(R7:4, THR3:0) (CUT -7)
TR0 += D7 * C22 + D6 * C21 + … (8 taps)
TR1 += D7 * C21 + D6 * C20 + … (8 taps)
...
TR14 += D7 * C8 + D6 * C7 (2 taps)
TR15 += D7 * C7 (1 tap)
TR15:0 = XCORRS(R7:4, THR3:0)
TR0 += D7 * C22 + D6 * C21 + … (8 taps)
TR1 += D7 * C21 + D6 * C20 + … (8 taps)
...
TR15 += D7 * C7 + D6 * C6 + … (8 taps)
64 taps each cycle, on both x and y compute blocks, if set up properly
128 taps each cycle; these are “complex taps”, compared to 2 real taps / cycle after Lab. 3
Problem at this point -- THR3:2 empty
Need to bring in more PRN values
TR15:0 = XCORRS(R7:4, THR3:0) (CUT +15)
TR0 += 0 (0 taps)
TR1 += D0 * C14 (1 tap)
...
TR7 += D6 * C14 + D5 * C13 + … (7 taps)
TR8 += D7 * C14 + D6 * C13 + … (8 taps)
...
TR15 += D7 * C7 + D6 * C6 + … (8 taps)
Final Result
Maximum correlation occurs every 3 shifts, which is what we expect
Is it the correct result?
Correlation – result expected
In step
-1 + 0j, 1 + 0j, 1 + 0j, … 16 times
with
-1 - j, 1 + j, 1 + j, … 16 times
-1 * -1 + 1 * 1 + 1 * 1 + … = 48 = 0x30 -- Real component
Out of step
-1 + 0j, 1 + 0j, 1 + 0j, … 16 times
with
1 + j, 1 + j, -1 - j, … 16 times
-1 * 1 + 1 * 1 + 1 * -1 + … = -16 = -0x10 = 0xFFF0
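The two expected values can be confirmed with a short check. This sketch multiplies each data value by the PRN value as written on the slide (no conjugate) and takes the real component; the 16-bit mask used to show 0xFFF0 is an assumption about the width at which the result is displayed. The helper name is hypothetical.

```python
def real_correlation(data, prn):
    """Real component of the sum of data[i] * prn[i], as on the slide."""
    return int(sum(d * p for d, p in zip(data, prn)).real)


data = [-1 + 0j, 1 + 0j, 1 + 0j] * 16          # satellite data, 16 repeats
prn_in_step = [-1 - 1j, 1 + 1j, 1 + 1j] * 16   # PRN aligned with the data
prn_out_of_step = [1 + 1j, 1 + 1j, -1 - 1j] * 16

in_step = real_correlation(data, prn_in_step)
out_of_step = real_correlation(data, prn_out_of_step)

# Show the negative result as a 16-bit two's complement value (assumed width).
out_of_step_hex = hex(out_of_step & 0xFFFF)
print(hex(in_step), out_of_step, out_of_step_hex)
```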
Final Result
1) Now have correlation values for 16 shifts in TR registers
  Store to external memory
  Repeat for all other necessary shifts, then find the maximum
2) Now make parallel in SISD mode
3) Now make parallel in SIMD