SPECtral version 4.0
1) SLICE – Slices the stereo field up into SL, FL, C, FR, SR. An idea borrowed
from SMaCS, but this time using multiple stages of SPEC Center Cut.
2) CC – Center Cut. An algorithm borrowed from Center Cut GUI but
implemented using Plogue Spectral Bidules.
3) LCR – Algorithms that came before ambisonics, implemented using Plogue
Spectral Bidules.
4) ArcTan – Smoothly expands and spreads the stereo field up to 360 degrees.
4.5) Zpan – A Constant Power Panner to allow you to widen or narrow your
sound field and make your speakers disappear into the sound field.
Spectral Separation Experimental Converter (SPEC)
Glenn C. Newell
9-9-09
Changes Since SPEC 3.x
This section covers the changes between SPEC 4.0 and previous versions, as well as “ZPan”, a bidule
external to SPEC which can be used to “widen” or “narrow” the sound field output by SPEC.
Both ArcTan and ZPan use Constant Power Panning calculations, Zpan in the time domain, and ArcTan in
the Spectral domain. Both have been written in C++ utilizing the Bidule SDK, to minimize DSP Load.
The ArcTan version inside SPEC 4.0 has been completely re-written, with all its core functions in C++.
The ArcTan algorithm is now implemented in a continuous and smooth way, as I originally envisioned it,
unlike earlier versions of ArcTan, which used switching bidules rather than panning. I think you will be pleased with
the result. While it creates a very different sound from SLICE and the other SPEC methods, the quality is
very good and I suspect you will find it preferable to the other SPEC methods for certain types of music
or certain ways in which your source music was mixed by the producer. ArcTan also has an option to
blend in SLICE rears.
In version 4.0 the SLICE and CC methods have also been recreated using C++ and the Bidule SDK to
optimize DSP load.
Overview
“SPEC” is my attempt at conversion separation methods using Plogue’s spectral, or FFT, capabilities.
I was researching ways of bringing the Center Cut Algorithm “inside” Plogue, as a VST or Native plugin.
The Center Cut Algorithm uses math called “FHT”, but the author goes on to mention that he could have
used FFT as well.
I had already dabbled some with FFTs in Plogue, but not for separation, so this Center Cut research
inspired me to explore some more.
For version 1.x of SPEC, the CC algorithm was not possible, because there was no square root spectral
function, so this first version duplicated the math in the various LCR bidules, but done in the
Frequency Domain rather than the Time Domain.
In the Time Domain (what we usually work in, in Plogue: the blue lines) you have one number per
sample that represents the amplitude of the audio at that point in time. Methods like LCR do
multiplication, addition, and subtraction on those numbers.
FFT (Fast Fourier Transform) is one method of taking Time Domain data and converting it into the
Frequency Domain (and iFFT puts it back).
In the Frequency Domain you have two sets of numbers to represent sound. The Frequency “Bin”
(Yellow line) and the Magnitude of the Frequency Bin (Orange lines).
So instead of a series of amplitudes, you have a series of Frequencies and Magnitudes. If your FFT size is
8192, then you will need 4096 pairs of Frequency and Magnitude data to describe a sound.
Some fun stuff can be done in the Frequency Domain, such as changing the tempo of a song without
changing the pitch, or changing the pitch without changing the tempo. Great for DJs to match up beats
or make “mash ups”. That is not our purpose here, but we can apply the same multiplication,
addition, and subtraction methods to the magnitude numbers, and then put things back in the Time
Domain and see how they sound. That is what SPEC does.
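The round trip described above can be sketched in a few lines of Python. This is only an illustration, not the code inside SPEC: it uses a slow textbook transform instead of a real FFT, the function names are made up for this example, and it keeps each bin's phase alongside its magnitude (a detail the description above leaves out) so that the signal can be rebuilt.

```python
import cmath
import math

def dft(samples):
    """Naive forward transform: time-domain samples -> complex frequency bins."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(bins):
    """Inverse transform: complex frequency bins -> real time-domain samples."""
    n = len(bins)
    return [(sum(bins[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def scale_magnitudes(bins, gain):
    """Multiply each bin's magnitude by gain, leaving its phase unchanged."""
    return [cmath.rect(abs(b) * gain, cmath.phase(b)) for b in bins]

# One cycle of a sine wave, 16 samples long (Time Domain).
N = 16
signal = [math.sin(2 * math.pi * t / N) for t in range(N)]

bins = dft(signal)                           # to the Frequency Domain
magnitudes = [abs(b) for b in bins]          # one magnitude per frequency bin
halved = idft(scale_magnitudes(bins, 0.5))   # do the math, then back to the Time Domain
```

Here the "math" is just a multiplication of every magnitude by 0.5, so `halved` is the original sine wave at half the amplitude. The SPEC methods apply different arithmetic to the left and right magnitudes, but the overall shape (transform, process, transform back) is the same.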
Since that first version, I used the Plogue Bidule SDK to create a spectral square root function, which
enabled me to use the “Center Cut” algorithm as the 2nd SPEC method, released as SPEC 2.0.
Next, mulling over further possibilities, I borrowed the concept of “slicing” the original stereo sound
field into pieces that become SL, FL, C, FR, and SR from SMaCS but did it by cascading parts of SPEC
Center Cut in two stages. The result is “SLICE”, the 3rd SPECtral method, released in SPEC 3.0.
SPEC 3.4.5 brought the optional addition of further stages of the CC Algorithm to extend the sound field
to a full 360 degrees or “WRAP”.
SPEC 4.0 brought the addition of the ArcTan method and (external to SPEC) Zpan, both based on
Constant Power Panning Algorithms.
ArcTan is a spectral method that uses the Arc Tangent of the ratio of the Left and Right frequency
magnitudes to determine the “angle” of the sound the producer created for that frequency in the stereo
field. This angle is then magnified to fill a 360 or smaller degree sound field. The Pythagorean theorem is
used to calculate the magnitude of the frequency to output at the calculated angle. A Constant Power
Panner is used to “pan” each frequency between the appropriate pair of speakers to make the sound
appear to come from the calculated angle.
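For one frequency bin, the angle and magnitude steps might look roughly like the Python sketch below. It is not the C++ code in the plugin. The function name, the sign convention (0 degrees at center, negative to the left, positive to the right), and the scaling by width/90 are assumptions made for this example; the last one is based on the Image Width description later in this guide, where 90 degrees is said to recreate the stereo image.

```python
import math

def arctan_bin(left_mag, right_mag, image_width_deg):
    """Return (output_angle_deg, output_magnitude) for one frequency bin.

    Assumed conventions (not taken from the guide): 0 degrees is the
    center, negative angles are to the left, positive to the right.
    """
    # atan2 gives 0 degrees for hard left, 45 for center, 90 for hard right.
    stereo_angle = math.degrees(math.atan2(right_mag, left_mag))
    # Re-center on 0 and magnify the 90 degree stereo span to the image width.
    output_angle = (stereo_angle - 45.0) * (image_width_deg / 90.0)
    # Pythagorean theorem for the combined magnitude of the bin.
    output_mag = math.hypot(left_mag, right_mag)
    return output_angle, output_mag

center = arctan_bin(1.0, 1.0, 360.0)      # equal left and right magnitudes
hard_right = arctan_bin(0.0, 1.0, 360.0)  # sound only in the right channel
hard_left = arctan_bin(1.0, 0.0, 240.0)   # sound only in the left channel
```

With these assumptions, a bin that is equally loud in both channels stays at 0 degrees, and a bin that is only in the right channel lands at half the image width to the right (180 degrees for a 360 degree width, which is directly behind the listener).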
Zpan allows you to place the sound output from SPEC (or another method) between speakers vs. having
the sound come directly from the speakers. This allows you to widen or “open up” the sound (or narrow
it) as well as creating a balanced sound field in which your speakers “disappear”.
Note that the SPEC methods can be used either as complete methods in themselves, or as a group in a
larger layout, to create just the fronts, just the rears, just the center, feed ambisonic or other layouts
that take 5.0 or 5.1 inputs, etc. In short, play with it!
I hope you enjoy experimenting with SPEC.
Required plugin: LIVE PC http://www.box.net/shared/m4osaeu65h
Version 21 or higher.
Note: This is a Plogue Bidule Plugin, not a VST. The dll file goes in Bidule\plugins. If you don’t already
have that directory you need to create it.
The installation instructions for LIVE PC mention an MS C++ runtime, which has been updated since that
time. Instead, you need the Microsoft Visual C++ 2008 SP1 Redistributable Package installed:
http://www.microsoft.com/downloads/details.aspx?familyid=A5C84275-3B97-4AB7-A40D-3802B2AF5FC2&displaylang=en
Live PC shows up in Plogue as “Live PC” on your palette menu.
Package Contents and Installation Instructions
1) The dll file in the VST folder goes in the VST folder Plogue is pointed at (see your preferences if
you don't know).
2) The dll files in the plugins folder go in your plugins folder under Program Files\Plogue\Bidule. If
you don't already have a plugins folder you will need to make one or just drag this on to
Program Files\Plogue\Bidule.
3) The plugin for ArcTan was compiled with SSE2 extensions. If you have a CPU that doesn't
support SSE2 extensions (doubtful) you can use the plugin in the non-SSE2 ArcTan folder instead
of the ArcTanPan.dll in the plugins folder (one or the other goes in your plugins folder, NOT
both!).
4) The file in the layout folder goes in your Files\Plogue\Bidule\layouts folder.
5) Stop and restart Plogue and open the new layout.
Updated SPEC Instructions are also included in the group controls. Just double click on a SPEC 3.0 group
and then click on the "Instructions: Send Command" button.
Instructions:
Inputs:
1) Original Left
2) Original Right
Outputs*:
1) Front Left (Original Left but time delayed to match other outputs)
2) Front Right (Original Right but time delayed to match other outputs)
3) Center
4) LFE (If Bass Boost is on)
5) Surround Left
6) Surround Right
7) Delay (samples)
*In SLICE and ArcTan mode, all 5.0/5.1 outputs are unique (no “Original Left and Right”). In CC mode, SL
and SR outputs correspond to the “SIDE” outputs of Center Cut GUI. In LCR mode, SL and SR outputs
correspond to LCR L and R outputs.
Group Controls:
1) Instructions – click this check box to open a short and crisp set of instructions.
2) Pre Gain – Input Gain (typically -4 to -10 dB). Listen for distortion and/or right click on SPEC,
select monitor, and watch for clipping in the output, or lower left, six channels (Channel 7 is
the latency samples).
Adjust the pre-gain for no distortion or clipping at the output of SPEC.
3) Method Selection – You need to have processing turned on when you change methods. Check
the processing indicators, below the method selection drop down, to make sure your selection is
processing. Also note, you need to select the CC method when setting the FFT Latency. Once set,
change your method back to the one you want to use for conversion.
4) Method Controls – Use these checkboxes to open or close the additional method specific
controls
5) FFT Window Type – All of these sound pretty similar except Rectangular. Choose the one you
like (Rectangular may require a more negative pre-gain than the others).
6) Check and Set FFT Latency
In Plogue version 0.9690 and earlier, there is a bug where the FFT Delay constant is incorrect. To get around this, SPEC includes its own "Latency Test" to set the FFT Delay time.
To use it, turn processing ON, but stop any audio playback. Select the CC Method. Make sure the indicator says “Processing” next to CC. Then press the "Check and Set FFT Latency" button in the SPEC group controls. The resulting number of samples will show. A result of zero, or 44544, means you did something wrong.
After saving the SPEC group (for use in other layouts) and saving the current layout, you shouldn't have to do this again unless you change your DSP FFT settings in Plogue Preferences.
On faster computers, your DSP settings should be:
FFT Window Size: 8192
FFT Overlap: 16
Higher Precision FFT: Checked
However, if that creates too much DSP load you can still get good quality with:
FFT Window Size: 4096
FFT Overlap: 4
Higher Precision FFT: Checked
The "overlap" seems to be what affects the DSP load the most. Note that if you are going to insist on up converting, you will need a higher FFT Window Size to achieve the same quality (hint hint).
7) Additional Bass Boost – Adds bass in the LFE channel (eek!), using HNM filter
8) Open Bass Boost Controls – Opens additional controls for Bass Boost
9) Output Gains – Use to control surround balance in standalone methods, or to control the levels
to other bidules in your layout. The FL and SL Gains will move the FR and SR gains as a pair, but
you can adjust the FR and SR gains independently if you need to.
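The Pre Gain range in item 2 of the Group Controls is given in decibels. As a rough illustration of what a setting in that range does to the samples, the Python sketch below uses the standard amplitude conversion (a factor of 10 to the power of dB/20). The function names and the full-scale value of 1.0 are assumptions for this example, not something read out of SPEC.

```python
def db_to_linear(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)

def apply_pre_gain(samples, gain_db):
    """Scale every sample by the linear equivalent of gain_db."""
    factor = db_to_linear(gain_db)
    return [s * factor for s in samples]

def clips(samples, full_scale=1.0):
    """True if any sample exceeds full scale (assumed here to be 1.0)."""
    return any(abs(s) > full_scale for s in samples)

hot_signal = [1.2, -0.3, 0.9]             # would clip as-is
tamed = apply_pre_gain(hot_signal, -4.0)  # -4 dB is a factor of about 0.63
```

A Pre Gain of -4 dB multiplies every sample by roughly 0.63 and -10 dB by roughly 0.32, which is why a more negative setting gives the processing more headroom before clipping.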
Method Specific Controls
SLICE
SLICE Method Stage Humidity Sliders – Stage one is the humidity setting for the input of stage 2, and
affects FL, FR, SL, and SR. Stage two is the humidity setting for the SL and SR outputs. The Wrap
Humidity is enabled when the Wrap Rears checkbox is checked.
A Humidity of 1 = 100% wet, or SLICE processed signal. A Humidity of 0 = 0% wet or 100% dry
signal from the previous stage (or original left and right in the case of stage one). Humidities near
one are used to decrease any artifacts heard in the outputs.
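A humidity setting can be pictured as a simple wet/dry mix. The Python sketch below assumes a straight linear crossfade between the dry and processed signals; the guide does not say exactly how SPEC blends them, and the function name is made up for this example.

```python
def apply_humidity(dry, wet, humidity):
    """Blend dry and wet sample lists: 1.0 is fully wet, 0.0 is fully dry."""
    if not 0.0 <= humidity <= 1.0:
        raise ValueError("humidity must be between 0 and 1")
    return [humidity * w + (1.0 - humidity) * d for d, w in zip(dry, wet)]

dry = [0.5, -0.5, 0.25]   # signal from the previous stage
wet = [0.1, -0.2, 0.0]    # processed signal
mostly_wet = apply_humidity(dry, wet, 0.9)
```

At 0.9, each output sample is 90% processed signal with 10% of the previous stage mixed back in.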
Wrap Rears adds a third stage of separation. This has the effect of taking what was the extreme
outsides of the original stereo field and placing it in Both rear speakers, creating a virtual center
rear. This causes the sound field to wrap around you 360 degrees.
CC Controls
CC Method Humidity – Sets the Humidity for the SL and SR outputs
A Humidity of 1 = 100% wet, or SLICE processed signal. A Humidity of 0 = 0% wet or 100% dry
signal from the previous stage (or original left and right in the case of stage one). Humidities near
one are used to decrease any artifacts heard in the outputs.
Wrap Rears adds a third stage of separation. This has the effect of taking what was the extreme
outsides of the original stereo field and placing it in Both rear speakers, creating a virtual center
rear. This causes the sound field to wrap around you 360 degrees.
LCR Controls
LCR Method N+M Factors
Default settings for the N and M factors are 1.0. These settings affect only the SL and SR output channels, but psychoacoustically they interact with all the other channels (and their output Gains). Some other settings to try:
Gerzon: Set N to 0.885 and M to 0.115 (try other values of N and M that add up to 1.0)
LCR M: Set N to 1.0 and vary M.
LCR G: Set N to 1.0 and vary M and the Center output gain inversely
Note that SPEC produces different results than LCR, even with matching "M" settings.
ArcTan Controls and explanation
The SPEC 4.0 Control Panel contains new controls for the ArcTan method.
When selecting methods within SPEC be sure processing is turned on and double check that the
indicators under the Method Selection show “Processing” for the selected method.
As with previous versions of SPEC, you must select the CC method and “Check and Set FFT Latency”
before selecting the method you will actually use for your conversion.
ArcTan Controls:
ArcTan is a spectral method that uses the Arc Tangent of the ratio of each of the Left and Right
frequency magnitude bins to determine the “angle” of the sound the producer created for that
frequency in the stereo field. This angle is then magnified to fill a 360 or smaller degree sound field using
the “Image Width” control.
The “Image Width” control can be set from 0 to 360 degrees. A setting of zero will result in all of the sound
coming from only the center channel (assuming no “Adjacent Speaker” or “Blend Controls” are used). A
setting of 90 degrees should recreate the stereo image (but utilizing all three front speakers). A setting
equal to the larger angle between your rear speakers will spread the stereo image from center all the
way around to your rears, and a setting of 360 degrees will spread the stereo image 360 degrees around
you so that what was the extreme outsides of the stereo field appear to come from behind you,
between the rear speakers (as in SLICE or CC with “Wrap” turned on).
While any setting between zero and 360 degrees can be used, it is assumed that settings between the
larger angle between your rear speakers and 360 degrees (inclusive) would be used in conversions (for
instance if the larger angle between your rear speakers is 240 degrees, you would probably want to
experiment with “Image Width” settings between 240 and 360 degrees inclusive).
In ArcTan you can further modify the sound field with the Rearward Bias control. This control changes
the relationship between the ArcTan angle (as modified by the width control) and the actual angle each
frequency bin is output at. With the control all the way to the left (1.0) the relationship is linear (no
effect) and as you move the control to the right the relationship becomes more and more exponential
(see the chart at the end of this guide). This has the effect of stretching the sound field or pushing more
of it toward the rear speakers from the front and center. The output angle is “capped” at the value of
the width control.
If you want to preserve the original bias of the sound field keep the Rearward Bias control at 1.0. This
setting will also save a small amount of DSP load.
As you move the Rearward Bias control to the right you may find that the center channel sound “slips
away” from the center channel to one side or the other. In order to bring the “central” sound back to the
center speaker a “Re-Center” control has been added. This simply adjusts the balance between the left
and right audio signals at the input to ArcTan. The Re-Center control is only needed in combination with
larger settings of the Rearward Bias control. Otherwise, Re-Center should be left at the center or 0.0
position.
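This guide does not give the formula behind the Rearward Bias control, so the Python sketch below is only a guess at its general shape. It assumes the bias value is used as an exponent on the angle (which is linear at 1.0 and grows faster as the value rises) and applies the cap at the width setting described above. The function name and the power-law choice are assumptions for illustration, not the plugin's actual calculation.

```python
def rearward_bias(angle_deg, bias, width_deg):
    """Stretch a non-negative angle toward the rear, capped at the width.

    Assumption: output = angle ** bias. The real curve in ArcTan may differ.
    """
    if angle_deg < 0.0:
        raise ValueError("this sketch only handles non-negative angles")
    stretched = angle_deg ** bias       # bias of 1.0 leaves the angle unchanged
    return min(stretched, width_deg)    # never exceed the width control

unchanged = rearward_bias(100.0, 1.0, 360.0)
pushed_back = rearward_bias(100.0, 1.15, 360.0)
capped = rearward_bias(351.0, 1.15, 360.0)
```

Under this assumption a bias of 1.0 passes 100 degrees through untouched, 1.15 moves it to roughly 200 degrees, and angles that would be stretched past the width setting are held at the width.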
Following the above described controls, ArcTan has three different modes of distributing the sound into
the surround field:
1) Pythagoras - The Pythagorean theorem is used to calculate the magnitude of the frequency to
output at the calculated angle. This should be the most accurate reproduction/expansion of the
original mix
2) Across – The louder of the original left and right magnitude is output at the calculated angle,
and the quieter signal is output at an equal angle on the other side of zero degrees. This will give
a more “full” sound, with more sound being concentrated toward the front and center speakers
3) Diagonal - The louder of the original left and right magnitude is output at the calculated angle,
and the quieter signal is output at an angle 180 degrees from the calculated angle. This has the
interesting effect of putting sounds that are slightly off center behind you, in both rears. I’ve
found this to be particularly useful in songs where harmony vocals are panned just a little left or
right (vs. the lead vocal in the center). An example would be Sade’s “Lovers Rock” Album and
the song "Every Word" in particular.
Note that larger Rearward Bias settings are another possible way to spread out things panned
only slightly off center in the original stereo.
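The three modes above can be compared side by side with a small Python sketch. It takes the left and right magnitudes of one frequency bin plus the already calculated angle, and returns where each magnitude would be sent. The function names, the mode strings, and the signed-angle convention (0 degrees at center, range -180 to 180) are assumptions made for this example.

```python
import math

def wrap_degrees(angle_deg):
    """Wrap an angle into the range above -180 and up to 180 degrees."""
    wrapped = (angle_deg + 180.0) % 360.0 - 180.0
    return 180.0 if wrapped == -180.0 else wrapped

def distribute(left_mag, right_mag, angle_deg, mode):
    """Return a list of (angle_deg, magnitude) pairs for one frequency bin."""
    louder = max(left_mag, right_mag)
    quieter = min(left_mag, right_mag)
    if mode == "pythagoras":
        # One output: the combined magnitude at the calculated angle.
        return [(angle_deg, math.hypot(left_mag, right_mag))]
    if mode == "across":
        # Quieter signal mirrored to the other side of zero degrees.
        return [(angle_deg, louder), (-angle_deg, quieter)]
    if mode == "diagonal":
        # Quieter signal placed 180 degrees away from the calculated angle.
        return [(angle_deg, louder), (wrap_degrees(angle_deg + 180.0), quieter)]
    raise ValueError("unknown mode: " + mode)

# A sound panned slightly right of center, calculated angle of 10 degrees.
pythagoras = distribute(0.6, 0.8, 10.0, "pythagoras")
across = distribute(0.6, 0.8, 10.0, "across")
diagonal = distribute(0.6, 0.8, 10.0, "diagonal")
```

In the Diagonal case the quieter magnitude for this bin ends up at -170 degrees, which is nearly straight behind the listener. That matches the description of slightly off-center sounds showing up in both rears.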
At the output of all the above controls a Constant Power Panner is used to “pan” each frequency
between the appropriate pair of speakers to make the sound appear to come from the targeted angle.
In order for the Constant Power Panner to perform the correct calculation, it needs to know where your
speakers are located.
Rather than assuming ITU speaker positions, ArcTan provides sliders for you to input the precise
speaker angles of your system. At the time of this writing, the effect of a playback system having
different speaker angles than the settings used in ArcTan during recording has not been tested.
Speaker angles are measured in degrees counter clockwise from the (normal) center channel position.
Note that in this layout the ArcTan speaker angles are linked from the ZPan speaker angles, so you can
set your speaker angles once in Zpan, and have ArcTan also be set correctly (the reverse is not true).
The last page of this document contains a 360 degree chart to assist you in measuring your speaker
positions.
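To show why the panner needs the speaker angles, here is a minimal Python sketch of constant power panning between whichever two speakers sit on either side of a target angle. It uses the common cosine/sine gain law. The function name is made up, and the example speaker angles are the usual ITU positions written counter clockwise from center, used here only as sample input. The code in ArcTan itself may be organized quite differently.

```python
import math

def constant_power_pan(target_deg, speaker_angles_deg):
    """Return {speaker_name: gain} for a sound at target_deg.

    Angles are in degrees counter clockwise from the center position.
    Only the two speakers on either side of the target get a non-zero gain.
    """
    target = target_deg % 360.0
    ordered = sorted((angle % 360.0, name) for name, angle in speaker_angles_deg.items())
    gains = {name: 0.0 for name in speaker_angles_deg}
    for i, (angle_a, name_a) in enumerate(ordered):
        angle_b, name_b = ordered[(i + 1) % len(ordered)]
        span = (angle_b - angle_a) % 360.0     # arc from speaker a to speaker b
        offset = (target - angle_a) % 360.0    # arc from speaker a to the target
        if span > 0.0 and offset <= span:
            fraction = offset / span           # 0 at speaker a, 1 at speaker b
            gains[name_a] = math.cos(fraction * math.pi / 2.0)
            gains[name_b] = math.sin(fraction * math.pi / 2.0)
            return gains
    raise ValueError("need at least two distinct speaker angles")

# Example layout (assumed ITU-style angles, counter clockwise from center).
speakers = {"C": 0.0, "FL": 30.0, "SL": 110.0, "SR": 250.0, "FR": 330.0}
front_gains = constant_power_pan(15.0, speakers)   # halfway between C and FL
rear_gains = constant_power_pan(180.0, speakers)   # directly behind the listener
```

Because the two gains are a cosine and a sine of the same value, their squares always add up to one, so the total power stays constant as a sound moves from one speaker to the next. If the speaker angles entered do not match the real speakers, the fraction between the pair is calculated for the wrong arc and the sound is placed at the wrong angle.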
This new C++ version of ArcTan retains the “Adjacent Speaker” and “Blend Controls” from previous
versions.
The adjacent speaker control lets you add in some signal destined for each speaker to the speakers on
either side of it. This can “fatten” or “fill in” the sound.
With the control all the way to the left (The default) this control is “off”.
Also included is a selection to have the opposite rear speaker included (Wrap Rears ON) or not.
Rather than simple humidity controls, ArcTan’s Blend Controls let you select the source of the “dry”
signal to blend with its outputs. The below screen shot shows the default settings. With all controls set
to 1.0 you will hear the “pure” output of the ArcTan method. Feel free to experiment.
The Rear Blend faders allow you to blend in rears from the SLICE method. I have found that ArcTan fronts
and 33% blended SLICE rears (66% SLICE, 0.33 on the sliders) is an excellent combination for many types
of music. Using ArcTan AND SLICE at the same time does increase the DSP load however, so you should
keep the “Activate SLICE” drop down on “MUTE” if you're not using SLICE blended rears.
ZPan
Included in this layout is a new bidule called Zpan.
Zpan allows you to place the sound output from SPEC (or another method) between speakers vs. having
the sound come directly from the speakers. This allows you to widen or “open up” the sound (or narrow
it) as well as creating a balanced sound field in which your speakers “disappear”.
Thus Zpan can replace 5 channel input/output Ambisonic methods, and do so without adding any
significant DSP load to your layout.
Note that Zpan is probably not needed for ArcTan, as ArcTan already creates a continuous sound field,
vs. discrete speaker channels, but the other SPEC methods (SLICE, CC, LCR) can definitely benefit from
using ZPan.
Pan Angles are measured in degrees clockwise from center (positive) and degrees counter clockwise
from zero (negative). So the range of panning is from -180 degrees to 180 degrees. Note that both 180
degrees and -180 degrees create a pan angle that causes the sound to come from directly behind you,
between the rear speakers.
A negative pan angle equal to the angle from center to your left rear speaker would result in the sound
coming directly from your left rear speaker. The easiest way to get a feel for this is to use the Output
Gain controls in SPEC to turn off all but one channel, and then move the ZPan pan control for that
channel around to hear the result.
The “C Pan” Control has no effect unless you check off the “Pan Center ?” checkbox. This is to save DSP
load.
You will notice that, similarly to the output gain controls in SPEC, the Zpan left and right pan controls
are linked so that if you move a “left” pan control the corresponding “right” pan control moves at the
same time. This allows you to smoothly hear the effect of widening or narrowing the front or rear sound
field by moving the LF or LS pan controls only.
You can set different pan angles for Left and Right speaker pairs by moving first the left Pan control and
then the corresponding right control.
You will also quickly notice that the control linking behavior is different from the SPEC Output gain
controls in that the pan controls move OPPOSITE to each other, and that this movement only occurs
when processing is turned on (Because we need to do math between the links).
In order for the Constant Power Panner to perform the correct calculation, it needs to know where your
speakers are located.
Rather than assuming ITU speaker positions, ZPan provides sliders for you to input the precise speaker
angles of your system. At the time of this writing, the effect of a playback system having different
speaker angles than the settings used in Zpan during recording has not been tested.
As with ArcTan, speaker angles are measured in degrees counter clockwise from the (normal) center
channel position.
One could assume that the target audience has ITU setups, in which case it might make sense to set the
speaker angles to ITU angles AFTER you have balanced your sound but before you do your conversion,
but again this assumption has not been tested (perhaps a poll of actual speaker angles would need to be
done to determine the average speaker positions?).
However in general the idea is that you set your speaker angles to match the position of your speakers
once, and only adjust the Pan Angles per conversion (or track).
Note that if the Pan Angles are set to match the position of your speakers ZPan has no effect on the
sound (remembering that pan angles are measured from -180 (left) to +180 (right) and speaker angles
are measured zero to 360 degrees counter clockwise).
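Since pan angles and speaker angles use different scales, it can help to convert one into the other when working out which Pan Angle matches a given speaker. The Python sketch below does that conversion using the two conventions stated above. The function names are made up for this example.

```python
def speaker_angle_to_pan_angle(speaker_deg):
    """Convert 0 to 360 counter clockwise into -180 (left) to +180 (right)."""
    ccw = speaker_deg % 360.0
    if ccw == 0.0:
        return 0.0
    # Counter clockwise up to 180 is on the left (negative pan angles).
    return -ccw if ccw <= 180.0 else 360.0 - ccw

def pan_angle_to_speaker_angle(pan_deg):
    """Convert -180 (left) to +180 (right) into 0 to 360 counter clockwise."""
    return (-pan_deg) % 360.0

left_rear_pan = speaker_angle_to_pan_angle(110.0)    # a left rear speaker
right_rear_pan = speaker_angle_to_pan_angle(250.0)   # a right rear speaker
```

For a left rear speaker at 110 degrees counter clockwise, the matching Pan Angle is -110, and for a right rear speaker at 250 degrees it is +110. Setting each Pan Angle to the value that matches its own speaker is the "no effect" case described above.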
Note that in this layout the ZPan speaker angles are linked to the ArcTan speaker angles, so you can set
your speaker angles once in Zpan, and have ArcTan also be set correctly (the reverse is not true).
A chart is included at the end of these instructions to help you set your speaker angles.
License:
SPEC is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported License, http://creativecommons.org/licenses/by-sa/3.0/ (Attribution should be to “Glenn C. Newell”) and is work derived from other people's ideas, done in new ways.
Specifically, SLICE and CC use the Center Cut Algorithm, from:
http://www.virtualdub.org/blog/pivot/entry.php?id=102
and ArcTan and Zpan use Constant Power Panning Algorithms from:
http://mue.music.miami.edu/thesis/jwest/Master.html
with the angle diagrams also re-used/modified from that document.
Portions of the C++ code come from the Plogue Bidule SDK 1.03:
http://www.plogue.com/
Rearward Bias Chart:
[Chart: “Effect of Rearward Bias”. Horizontal axis: ArcTan Angle (post Width), 1 to 351 degrees. Vertical axis: Output Angle (will not exceed Width), -40 to 360 degrees. One curve per Rearward Bias setting: Linear, and 1.01 through 1.15 in steps of 0.01.]
Speaker Angle Chart:
[Chart: 360 degree speaker angle diagram. Labels: Center Speaker, Listening Position.]