|
03-26-2017, 08:03 PM
|
#1
|
Human being with feelings
Join Date: Jan 2017
Posts: 43
|
Scaling Input Signals and DFT Output
Hey everyone.
I'm working on a custom IControl to do some simple spectral analysis, using the fft.c/fft.h files from WDL. So far I can pass the audio samples to the IControl, do the transform, and plot the results. What I'm unsure about is how the audio samples from the host and the results of the DFT should be scaled.
As I sort of understand it, the signals in ProcessDoubleReplacing() are supposed to be doubles between -1 and 1. However, really loud signals (screaming into my computer's mic input, for example) will sometimes have samples with magnitudes greater than 1. My only thought is that this corresponds to some sort of clipping/loudness standard? I've only tested the plug-in using Live. Are the input samples from the host standardized? If so, how should I interpret them?
In the case of DFT, has anyone else used the WDL fft class in one of their plugins? How are you scaling the output? I know there are often ambiguities in the normalization of the forward and reverse transforms.
Any information about these topics would be appreciated!
Thanks!
|
|
|
03-26-2017, 11:32 PM
|
#2
|
Human being with feelings
Join Date: Jul 2008
Location: The Netherlands
Posts: 3,646
|
Are you using the complex (WDL_fft) or real version (WDL_real_fft)? Because these require different scaling.
Anyway, as to your more general question: Input is normalized to [-1.0, +1.0], and so should output I guess, but this is not very strict. Do note that anything beyond 1.0 will be clipped when it is converted to integer (e.g. because it is sent to an audio interface, or rendered to an integer file format).
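To illustrate the clipping mentioned above, here is a minimal sketch of a float-to-16-bit conversion. The function name and the 32767 scale factor are assumptions for illustration (one common convention), not something taken from WDL or any particular host.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Convert one floating-point sample (nominally in [-1.0, +1.0]) to 16-bit PCM.
// Anything outside the nominal range is clamped first, which is where the
// clipping happens. The 32767 scale factor is an assumed convention.
inline std::int16_t SampleToInt16(double sample)
{
  const double clamped = std::max(-1.0, std::min(1.0, sample));
  return static_cast<std::int16_t>(std::lround(clamped * 32767.0));
}
```

Inside the plug-in the samples stay in floating point, so a value like 1.5 survives there; it is only flattened to full scale at a conversion step like this one.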
|
|
|
03-27-2017, 10:27 AM
|
#3
|
Human being with feelings
Join Date: Dec 2016
Posts: 51
|
Quote:
Originally Posted by Tale
Are you using the complex (WDL_fft) or real version (WDL_real_fft)? Because these require different scaling.
Anyway, as to your more general question: Input is normalized to [-1.0, +1.0], and so should output I guess, but this is not very strict. Do note that anything beyond 1.0 will be clipped when it is converted to integer (e.g. because it is sent to an audio interface, or rendered to an integer file format).
|
I think he is referring to scaling the frequency, and how to window the FFT for pertinent information, which if I understand correctly is often ~1000Hz range
|
|
|
03-27-2017, 12:55 PM
|
#4
|
Human being with feelings
Join Date: Jan 2017
Posts: 43
|
Quote:
Originally Posted by Tale
Are you using the complex (WDL_fft) or real version (WDL_real_fft)? Because these require different scaling.
Anyway, as to your more general question: Input is normalized to [-1.0, +1.0], and so should output I guess, but this is not very strict. Do note that anything beyond 1.0 will be clipped when it is converted to integer (e.g. because it is sent to an audio interface, or rendered to an integer file format).
|
I'm currently using the complex FFT, but I don't see a reason not to use the real one, since it's faster for real input. I'll make the switch. How would you scale the output of the real transform?
Thanks!
Edit: Actually, I'll probably want to use some phase information, and as I understand it, that requires the complex transform... So any info about WDL_fft would also be appreciated.
Last edited by MSK; 03-27-2017 at 01:01 PM.
|
|
|
03-27-2017, 02:38 PM
|
#5
|
Human being with feelings
Join Date: Dec 2015
Posts: 331
|
Quote:
Originally Posted by MSK
I'm currently using the complex FFT, but I don't see a reason not to use the real one, since it's faster for real input. I'll make the switch. How would you scale the output of the real transform?
Thanks!
Edit: Actually, I'll probably want to use some phase information, and as I understand it, that requires the complex transform... So any info about WDL_fft would also be appreciated.
|
Don't have time to look at the FFT implementation, but you can always run something through it and see.
The complex (full) transform is for a complex signal, but both transforms use complex math. I think that's where you're getting confused. The real transform still has magnitude and phase info in the complex result. The real transform saves duplicate computation, since the Fourier transform of a real signal is symmetrical (the upper half of the spectrum is the complex conjugate of the lower half).
|
|
|
03-27-2017, 11:52 PM
|
#6
|
Human being with feelings
Join Date: Jul 2008
Location: The Netherlands
Posts: 3,646
|
Quote:
Originally Posted by MSK
How would you scale the output of the real transform?
|
Mostly by 0.5/fft_size instead of 1/fft_size, but it depends on what exactly you are doing. Note that REAPER's JSFX also uses the WDL complex and real FFT, so most info here applies to the WDL FFT as well.
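Whatever factor a given FFT routine needs, one way to confirm it is to run a sine of known amplitude through the transform and look at the peak bin. The sketch below does this with a plain unnormalized textbook DFT, not with WDL_fft or WDL_real_fft, so the numbers show the principle only; the helper names are hypothetical.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// amplitude * sin(2*pi*cycles*t/N): a sine that fits exactly `cycles`
// periods into the buffer, so all its energy lands in one bin pair.
std::vector<double> MakeSine(std::size_t n, std::size_t cycles, double amplitude)
{
  const double kPi = 3.14159265358979323846;
  std::vector<double> x(n);
  for (std::size_t t = 0; t < n; ++t)
    x[t] = amplitude * std::sin(2.0 * kPi * static_cast<double>(cycles * t) / static_cast<double>(n));
  return x;
}

// Unnormalized DFT coefficient of a real signal at a single bin (naive sum).
std::complex<double> DftBin(const std::vector<double>& x, std::size_t bin)
{
  const double kPi = 3.14159265358979323846;
  const double n = static_cast<double>(x.size());
  std::complex<double> sum(0.0, 0.0);
  for (std::size_t t = 0; t < x.size(); ++t)
  {
    const double angle = -2.0 * kPi * static_cast<double>(bin * t) / n;
    sum += x[t] * std::complex<double>(std::cos(angle), std::sin(angle));
  }
  return sum;
}
```

With N = 64 and a sine of amplitude 0.8 at bin 4, the raw magnitude is 0.8 * 64 / 2 = 25.6. Dividing by N gives 0.4 (half the amplitude, because the other half sits in the mirrored bin), and multiplying that by 2 recovers 0.8. Running the same test signal through the WDL functions would show which constant they need.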
Unless you are indeed asking about frequency scaling (as CaptnWillie suggested), in that case there is no difference between the real and complex FFT.
|
|
|
03-29-2017, 04:43 PM
|
#7
|
Human being with feelings
Join Date: Dec 2016
Posts: 51
|
Quote:
Originally Posted by Tale
Unless you are indeed asking about frequency scaling (as CaptnWillie suggested), in that case there is no difference between the real and complex FFT.
|
Chatted with MSK, I misunderstood what he was suggesting in his initial question.
|
|
|
03-31-2017, 02:56 PM
|
#8
|
Human being with feelings
Join Date: Jan 2017
Posts: 43
|
Quote:
Originally Posted by Tale
Mostly by 0.5/fft_size instead of 1/fft_size, but it depends on what exactly you are doing. Note that REAPER's JSFX also uses the WDL complex and real FFT, so most info here applies to the WDL FFT as well.
Unless you are indeed asking about frequency scaling (as CaptnWillie suggested), in that case there is no difference between the real and complex FFT.
|
Thanks Tale!
I've got two more quick questions regarding conversion to dB.
If I wanted to create a peak meter, how should I turn the largest amplitude in an input buffer into a value in dB? Based on what I've read, I've used 20 * log10(max_amplitude), which to me looks like comparing the squared amplitude to a reference value of 1. Is that the usual convention?
If I wanted to do a logarithmic plot of the transform (logarithmic in the magnitude of the Fourier coefficients, not the frequency) is there a similar convention? I haven't been able to turn up anything particularly illuminating in my research.
Thanks,
MSK
|
|
|
03-31-2017, 03:31 PM
|
#9
|
Human being with feelings
Join Date: Dec 2015
Posts: 331
|
Quote:
Originally Posted by MSK
If I wanted to create a peak meter, how should I turn the largest amplitude in an input buffer into a value in dB? Based on what I've read, I've used 20 * log10(max_amplitude), which to me looks like comparing the squared amplitude to a reference value of 1. Is that the usual convention?
|
Yes, that's the definition of power in decibels. The bel (named after Alexander Graham Bell) is too big a unit, so people use decibels (tenths of a bel). The bel is based on log10 of a power ratio, so you might expect 10 * log10(...), but we're talking power, and the magnitude value you have relates to voltage, so you need to square it. Or just pull the exponent outside the log10, where it becomes a factor of 2, and that's why you're multiplying by 20 instead of 10.
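A minimal sketch of the peak-meter conversion described above, with 1.0 as the reference. The function names and the -120 dB floor are assumptions for this example; the floor only exists to avoid taking the log of zero on silent buffers.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Largest absolute sample value in a buffer.
double PeakAmplitude(const double* samples, std::size_t count)
{
  double peak = 0.0;
  for (std::size_t i = 0; i < count; ++i)
    peak = std::max(peak, std::fabs(samples[i]));
  return peak;
}

// Amplitude to dB relative to full scale (1.0).
// 20 * log10(a) is the same as 10 * log10(a * a), i.e. a power ratio.
// The floor (an arbitrary choice here) avoids log10(0).
double AmplitudeToDb(double amplitude, double floorDb = -120.0)
{
  if (amplitude <= 0.0)
    return floorDb;
  return std::max(floorDb, 20.0 * std::log10(amplitude));
}
```

A peak of 1.0 reads as 0 dB, 0.5 reads as about -6 dB, and peaks above 1.0 give positive values.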
Quote:
If I wanted to do a logarithmic plot of the transform (logarithmic in the magnitude of the Fourier coefficients, not the frequency) is there a similar convention? I haven't been able to turn up anything particularly illuminating in my research.
|
You'd want to plot dB here, so just convert your plot points to dB and plot linearly.
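"Convert to dB and plot linearly" could look something like the sketch below, which maps a linear magnitude to a vertical position between 0 (bottom of the plot) and 1 (top). The function name and the default -90 dB to 0 dB display range are assumptions for illustration.

```cpp
#include <algorithm>
#include <cmath>

// Map a linear magnitude to a normalized plot height in [0, 1].
// The magnitude is converted to dB first, then placed linearly between
// minDb (bottom) and maxDb (top). The default range is an assumed choice.
double MagnitudeToPlotY(double magnitude, double minDb = -90.0, double maxDb = 0.0)
{
  const double db = magnitude > 0.0 ? 20.0 * std::log10(magnitude) : minDb;
  const double y = (db - minDb) / (maxDb - minDb);
  return std::max(0.0, std::min(1.0, y));
}
```

In a control, the result would then be multiplied by the plot height in pixels. Values below the range pin to the bottom and values above it pin to the top.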
|
|
|
03-31-2017, 05:27 PM
|
#10
|
Human being with feelings
Join Date: Jan 2017
Posts: 43
|
Quote:
Originally Posted by earlevel
You'd want to plot dB here, so just convert your plot points to dB and plot linearly.
|
Awesome thanks! Would I want to use the same 20 * log10(coefficient) conversion? I'm not doing anything quantitative with it, so I guess it's really just the shape that's important.
Another few painfully general questions to annoy you guys:
->Is it normal practice (for plugins) to do any windowing on the input for performing the transform? If so, which window?
->What kind of buffer size would you collect before transforming to get a reasonable frequency resolution? I'm thinking of finding the closest power of 2 less than the number of samples per frame (sample_rate / fps).
->Would it be better do some averaging with transforms of smaller buffers?
Thanks so much for the information!
MSK
|
|
|
03-31-2017, 06:35 PM
|
#11
|
Human being with feelings
Join Date: Dec 2015
Posts: 331
|
Quote:
Originally Posted by MSK
Awesome thanks! Would I want to use the same 20 * log10(coefficient) conversion? I'm not doing anything quantitative with it, so I guess it's really just the shape that's important.
|
Yes. You want log conversion, scaling it by 20 isn't going to cost you anything, and now you're in dB.
Quote:
->Is it normal practice (for plugins) to do any windowing on the input for performing the transform? If so, which window?...
|
Don't have time to go into details and recommendations (and this is much-discussed on the web), but to the first, yes. An FFT assumes that the data is periodic, so the endpoints should meet. You need to window to smooth that and control "spectral leakage". Choice of window involves various tradeoffs, but for this use you'll probably go with Hann (often called Hanning). I use Kaiser for sample rate conversion, but would go with Hann for this. You could look into it deeper, but if you just chose to go with Hann you wouldn't be wrong.
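A minimal sketch of a Hann window, assuming the "periodic" variant (divide by N rather than N - 1), which is a common choice for overlapped spectral analysis. The function names are made up for this example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Periodic Hann window: w[i] = 0.5 - 0.5 * cos(2*pi*i / N).
std::vector<double> MakeHannWindow(std::size_t n)
{
  const double kPi = 3.14159265358979323846;
  std::vector<double> w(n);
  for (std::size_t i = 0; i < n; ++i)
    w[i] = 0.5 - 0.5 * std::cos(2.0 * kPi * static_cast<double>(i) / static_cast<double>(n));
  return w;
}

// Multiply a frame by the window in place, before it goes into the FFT.
void ApplyWindow(std::vector<double>& frame, const std::vector<double>& window)
{
  const std::size_t n = std::min(frame.size(), window.size());
  for (std::size_t i = 0; i < n; ++i)
    frame[i] *= window[i];
}

// Sum of the window coefficients. Dividing magnitudes by this sum (instead
// of by N) compensates for the level the window removes.
double WindowSum(const std::vector<double>& window)
{
  double sum = 0.0;
  for (double v : window)
    sum += v;
  return sum;
}
```

The window tapers both ends of the frame to zero, so the endpoints meet. It also lowers the measured level: the Hann coefficients average 0.5, which is why the sum is worth folding into the output scaling.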
I haven't done an analyzer myself, but I'd start with 1024 (possibly half, maybe double for some cases, but probably not). And a healthy amount of overlap—like I said, this is heavily discussed, you'll find much more detailed info from a web search.
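To put numbers on the size and overlap suggestions, here is a small sketch of the arithmetic. The helper names are hypothetical, and the 50% and 75% overlap figures used below are common choices assumed for illustration, not values stated in this thread.

```cpp
#include <cmath>
#include <cstddef>

// Spacing between adjacent FFT bins in Hz (the frequency resolution).
double BinSpacingHz(double sampleRate, std::size_t fftSize)
{
  return sampleRate / static_cast<double>(fftSize);
}

// Center frequency of a given bin in Hz.
double BinFrequencyHz(std::size_t bin, double sampleRate, std::size_t fftSize)
{
  return static_cast<double>(bin) * BinSpacingHz(sampleRate, fftSize);
}

// Samples to advance between successive frames for an overlap
// fraction in [0, 1), e.g. 0.75 for 75% overlap.
std::size_t HopSize(std::size_t fftSize, double overlap)
{
  return static_cast<std::size_t>(std::lround(static_cast<double>(fftSize) * (1.0 - overlap)));
}

// Largest power of two that is <= n (0 for n == 0).
std::size_t FloorPowerOfTwo(std::size_t n)
{
  if (n == 0)
    return 0;
  std::size_t p = 1;
  while (p <= n / 2)
    p *= 2;
  return p;
}
```

At 44.1 kHz, a 1024-point transform gives bins about 43 Hz apart, and 75% overlap means a new frame every 256 samples. For comparison, sizing the transform from the frame rate at 60 fps gives 735 samples per frame, which rounds down to 512 and roughly doubles the bin spacing.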
|
|
|
04-01-2017, 04:54 AM
|
#12
|
Human being with feelings
Join Date: May 2012
Location: PA, USA
Posts: 356
|
I have a pull request for WDL-OL for a rough example of a spectrum display. You can find it at https://github.com/witmerm/wdl-mw. In the examples folder, there is an IPlugSpectFFT project.
I am sure there is more work to be done with it, but it does include windowing, log scaling, as well as octave gain scaling. Typically, displays are scaled with a +3dB per octave scale to make the higher frequencies more discernible.
If you look at the example, just scrap the rest of the repo. I don't update it, and I am sure it is behind WDL-OL. I just created it so I could share the example.
I hope it helps, and please let me know if you find ways to improve it. I have a good working example in my current plugins, but I know that there have to be some optimizations that I am not using.
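The per-octave display tilt mentioned above can be sketched as a dB offset that grows with the number of octaves from a pivot frequency. This is not taken from the IPlugSpectFFT example; the function name and the 1 kHz pivot are assumptions for illustration.

```cpp
#include <cmath>

// dB offset for a display tilt of `slopeDbPerOctave`, pivoting at `pivotHz`
// (the frequency that gets no offset). The 1 kHz pivot is an assumed choice.
double TiltDb(double freqHz, double slopeDbPerOctave = 3.0, double pivotHz = 1000.0)
{
  if (freqHz <= 0.0)
    return 0.0;  // DC has no octave distance from the pivot; leave it untouched
  return slopeDbPerOctave * std::log2(freqHz / pivotHz);
}
```

The offset is added to each bin's dB value before plotting, so with the defaults 2 kHz is raised by 3 dB and 500 Hz is lowered by 3 dB.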
|
|
|
04-01-2017, 12:24 PM
|
#13
|
Human being with feelings
Join Date: Jan 2017
Posts: 43
|
Thanks earlevel!
Quote:
Originally Posted by random_id
I have a pull request for WDL-OL for a rough example of a spectrum display. You can find it at https://github.com/witmerm/wdl-mw. In the examples folder, there is an IPlugSpectFFT project.
I am sure there is more work to be done with it, but it does include windowing, log scaling, as well as octave gain scaling. Typically, displays are scaled with a +3dB per octave scale to make the higher frequencies more discernible.
If you look at the example, just scrap the rest of the repo. I don't update it, and I am sure it is behind WDL-OL. I just created it so I could share the example.
I hope it helps, and please let me know if you find ways to improve it. I have a good working example in my current plugins, but I know that there have to be some optimizations that I am not using.
|
Awesome, I'll try and implement this and give you any feedback I have!
|
|
|