Is there JSFX code to do convolution like ReaVerb?

TryingToMakeMusic · 01-13-2017, 06:17 PM

I'm working on a JSFX that manufactures an impulse-response, and after the IR is in memory, I want to convolve the IR with the source audio, similar to how ReaVerb convolves an IR with source audio.

I could implement my own brute-force convolution algorithm, but I doubt the CPU could keep up with it; so are there available examples of JSFX code doing convolution I could look at? My IR is over 65K, so the built-in JSFX functions won't help with this.

I've heard people often use FFT to make convolution more efficient, but doesn't that require some windowing, and doesn't that introduce inaccuracies?

mschnell · 01-14-2017, 12:15 AM

FFT is the only decent way to do convolution in software.

I rather did a project using FFT in JSFX. Works great !!!

I don't know if there is an any existing convolution ELL code, but in theory this is rather simple:

- in the initialization FFT convert the impulse file store the results in an array

- in realtime:
- do the (windowing aware) FFT for the audio stream (there are examples)
- multiply the spectrum with the converted impulse (AFAIK there is a "convolute" function that does the array of complex multiplications for you in high speed)
- do the inverse FFT (there are examples).

Windowing does affect the accuracy and the latency: small windows -> lower frequencies can't be handled, so not effect of the convolution to those, larger windows -> greater latency (and more CPU time).

What are you trying to accomplish ?
-Michael

TryingToMakeMusic · 01-14-2017, 12:40 AM

Quote:

Originally Posted by mschnell

FFT is the only decent way to do convolution in software.

I rather did a project using FFT in JSFX. Works great !!!

I don't know if there is an any existing convolution ELL code, but in theory this is rather simple:

- in the initialization FFT convert the impulse file store the results in an array

- in realtime:
- do the (windowing aware) FFT for the audio stream (there are examples)
- multiply the spectrum with the converted impulse (AFAIK there is a "convolute" function that does this for you in high speed)
- do the inverse FFT (there are examples).

Windowing does affect the accuracy and the latency: small windows -> lower frequencies can't be handled, so not effect of the convolution to those, larger windows -> greater latency (and more CPU time).

What are you trying to accomplish ?
-Michael

Thanks for the help. My IR is over 65K samples, though, and the JSFX docs say:

"Note that the FFT/IFFT must NOT cross a 65,536 item boundary, so be sure to specify the offset accordingly."

"Note that the convolution must NOT cross a 65,536 item boundary, so be sure to specify the offset accordingly."

I thought that meant I can't use those functions?

Since posting this, I've realized I can't use JSFX for this project. I'm doing massive computations in the @slider section, and at the rate JSFX is processing them, I need to port this over to C. (JSFX was awesome while I was prototyping though.)

I'm still curious about this windowing in convolution algorithms. I'm curious how badly the windowing messes up things when I use real-time convolvers like ReaVerb, Altiverb, etc. I'm sure ReaVerb and Altiverb are great at doing what they do in real-time, but if I can get better results by working offline, doing brute-force convolution, that's an option I'm considering too. I don't have a good idea of how bad the artifacts caused by windowing are.

Tale · 01-14-2017, 04:46 AM

You could split up your IR into smaller blocks, that is probably what ReaVerb also does.

I think I might have some JSFX code lying around somewhere implementing this. I could look it up, if you think that would help (or maybe I have already posted it? I can't remember).

TryingToMakeMusic · 01-14-2017, 06:57 AM

Quote:

Originally Posted by Tale

You could split up your IR into smaller blocks, that is probably what ReaVerb also does.

I think I might have some JSFX code lying around somewhere implementing this. I could look it up, if you think that would help (or maybe I have already posted it? I can't remember).

Please post it?

Xenakios · 01-14-2017, 07:12 AM

Quote:

Originally Posted by TryingToMakeMusic

I'm still curious about this windowing in convolution algorithms.

Convolution implemented with FFT is a special case of FFT processing where windowing is not needed. (So the results from FFT based convolution are effectively the same what you get from brute force convolution.)

Cockos's WDL C++ library has a convolution engine implementation which you might want to look into.

Justin · 01-14-2017, 07:49 AM

For short impulse responses, JS: amp-modeler does convolution. For impulse responses longer than the FFT size, you can do it without doing more FFTs on input samples, however you will do more convolve_c() calls (combining the input with later blocks of the impulse response's frequency domain -- as Tale mentions) and sum the results before the ifft().

Tale · 01-14-2017, 01:01 PM

FWIW, here is an example that splits up the IR into smaller blocks, and then uses the real FFT to do convolution.

Do note that it doesn't perform very good, especially at low audio buffer settings. But if you set your audio buffer high enough (512 samples or so), then it should be able to handle IRs up to 10 seconds.

Another issue is that it introduces a latency of fft_size/2-1 samples. In theory you could also fix this (as ReaVerb does when ZL is on), but I guesss I will leave that for another time...

Code:

desc:Frequency-domain Delay Line (FDL) convolution example

// Copyright (C) 2014-2020 Theo Niessink
// This work is free. You can redistribute it and/or modify it under the
// terms of the Do What The Fuck You Want To Public License, Version 2,
// as published by Sam Hocevar. See http://www.wtfpl.net/ for more details.

filename:0,ir.wav

@init

fft_size = 1024;

pdc_bot_ch = 0; pdc_top_ch = 2;
pdc_delay = fft_size/2 - 1;

function fft_real_align(index, size)
(
  // Align to size to avoid crossing 65,536 item boundary.
  (index & 65535) + size > 65536 ? ((index >> 16) + 1) << 16 : index;
);

function convolve_c(dest, src, size)
  local(re, im)
(
  size & 1 ? (
    size -= 1;

    re = src[0] * dest[0] - src[1] * dest[1];
    im = src[1] * dest[0] + src[0] * dest[1];

    dest[0] = re;
    dest[1] = im;

    src += 2;
    dest += 2;
  );

  convolve_c(dest, src, size);
);

function load_ir(handle, index, fft_size)
  global(block_size, srate)
  local(nch, sr, src, dest, scale, i, j, k, l, m, n, x)
(
  block_size = fft_size/2;
  n = 0;

  handle >= 0 ? (
    nch = sr = 0;
    file_riff(handle, nch, sr);
    // Mono
    nch == 1 && sr > 0 ? (

      // Read IR.
      l = max(file_avail(handle), 0);
      m = floor(l / sr * srate + 0.5);
      scale = m ? l/m;

      // 10 seconds max.
      n = min(ceil(m / block_size), ceil(srate / block_size * 10));

      src = index + n * fft_size;
      i = l > 0 ? file_mem(handle, src, l);
      i < l ? memset(src + i, 0, l - i);

      i = 0;
      loop(n,
        dest = index;

        // Resample block.
        l == m ? (
          memcpy(index, src + i, block_size);
          index += block_size;
          i += block_size;
        ) : loop(block_size,
          // Linear interpolation.
          x = i * scale;
          j = x|0;
          index[] = j < l ? (
            x -= j;
            ((1 - x) * src[j] + x * ((k = j + 1) < l ? src[k])) * scale;
          );
          index += 1;
          i += 1;
        );

        // Zero pad.
        memset(index, 0, block_size);

        // Convert to frequency domain.
        fft_real(dest, fft_size);

        index += block_size;
      );

      l > 0 ? memset(src, 0, l);
    );
    file_close(handle);
  );

  !n ? (
    n = 1;

    // Bypass
    index[0] = 1;
    memset(index + 1, 0, fft_size - 1);
    fft_real(index, fft_size);
  );

  n;
);

// Read/resample IR.
ir_buf = fft_real_align(0, fft_size);
num_blocks = load_ir(file_open(0), ir_buf, fft_size);

// Allocate input buffer.
in_buf = in_ptr = fft_real_align(ir_buf + num_blocks * fft_size, fft_size);
in_end = in_buf + num_blocks * fft_size;

// Allocate workspace buffers.
convolve_buf = fft_real_align(in_end, fft_size);
sum_buf = fft_real_align(convolve_buf + fft_size, fft_size);

// Allocate output buffer.
out_buf = sum_buf + fft_size;
out_scale = 0.25/fft_size;

@sample

// Buffer input.
in_ptr[i] = 0.5 * (spl0 + spl1);
(i += 1) >= block_size ? (

  // Zero pad.
  memset(in_ptr + block_size, 0, block_size);

  // Convert to frequency domain.
  fft_real(in_ptr, fft_size);

  // Complex multiply-accumulate blocks.
  memset(sum_buf, 0, fft_size);
  i = 0;
  loop(num_blocks,

    // Complex multiply.
    memcpy(convolve_buf, in_ptr, fft_size);
    ir_ptr = ir_buf + i * fft_size;
    convolve_buf[0] *= ir_ptr[0]; // DC
    convolve_buf[1] *= ir_ptr[1]; // Nyquist
    convolve_c(convolve_buf + 2, ir_ptr + 2, block_size - 1);

    // Complex add.
    j = 0;
    loop(fft_size,
      sum_buf[j] += convolve_buf[j];
      j += 1;
    );

    (i += 1) < num_blocks ? (in_ptr += fft_size) >= in_end ? in_ptr = in_buf;
  );

  // Convert back to time domain.
  ifft_real(sum_buf, fft_size);

  // Overlap-add.
  memcpy(out_buf, out_buf + block_size, block_size);
  memset(out_buf + block_size, 0, block_size);
  i = 0;
  loop(fft_size,
    out_buf[i] += sum_buf[i] * out_scale;
    i += 1;
  );

  i = 0;
);

spl0 = spl1 = out_buf[i];

mschnell · 01-14-2017, 02:51 PM

Quote:

Originally Posted by Tale

Another issue is that it introduces a latency of fft_size/2 samples. In theory you could also fix this (as ReaVerb does when ZL is on), but I guess I will leave that for another time...

??? I don't suppose you can do anything agains the latency, as FFT needs to handle each window completely. Extracting and inserting it necessarily introduces that latency.

I understand that setting ZL in ReaVerb does not fight the latency, it simply does not report it to reaper (so that it does not affect all audio streams while rendering), and if using it as a "parallel" effect such as an ambience / reverb algorithm, the latency then just is transformed to a pre-delay which does not harm greatly, but with straight (serial) "filters" the delay is mandatory.

Reporting the latency introduced by the plugin to Reaper is easily possible and tweakable in a JSFX, as well.

-Michael

Tale · 01-14-2017, 04:33 PM

Quote:

Originally Posted by mschnell

I understand that setting ZL in ReaVerb does not fight the latency, it simply does not report it to reaper

Not true, ZL in ReaVerb really does mean Zero Latency.

If you would like to know how, then I suggest you google "non-uniform partitioned convolution".

TryingToMakeMusic · 01-14-2017, 06:42 PM

Thanks everyone, I appreciate the examples.

Quote:

Originally Posted by Xenakios

Convolution implemented with FFT is a special case of FFT processing where windowing is not needed. (So the results from FFT based convolution are effectively the same what you get from brute force convolution.)

That is good news.

mschnell · 01-15-2017, 12:03 AM

Quote:

Originally Posted by Tale

Not true, ZL in ReaVerb really does mean Zero Latency.

If you would like to know how, then I suggest you google "non-uniform partitioned convolution".

OK I skimmed on article about this and it's an extremely interesting and complex issue.

But the drawings show that there always is FFT and IFFT in the signal path, so at least some delay (based on the block size of the FFT used in that step, can't be avoided. But regarding that any plugin needs to impose a certain small delay this maybe might considered "Zero", as it's masked by the latency imposed by any stream running in parallel.

Printing the article "Implementing Real-Time Partitioned Convolution ... - Eric Battenberg " ....

-Michael

Tale · 01-15-2017, 06:00 AM

Quote:

Originally Posted by mschnell

But the drawings show that there always is FFT and IFFT in the signal path, so at least some delay (based on the block size of the FFT used in that step, can't be avoided.

Well, you can still have delay (this will reduce CPU), but you don't have to. Actually you can also have zero-latency in my uniform partitioned convolution example. All you have to do is use brute-force for the first IR block, delay the FFT output by one block, and sum the outputs (see the reworked example below). For small IR sizes (e.g. cab IRs) this should work well.

A problem here is that a large block size is bad for CPU when doing brute-force, while a small block size is bad for CPU when doing FFT. This is where non-uniform partitioned convolution comes in, because you can use increasingly larger FFT sizes while you are accumulating more and more samples.

But this would be rather complex to implement, especially in JSFX, so again I will leave that for another time...

Code:

desc:Zero-latency FFT convolution example

// Copyright (C) 2014-2020 Theo Niessink
// This work is free. You can redistribute it and/or modify it under the
// terms of the Do What The Fuck You Want To Public License, Version 2,
// as published by Sam Hocevar. See http://www.wtfpl.net/ for more details.

filename:0,ir.wav

@init

fft_size = 256;

pdc_bot_ch = 0; pdc_top_ch = 2;
pdc_delay = 0;

function fft_real_align(index, size)
(
  // Align to size to avoid crossing 65,536 item boundary.
  (index & 65535) + size > 65536 ? ((index >> 16) + 1) << 16 : index;
);

function convolve_c(dest, src, size)
  local(re, im)
(
  size & 1 ? (
    size -= 1;

    re = src[0] * dest[0] - src[1] * dest[1];
    im = src[1] * dest[0] + src[0] * dest[1];

    dest[0] = re;
    dest[1] = im;

    src += 2;
    dest += 2;
  );

  convolve_c(dest, src, size);
);

function load_ir(handle, index, fft_size)
  global(block_size, srate)
  local(nch, sr, src, dest, scale, i, j, k, l, m, n, x)
(
  block_size = fft_size/2;
  n = 0;

  handle >= 0 ? (
    nch = sr = 0;
    file_riff(handle, nch, sr);
    // Mono
    nch == 1 && sr > 0 ? (

      // Read IR.
      l = max(file_avail(handle), 0);
      m = floor(l / sr * srate + 0.5);
      scale = m ? l/m;

      // 10 seconds max.
      n = min(ceil(m / block_size), ceil(srate / block_size * 10));

      src = index + n * fft_size;
      i = l > 0 ? file_mem(handle, src, l);
      i < l ? memset(src + i, 0, l - i);

      i = 0;
      loop(n,
        dest = index;

        // Resample block.
        l == m ? (
          memcpy(index, src + i, block_size);
          index += block_size;
          i += block_size;
        ) : loop(block_size,
          // Linear interpolation.
          x = i * scale;
          j = x|0;
          index[] = j < l ? (
            x -= j;
            ((1 - x) * src[j] + x * ((k = j + 1) < l ? src[k])) * scale;
          );
          index += 1;
          i += 1;
        );

        // Zero pad.
        memset(index, 0, block_size);

        // Skip first (time domain) IR block.
        i > block_size ? (
          // Convert to frequency domain.
          fft_real(dest, fft_size);
        );

        index += block_size;
      );

      l > 0 ? memset(src, 0, l);
    );
    file_close(handle);
  );

  !n ? (
    n = 1;

    // Bypass
    index[0] = 1;
    memset(index + 1, 0, fft_size - 1);
  );

  n;
);

// Read/resample IR.
ir_buf = fft_real_align(0, fft_size);
num_blocks = load_ir(file_open(0), ir_buf, fft_size);

// Allocate bute-force input buffer.
brute_buf = brute_ptr = ir_buf + num_blocks * fft_size;
 
// Allocate FFT input buffer.
in_buf = in_ptr = fft_real_align(brute_buf + 2 * block_size, fft_size);
in_end = in_buf + num_blocks * fft_size;

// Allocate FFT workspace buffers.
convolve_buf = fft_real_align(in_end, fft_size);
sum_buf = fft_real_align(convolve_buf + fft_size, fft_size);

// Allocate output buffer.
out_buf = sum_buf + fft_size;
out_scale = 0.25/fft_size;

@sample

// Buffer input.
(brute_ptr -= 1) < brute_buf ? (
  brute_ptr = brute_buf + block_size;
  memcpy(brute_ptr + 1, brute_buf, block_size - 1);
);
brute_ptr[] = in_ptr[i] = 0.5 * (spl0 + spl1);

// Brute-force convolution.
sum = j = 0;
loop(block_size,
  sum += brute_ptr[j] * ir_buf[j];
  j += 1;
);

// Output mix of brute-force and delayed FFT output.
spl0 = spl1 = sum + out_buf[i];

(i += 1) >= block_size ? (
  // Zero pad.
  memset(in_ptr + block_size, 0, block_size);

  // Convert to frequency domain.
  fft_real(in_ptr, fft_size);

  // Complex multiply-accumulate blocks.
  memset(sum_buf, 0, fft_size);

  // Skip first (time domain) IR block.
  i = 1;
  loop(num_blocks - 1,

    // Complex multiply.
    memcpy(convolve_buf, in_ptr, fft_size);
    ir_ptr = ir_buf + i * fft_size;
    convolve_buf[0] *= ir_ptr[0]; // DC
    convolve_buf[1] *= ir_ptr[1]; // Nyquist
    convolve_c(convolve_buf + 2, ir_ptr + 2, block_size - 1);

    // Complex add.
    j = 0;
    loop(fft_size,
      sum_buf[j] += convolve_buf[j];
      j += 1;
    );

    i += 1;
    (in_ptr += fft_size) >= in_end ? in_ptr = in_buf;
  );

  // Convert back to time domain.
  ifft_real(sum_buf, fft_size);

  // Overlap-add.
  memcpy(out_buf, out_buf + block_size, block_size);
  memset(out_buf + block_size, 0, block_size);
  i = 0;
  loop(fft_size,
    out_buf[i] += sum_buf[i] * out_scale;
    i += 1;
  );

  i = 0;
);

mschnell · 01-15-2017, 03:51 PM

I understand now:

The convolution impulse is divided in blocks.

You do the first block straight forward and the other blocks via FFT.

When doing the first block the resulting samples only depend on the samples received before. So no latency (I CPU calculation time is regarded as no problem)

The result of the second block calculated with the first block in the received stream is only needed when the second block is received. so no latency imposed by the FFT.

The result of the second block calculated with the second block in the received stream is only needed when the third block is received. so no latency imposed by the FFT.

And so on.

Nice !!!

Of course the straight forward convolution calculation will more than double CPU need, as you need to do a convolution (O(n²)) plus an FFT (O(n*ln(n))) on any block in the stream.

Your EEL code is amazingly short !

Don't you need the sinc window scaling here ?

-Michael

Tale · 01-15-2017, 04:50 PM

Quote:

Originally Posted by mschnell

Your EEL code is amazingly short !

And hopefully relatively easy to understand (mostly so I myself will understand it again when looking back on it, say, a year from now)... Actually it is too long already, mostly because of loading and resampling the IR, but I guess there are no shortcuts for that (except the ones I have already taken i.e. mono only and linear interpolation).

Quote:

Originally Posted by mschnell

Don't you need the sinc window scaling here ?

Well, I guess not, but TBH I don't really understand the question...

Xenakios · 01-15-2017, 05:59 PM

Quote:

Originally Posted by Tale

Well, I guess not, but TBH I don't really understand the question...

I guess he means you are not doing a "hi-fi" resampling, like a sinc based one in your code.

TryingToMakeMusic · 01-15-2017, 06:26 PM

I think he's contemplating this:

Quote:

Originally Posted by mschnell

Windowing does affect the accuracy...: small windows -> lower frequencies can't be handled....

mschnell · 01-15-2017, 11:14 PM

Quote:

Originally Posted by Tale

but TBH I don't really understand the question...

sinc -> https://en.wikipedia.org/wiki/Sinc_function

In another FFT example:

Code:

    w = 2.0*3.14159/fftsize;
    i = 0;
    loop(fftsize/2,
      spx = 0.42-0.50*cos(i*w)+0.08*cos(2.0*i*w);
      spx /= 2;
      window[i] = spx;
      i += 1;
    );

I supposed the overlapping windows transfer function would be based on sinc, (as I once read that sinc should be used to dim the rims with band limited Fourier Transform) but maybe I was wrong.

-Michael

Tale · 01-16-2017, 12:25 AM

Quote:

Originally Posted by mschnell

I supposed the overlapping windows transfer function would be based on sinc, (as I once read that sinc should be used to dim the rims with band limited Fourier Transform) but maybe I was wrong.

As Xenakios already said earlier in this thread, windowing is not needed in this case.

TryingToMakeMusic · 01-16-2017, 11:33 PM

Quote:

Originally Posted by Tale

FWIW, here is an example that splits up the IR into smaller blocks, and then uses the real FFT to do convolution....

I'm confused by this:

Code:

in_ptr[1] = 0; // Correct DC

Wouldn't DC be at index 0, and wouldn't index 1 contain the Nyquist component?

mschnell · 01-17-2017, 12:16 AM

Quote:

Originally Posted by Tale

As Xenakios already said earlier in this thread, windowing is not needed in this case.

I did read that, but (with other Fourier transform applications) the process of "dimming the edges" is necessary independently of a Windowing (i.e. sliced) implementation, as mathematically a digital Fourier transform necessarily is band limited as well in time as in frequency domain, and as "discrete" transforms to "periodic" in both directions as well in time as in frequency domain the curve needs to be regarded periodic. And without any band limiting and smoothing I understand the transformed data is greatly distorted by that.

Maybe for a convolution this effect is annihilated somehow (and hence also Windowing for slices is not necessary.

-Michael (maybe I was wrong taking "Windowing" and "slicing" for synonyms)

Tale · 01-17-2017, 12:26 AM

Quote:

Originally Posted by TryingToMakeMusic

I'm confused by this:

Code:

in_ptr[1] = 0; // Correct DC

Wouldn't DC be at index 0, and wouldn't index 1 contain the Nyquist component?

Yes, you are right. But I use convolve_c() to do complex multiplication on the entire block, so the DC imaginary part should hold its true value i.e. 0, not the real Nyquist component, or else this would mess up things. In the process I ditch the Nyquist bin.

I guess alternatively I could do the complex multiplication for DC and Nyquist manually (this should be easy because the imaginary parts are all 0), and use convolve_c() only for the other bins:

Code:

convolve_buf[0] *= ir_buf[0]; // DC
convolve_buf[1] *= ir_buf[1]; // Nyquist

convolve_c(convolve_buf + 2, ir_buf + i * fft_size + 2, block_size - 1);

TryingToMakeMusic · 01-17-2017, 12:38 AM

Quote:

Originally Posted by Tale

I use convolve_c() to do complex multiplication on the entire block....

I haven't yet figured out why convolve works as a substitute for complex-multiplication (or why convolve would be as efficient as complex-multiplication), but I've also noticed it in the Guitar/amp-model code, with straight complex-multiplication commented out:

Code:

fft(curblock,fftsize);
convolve_c(curblock,convsrc,fftsize);

/*
  i=0;
  loop(fftsize,
    r=curblock[i]; im=curblock[i+1];   
    cr=convsrc[i]; ci=convsrc[i+1];
    curblock[i]=r*cr-im*ci;
    curblock[i+1]=r*ci+im*cr;
    i+=2;
  );
*/

ifft(curblock,fftsize);

Quote:

Originally Posted by Tale

... so the DC imaginary part should hold its true value i.e. 0, not the real Nyquist component, or else this would mess up things. In the process I ditch the Nyquist bin.

I get it now. Thanks.

TryingToMakeMusic · 01-17-2017, 01:07 AM

Quote:

Originally Posted by TryingToMakeMusic

I haven't yet figured out why convolve works as a substitute for complex-multiplication (or why convolve would be as efficient as complex-multiplication), but I've also noticed it in the Guitar/amp-model code, with straight complex-multiplication commented out....

... I guess it depends on the undocumented action of the fft_permute function.

mschnell · 01-17-2017, 07:53 AM

Quote:

Originally Posted by TryingToMakeMusic

I haven't yet figured out why convolve works as a substitute for complex-multiplication (or why convolve would be as efficient as complex-multiplication),

I understand that convolve() just is a series of complex multiplications.

(Of course a description of the actual functionality of the fft, permute and convolve would be nice to have.)

-Michael

TryingToMakeMusic · 01-17-2017, 04:49 PM

Quote:

Originally Posted by mschnell

I understand that convolve() just is a series of complex multiplications.

Convolution is not equivalent to ordered pairwise complex multiplication.

TryingToMakeMusic · 01-17-2017, 05:03 PM

Quote:

Originally Posted by mschnell

Of course a description of the actual functionality of the fft, permute and convolve would be nice to have.

DFT and Convolution are well-defined in lots of sources I'm aware of. Permute is not, and yet some JSFX code depends on the definition of Permute.

mschnell · 01-17-2017, 11:09 PM

Quote:

Originally Posted by TryingToMakeMusic

Convolution is not equivalent to ordered pairwise complex multiplication.

Of course I do know what a convolution is

. I was meaning the EEL function that is called "convolve".

convolve() does not do a convolution. It does part of the calculations that are necessary to be done on the Fourier transformed digital data when exploiting the equation:

Convolution = Fourier-Transform -> Multiply -> Fourier-Transform.

-Michael

mschnell · 01-17-2017, 11:12 PM

Quote:

Originally Posted by TryingToMakeMusic

DFT and Convolution are well-defined in lots of sources I'm aware of. Permute is not, and yet some JSFX code depends on the definition of Permute.

Of course I do know what Fourier transform, DFT, FFT, real FFT, and convolution is . I was meaning the EEL function implementations that are called similarly.

-Michael

Justin · 01-17-2017, 11:43 PM

Yes, JSFX's convolve_c() function is perhaps misleadingly named -- it is just an accelerated way to do a block of complex multiplications.

TryingToMakeMusic · 01-18-2017, 01:46 AM

Quote:

Originally Posted by Justin

Yes, JSFX's convolve_c() function is perhaps misleadingly named -- it is just an accelerated way to do a block of complex multiplications.

Thanks for explaining. That solves some mysteries for me. JSFX convolve() is a useful function, with application as a convolution_helper, but it is not equivalent to the abstract operation of convolution as I've seen it defined in numerous popular DSP textbooks (as I'm sure you're familiar with).

And since that's what convolution() is doing, I'll stop wondering specifically what the fft_permute() function does. I figure it's a part of the FFT algorithm which is been separated out so it can be skipped when a programmer only needs to do pairwise multiplications or additions, as in some of the examples above.

TryingToMakeMusic · 01-18-2017, 01:49 AM

Quote:

Originally Posted by mschnell

convolve() does not do a convolution. It does part of the calculations that are necessary to be done on the Fourier transformed digital data when exploiting the equation:

Convolution = Fourier-Transform -> Multiply -> Fourier-Transform.

I understand now; thank you.

Justin · 01-20-2017, 09:05 PM

Quote:

Originally Posted by TryingToMakeMusic

I'll stop wondering specifically what the fft_permute() function does. I figure it's a part of the FFT algorithm which is been separated out so it can be skipped when a programmer only needs to do pairwise multiplications or additions, as in some of the examples above.

The output of fft(), and input of ifft(), is not in order -- often FFTs use a bit-reversed order, but the order used by our FFT function (djbfft) is another variation.. so basically if you want the first non-DC complex pair to be at buf+2, buf+3, you should permute(). If you don't need the order of the bins to be meaningful (e.g. you're taking two FFT'd signals and combining their bins), you can skip that step.

mschnell · 01-20-2017, 11:15 PM

So a pure convolution does not need permute.

For a similar argument pure convolution does not need taming the edges of the signal by applying a "windowing" function to tame the periodicity imposed by transforming a sampled signal: the reverse transform will automatically get rid of the effect.

Correct ?

(Meaning that I can switch off permuting and windowing with my spectral subtraction code in case I drop the drawing of the spectrum.)

-Michael

Tale · 01-21-2017, 02:40 AM

Quote:

Originally Posted by mschnell

For a similar argument pure convolution does not need taming the edges of the signal by applying a "windowing" function to tame the periodicity imposed by transforming a sampled signal: the reverse transform will automatically get rid of the effect.

Correct ?

Well, maybe not, because you will typically use overlap-add (at 2x the IR size) for convolution via FFT; I don't think you can just complex multiply at 1x the IR size.

TryingToMakeMusic · 01-21-2017, 04:23 AM

Quote:

Originally Posted by Justin

The output of fft(), and input of ifft(), is not in order -- often FFTs use a bit-reversed order, but the order used by our FFT function (djbfft) is another variation.. so basically if you want the first non-DC complex pair to be at buf+2, buf+3, you should permute(). If you don't need the order of the bins to be meaningful (e.g. you're taking two FFT'd signals and combining their bins), you can skip that step.

Thanks. How to use the JS FFT functions is clear to me now, thanks to everyone's help. I'll read up on "djbfft" too.

mschnell · 01-21-2017, 04:50 AM

Quote:

Originally Posted by Tale

Well, maybe not, because you will typically use overlap-add (at 2x the IR size) for convolution via FFT; I don't think you can just complex multiply at 1x the IR size.

I was referring to this post:

Quote:

Originally Posted by Xenakios

Convolution implemented with FFT is a special case of FFT processing where windowing is not needed.

-Michael

Justin · 01-21-2017, 09:38 AM

Quote:

Originally Posted by mschnell

(Meaning that I can switch off permuting and windowing with my spectral subtraction code in case I drop the drawing of the spectrum.)

AFAIK this depends on what you're doing, there is probably some good rule for determining this (linearity perhaps?).

For things like spectral subtraction, what seems to work well is to do the following:

copy FFTSIZE samples of input to buffer, advance input by FFTSIZE/2
fft(FFTSIZE)
permute() if needed (if your other buffer is permuted as well and you're just subtracting bins, you can skip this)
mangle (subtract, whatever)
inverse permute() if needed
ifft(FFTSIZE)
Fade first FFTSIZE/2 with second FFTSIZE/2 samples of last block (using a windowing function, linear or hann or whatever), copy to output
Save second FFTSIZE/2 samples for the next block

If the mangle is simple enough, I think you can set the second FFTSIZE/2 of input be zero, and use a plain sum for the Fade function, because the errors introduced cancel each other out. But I forget (err never fully understand) the math on that.

mschnell · 01-21-2017, 04:43 PM

My spectral subtraction JSFX already works great, I am just researching if some optimization is possible.

And you of course are right it depends on what I am doing.

I use real FFT, that gets rid of half of the samples (the imaginary part in time domain and the upper half of the samples (> Nyquist) in frequency domain), anyway.

I suppose I do need windowing, as the algorithm is not at all linear, as it handles amplitude and phase differently.

For the basic stuff I don't need permutation, as all in and out signals are handled in the same way, combining the exactly corresponding bins.

But I show some spectrum graphs. Here of course I do need permutation.

Moreover I impose some spectrum analyzing algorithm, and here the smoothing uses neighbor bis. So permutation is even necessary when switching off the graphs.

Thanks for your thoughts,
-Michael

Tale · 01-22-2017, 12:44 PM

Quote:

Originally Posted by Justin

But I forget (err never fully understand) the math on that.

Me neither, but I think that maybe this works only if your "mangle" function is static...

FWIW, I have updated both my examples so they now use normal multiplication for both DC and Nyquist, and complex multiplication for all other bins. I have also (again) corrected the latency in the first example (to fft_size/2-1).