formant filter / speech synthesis

cern.th.skei · 01-01-2009, 11:10 PM

has anyone done any formant filters in JS, or does anyone have any info, pointers, or urls for learning more?

most of the info i've found so far (via google, etc) is either just scratching the surface, or very mathematical (i understand code and algorithms, but not much of those weird symbols or academic language, heh)

i have a quite crappy prototype up and running, but it's ported from some 4k intro source code, and therefore heavily optimized for size and simplicity, not quality or 'understandability'...

but it gave me a lot of ideas and inspiration

my plan is to define a set of phonemes using noise/saw/square oscillators and formant filters, plus some additional processing for plosives and the like. then select/play those via midi notes and/or controllers.
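
To make the plan concrete, here is a minimal sketch of a phoneme table picked by MIDI note. Everything in it is an assumption for illustration: the names `PHONEMES`, `BASE_NOTE` and `phoneme_for_note` are made up, the note mapping is arbitrary, and the formant frequencies are only rough textbook averages for adult male vowels, not values from any of the code discussed in this thread. Noise-based entries (fricatives, plosives) would need extra fields.

```python
# Hypothetical phoneme table: names and layout are illustrative only.
# Formant centre frequencies (Hz) are rough textbook averages for vowels.
PHONEMES = [
    # (name, (F1, F2, F3))
    ("ee", (270, 2290, 3010)),
    ("eh", (530, 1840, 2480)),
    ("ah", (730, 1090, 2440)),
    ("aw", (570, 840, 2410)),
    ("oo", (300, 870, 2240)),
]

BASE_NOTE = 36  # assumed: the MIDI note that selects the first phoneme


def phoneme_for_note(note):
    """Return the table entry for a MIDI note, or None if it is out of range."""
    index = note - BASE_NOTE
    if 0 <= index < len(PHONEMES):
        return PHONEMES[index]
    return None
```

A controller could index the same table instead of a note; the lookup stays the same.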

- ccernn

liteon · 01-03-2009, 02:39 PM

I'm not sure there are any formant filters in the distro. There are some very good examples around the net. The midi part is easy. However, I suspect you may end up needing to understand the "weird symbols or academic language" to do the phonemes and plosives :-D. Tbh I've never done any research into speech synthesis. But this article pretty much sums it up for me. It's matlab:
http://ccrma-www.stanford.edu/~jos/f...g_Example.html
(I've read something similar before)

Maybe if you share your source code here more people would join and give some directions.

liteon

Edit:
This also looks well explained. A phoneme table is suggested.
http://www.ddj.com/184409295
also this c++ lib seems to have some formant filters:
http://www-ccrma.stanford.edu/~rjc/p...ter_Based.html

cern.th.skei · 01-03-2009, 03:42 PM

thanx!

yeah, guess i'll have to learn more academic math syntax. i understand some of it, but when things get very abstract and up-there, i'm a bit lost. feel much better with textual explanations, and/or source code. but, well, never a bad thing to learn new stuff

what i've been 'porting' is based on this:
http://www.pouet.net/prod.php?which=50530

mostly trying to extract the filtering first, and planning/hoping to add the phoneme stuff after (if?) things are somewhat working..

here's what i've got so far.
http://dl.getdropbox.com/u/249632/cc...p/formant-test

haven't done too much with it (as in, that specific version might not work properly, and it's a bit chaotic in there). my son has been visiting me for the holiday, and i've been in a father/family mood

- ccernn

liteon · 01-03-2009, 05:40 PM

Well you can still use this source code - the phon table and the filter that's in there. I find some of the other things a bit confusing tbh. Interesting results with those different modes on - the first mode sounds clearer. As the original is still wip, with some tweaks you can make the js version sound even better.

Probably the biggest difficulty for me would be to trace all those C pointers in there. The filter part is looking good so far. The parameters are contributing.

But this right here for example is a complete bugger for me :-). I can only understand that it's "Overlapping neighbour phonemes". There is definitely not enough info for this one... trial/error I guess :-)

Code:

// Overlap neighbour phonemes
buf += ((3*sl/4)<<1);
if ( p->Shape.plosive ) buf += (sl&0xfffffe);
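
One possible reading, and it is only a guess from these three lines: each phoneme is rendered `sl` samples long, but the write pointer advances by only 3/4 of that, so the next phoneme gets mixed over the tail of the previous one. The `<<1` could be a stereo or sample-size factor, and the plosive line looks like an extra advance that skips the overlap. A mono sketch of just the overlap idea, with a made-up function name and a made-up `overlap` parameter:

```python
def mix_with_overlap(segments, overlap=0.25):
    """Mix segments into one buffer, each starting before the previous one ends.

    Assumption: the write position advances by (1 - overlap) of each
    segment's length, which mimics the 3/4 step in the snippet above.
    """
    out = []
    pos = 0
    for seg in segments:
        end = pos + len(seg)
        if end > len(out):
            # Grow the output buffer with silence as needed.
            out.extend([0.0] * (end - len(out)))
        for i, sample in enumerate(seg):
            out[pos + i] += sample  # add, so overlapping tails are summed
        pos += int(len(seg) * (1.0 - overlap))
    return out
```

In practice each segment would also need a fade in and out, otherwise the summed region gets louder than the rest.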

You can also give some other versions a go. (that lib version I posted above maybe)

This is the simplest formant filter code I've found:
http://www.musicdsp.org/showArchiveC...?ArchiveID=110
Coefficients are pre-calculated tho.
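
If you want to compute coefficients yourself instead of using a pre-calculated table, one common option is a two-pole resonator of the kind used in Klatt-style formant synths. This is a sketch of that technique, not the method used in the linked code; the function names are made up, and the centre frequency and bandwidth are whatever you choose for a formant.

```python
import math


def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate):
    """Coefficients for y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = (2.0 * math.exp(-math.pi * bandwidth_hz * t)
         * math.cos(2.0 * math.pi * freq_hz * t))
    a = 1.0 - b - c  # normalises the gain at 0 Hz to 1
    return a, b, c


def resonate(samples, coeffs):
    """Run the resonator over a list of samples and return the output list."""
    a, b, c = coeffs
    y1 = y2 = 0.0  # previous two outputs
    out = []
    for x in samples:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out
```

A narrower bandwidth gives a sharper, more ringing peak; wider bandwidths sound duller but settle faster.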

I like the explanation of the formant filters here:
http://axefxwiki.guitarlogic.org/ind...Formant_filter

liteon

cern.th.skei · 01-13-2009, 08:19 PM

just a quick update, things are going forward. just a prototype, a p.o.c., only gibberish text so far, but i think it can become much better with a bit of tweaking and a proper formant/phoneme list

http://cernthskei.wordpress.com/2009...tzel-phonemes/

- ccernn

cern.th.skei · 01-14-2009, 08:01 PM

another update, a small preview:
http://dl.getdropbox.com/u/249632/cc...r_can_sing.mp3
- ccernn

liteon · 01-15-2009, 06:23 AM

Hehe I.A.M.R.O.B.O.T

It's getting better and better :-)

LOSER · 01-15-2009, 09:04 AM

That's awesome!

Justin · 01-15-2009, 10:52 AM

mmm that's really rad!

beingmf · 06-23-2009, 09:04 AM

must...have...this... other.wise...must...destroy...internet...

cern.th.skei · 06-24-2009, 10:51 AM

i have been planning to continue with the speech synth stuff, but wanted to find a good way to control all the phonemes and plosives and filters, etc... in realtime. i have some ideas (and i've learned a few things, and some tricks), so hopefully there will be an update, or a completely new, playable (speakable?) version soon.
- ccernn

chip mcdonald · 06-26-2009, 08:30 AM

It seems to me the "additive synthesis" approach everyone takes to speech synthesis is a ... terminally complex approach.

Has anyone tried to model the entire vocal tract, and get phonemes out of altering the model?

cern.th.skei · 06-26-2009, 09:28 AM

the model i'm using isn't that complex really. a saw oscillator for pitched voice, a noise generator for hissing sounds (aspiration, fricatives, etc), and a few bandpass filters (5) for shaping these sources (formants). a very simplified modeling of the vocal tract.

it's more subtractive synthesis than additive, i'd say.
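
A rough sketch of that kind of model, to show how little is needed: a saw plus noise source fed through five resonant filters. It is not the actual code from the prototype. The parallel filter layout, the `voicing` mix, the function names, and all bandwidths, levels and the two highest formant frequencies in `AH` are assumptions picked for illustration.

```python
import math
import random


def bandpass(samples, freq_hz, bandwidth_hz, sample_rate):
    """Two-pole resonator used as a crude bandpass (one formant)."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    b1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    b2 = -r * r
    gain = 1.0 - r  # rough level compensation, not an exact normalisation
    y1 = y2 = 0.0
    out = []
    for x in samples:
        y = gain * x + b1 * y1 + b2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def render(formants, pitch_hz, voicing, n_samples, sample_rate=44100, seed=0):
    """Mix saw and noise, then sum the outputs of parallel formant filters.

    voicing = 1.0 is pure saw (vowels), 0.0 is pure noise (hissing sounds).
    """
    rng = random.Random(seed)
    phase = 0.0
    source = []
    for _ in range(n_samples):
        saw = 2.0 * phase - 1.0
        noise = rng.uniform(-1.0, 1.0)
        source.append(voicing * saw + (1.0 - voicing) * noise)
        phase = (phase + pitch_hz / sample_rate) % 1.0
    out = [0.0] * n_samples
    for freq_hz, bandwidth_hz, level in formants:
        band = bandpass(source, freq_hz, bandwidth_hz, sample_rate)
        for i in range(n_samples):
            out[i] += level * band[i]
    return out


# Illustrative "ah"-like setting: (frequency Hz, bandwidth Hz, level).
AH = [(730, 80, 1.0), (1090, 90, 0.5), (2440, 120, 0.25),
      (3300, 150, 0.1), (3750, 200, 0.05)]
```

Calling `render(AH, 110, 1.0, 44100)` gives one second of a pitched vowel-like tone; lowering `voicing` toward 0.0 turns it into a whisper.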

the biggest problem/obstacle is controlling the 'movement' of all the parameters, and the timing of these.
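
The simplest way to 'move' parameters is to glide every value linearly from the current phoneme's settings to the next one's over a fixed number of control steps. A sketch of that, assuming each phoneme is just a tuple of numbers (formant frequencies, levels, and so on); the function name and the step count are made up, and real speech would likely need different timing per parameter.

```python
def glide(start, target, n_steps):
    """Linearly interpolate each parameter from start to target.

    Returns n_steps frames; the last frame equals target.
    """
    frames = []
    for step in range(1, n_steps + 1):
        t = step / n_steps
        frames.append(tuple(a + (b - a) * t for a, b in zip(start, target)))
    return frames
```

Each frame would then be used to recompute the filter settings before rendering the next block of samples.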

i haven't tried 'proper' physical modeling of the human voice - that sounds quite complex, and academic...