Tried to extract my own glottal pulse to make the synth sound more human.
-
Tried to extract my own glottal pulse to make the synth sound more human. Learned my voice is too gentle for radio. Sadness fills my soul. That's probably why I didn't stick with radio shows.
I recorded sustained vowels and used IAIF (Iterative Adaptive Inverse Filtering) to extract my glottal waveform - the raw "buzz" before your throat shapes it into vowels.
What I expected: Rich, characterful human excitation to replace the mathematical model.
What I got: A softer, breathier sound than pure math!
The mathematical LF model with sharpness cranked to 10 actually produces MORE harmonics than my actual voice does. That "chest resonant radio announcer" sound? That's aggressive glottal snap that not everyone has.
@Tamasg Ooo, audio please.
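To put a rough number on the "sharper closure means more harmonics" point, here is a minimal sketch. It does not implement the LF model: `toy_pulse` is a made-up stand-in shape, and `close_frac`, `split` and `count` are hypothetical parameters chosen only for illustration. It compares how much of the energy sits in the upper harmonics when the pulse closes abruptly versus gently.

```python
import cmath
import math


def toy_pulse(n=256, open_frac=0.6, close_frac=0.1):
    """One period of a toy glottal flow pulse (NOT the LF model).

    The flow rises smoothly for open_frac of the period, then falls back
    to zero over close_frac of the period. A smaller close_frac means a
    more abrupt closure.
    """
    n_open = int(n * open_frac)
    n_close = max(1, int(n * close_frac))
    pulse = [0.0] * n
    for i in range(n_open):
        pulse[i] = 0.5 * (1.0 - math.cos(math.pi * i / n_open))
    for i in range(n_close):
        pulse[n_open + i] = math.cos(0.5 * math.pi * i / n_close)
    return pulse


def harmonic_magnitudes(period, count=40):
    """Magnitudes of harmonics 1..count via a direct DFT of one period."""
    n = len(period)
    mags = []
    for k in range(1, count + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(period))
        mags.append(abs(acc) / n)
    return mags


def high_harmonic_share(period, split=10, count=40):
    """Fraction of harmonic energy (DC excluded) above harmonic `split`."""
    mags = harmonic_magnitudes(period, count)
    total = sum(m * m for m in mags)
    high = sum(m * m for m in mags[split:])
    return high / total


sharp = high_harmonic_share(toy_pulse(close_frac=0.02))
soft = high_harmonic_share(toy_pulse(close_frac=0.35))
print(f"sharp closure: {sharp:.4f}, soft closure: {soft:.4f}")
```

The abrupt closure behaves almost like a step, so its harmonics fall off slowly; the gentle closure rolls off much faster, which is one plausible reading of why a softer voice sounds breathier than the cranked-up model.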
-
@Tamasg @BorrisInABox Or it's like what softvoice did, all the different sources. Wait a minute! Is that the problem you're having with a female source? Do you need a real female glottal pulse to start from?
@x0 @BorrisInABox Sadly, I realized a female voice would require formant frequency tuning for the phonemes. Right now, if we just put a female glottal shape over that, at best it would sound aliased on top of the deeper, male-characteristic voice. Theoretically, a female voice with a sharper glottal closure would actually give us MORE harmonics to work with, not fewer! It would be genuinely interesting to compare, though: extract a female glottal pulse and see if the shape is meaningfully different.
-
@mcourcel lol sounds horrible!
-
@Tamasg Hehehehehe lolol, sort of like E-Speak.
-
@mcourcel yep. This gave me some real good insight into what Espeak did to fuck up SpeechPlayer, mainly changing its glottal source a lot. Hahahaha good lesson-learning!
-
@BorrisInABox @Tamasg I got A.Liv on the Surge discord to kindly work with my source material and the results are now called Exocat's Metalodon in the 3rd-party wavetables folder of Surge's factory data.
@BorrisInABox @Tamasg This is the raw source material, which I later trimmed and did some noise reduction on, and then A.Liv carefully turned it into something that was a consistent period to be turned into wavetables, I think at 2048 samples per frame.
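For anyone curious what "a consistent period ... at 2048 samples per frame" can look like in practice, here is a minimal sketch. It is not A.Liv's actual process, which isn't described in the thread; `resample_cycle` is a hypothetical helper that stretches one already-isolated pitch period to a fixed frame length with plain linear interpolation, treating the cycle as periodic.

```python
def resample_cycle(cycle, frame_len=2048):
    """Linearly interpolate one pitch period to frame_len samples.

    The cycle is treated as periodic, so the last sample interpolates
    back toward the first one instead of running off the end.
    """
    n = len(cycle)
    out = []
    for j in range(frame_len):
        pos = j * n / frame_len      # fractional read position in the source cycle
        i = int(pos)
        frac = pos - i
        a = cycle[i % n]
        b = cycle[(i + 1) % n]       # wraps around at the end of the cycle
        out.append(a + frac * (b - a))
    return out


# Example: a 441-sample cycle (100 Hz at 44100 Hz) becomes one 2048-sample frame.
source_cycle = [((i % 441) / 441.0) * 2.0 - 1.0 for i in range(441)]
frame = resample_cycle(source_cycle)
```

Linear interpolation is the simplest choice here; a real wavetable tool would likely use a higher-quality resampler and fix the loop point more carefully.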
-
@BorrisInABox @Tamasg So if you actually wanted to do that for whatever reason, I can send you the wavetables, which are already fixed-length single-cycle waveforms, unless you already have Surge.
-
@x0 @BorrisInABox @Tamasg Hehehe lololol! The spin sounds cool. Like a light saber.
-
@x0 @BorrisInABox lol! Can't even explain what it did, but it definitely introduces a metallic quality unlike any I've heard in speech synthesis before. Not even as tube-like as mine was when I tried it, but boy is it bad. That grindiness really shows through.
-
@Tamasg @BorrisInABox lmfaoooooo what, that's like the odd source of softvoice
-
@x0 @BorrisInABox lol this thing is a trip to use. It's, just... So gritty, so metallic, nothin' quite like it. So I'm keeping it at https://eurpod.com/synths/speechPlayer-brokenmachine.dll - though clear proof that with the right matching glottal source it can sound less tubey and more natural, just gotta find the right radio announcer-type glottal source

-
@BorrisInABox Oh cool! For the extraction I recorded 5 sounds:
1. "ahh" sustained at normal pitch (~5 sec)
2. "ahh" sustained at low pitch (~5 sec)
3. "ahh" sustained at high pitch (~5 sec)
4. "shhh" sustained fricative (~5 sec)
5. "th" sustained unvoiced (~3 sec)
The "ahh" vowels are for glottal pulse extraction at different F0s. The "sh" and "th" are for noise/frication characteristics.
Recording tips:
• Condenser or dynamic mic (I used a Blue Snowball, AT2005 was too noisy)
• Peaks around -5 to -8 dB (NOT quiet - my first attempt at -30 dB was useless)
• Steady volume, no vibrato
• Quiet room
• 44100 Hz, mono
The key is getting a clean, loud, boring sustained vowel - no expression, just pure steady tone. The more monotone the better for extraction!
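A quick way to sanity-check the level tip before sending a take: a minimal sketch, assuming the "-5 to -8 dB" figure means peak level relative to full scale (dBFS) and that the samples are 16-bit integers. `peak_dbfs` and `level_ok` are hypothetical helper names, not part of any tool mentioned in the thread.

```python
import math


def peak_dbfs(samples, full_scale=32768.0):
    """Peak level in dB relative to full scale for integer PCM samples."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")         # pure digital silence
    return 20.0 * math.log10(peak / full_scale)


def level_ok(samples, low=-8.0, high=-5.0):
    """True if the peak sits inside the suggested -8 to -5 dB window."""
    return low <= peak_dbfs(samples) <= high


# A take peaking at half of full scale lands at about -6 dB, inside the window.
print(level_ok([0, 16384, -12000]))
# A take peaking near -30 dB, like the useless first attempt, falls outside it.
print(level_ok([0, 1036, -500]))
```

The samples could come from Python's `wave` module or any audio library; only the peak value matters for this check.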
-
@BorrisInABox Small add-on for the voice recording set: raw audio only, please — no noise suppression, auto gain, compressor/limiter, or EQ. The boring part matters here: keep the vowel steady with no vibrato, because I’m aligning and averaging glottal cycles and pitch wobble makes the final source less crisp. If you can, include ~10 seconds of room tone (silence) in a file, so I can calibrate noise and hum. And when you record “th”, make it the “think” version (/θ/). Optional but very helpful: a sustained “zzzz” (/z/) and “vvvv” (/v/) so I can capture voicing + turbulence together for better “edge” control later. Hope this helps too. LOL if this works out your voice would be forever partially captured into a synth. LOL.
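Here is a minimal sketch of the "aligning and averaging glottal cycles" idea, to show why a steady pitch helps. It is not the actual extraction code: it assumes a fixed, known period in whole samples, and `best_shift` and `average_cycles` are hypothetical helpers. Each cycle is lined up against the first one by a circular shift, then all cycles are averaged sample by sample.

```python
import math


def best_shift(ref, cycle):
    """Circular shift of `cycle` that correlates best with `ref`."""
    n = len(ref)
    best, best_score = 0, float("-inf")
    for s in range(n):
        score = sum(ref[i] * cycle[(i + s) % n] for i in range(n))
        if score > best_score:
            best, best_score = s, score
    return best


def average_cycles(signal, period):
    """Cut `signal` into whole periods, align each to the first, and average."""
    cycles = [signal[k:k + period]
              for k in range(0, len(signal) - period + 1, period)]
    ref = cycles[0]
    total = [0.0] * period
    for c in cycles:
        s = best_shift(ref, c)
        for i in range(period):
            total[i] += c[(i + s) % period]
    return [t / len(cycles) for t in total]


# Demo: eight cycles of a sine with an alternating offset standing in for
# cycle-to-cycle disturbance; the offsets cancel in the average.
period = 50
base = [math.sin(2 * math.pi * i / period) for i in range(period)]
signal = []
for k in range(8):
    offset = 0.1 if k % 2 == 0 else -0.1
    signal.extend(x + offset for x in base)
averaged = average_cycles(signal, period)
```

With vibrato the period length changes from cycle to cycle, so fixed-length chunks no longer line up and the average smears the sharp part of the pulse, which matches the "less crisp" warning above.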
@Tamasg @BorrisInABox Ooo, a Boris voice synth coming soon!
-
@x0 LMAO you've got a dubstep washer! @BorrisInABox @Tamasg
-
@Scott @BorrisInABox @Tamasg Yup, as soon as I heard that I thought of some Skrillex shit and had to get it put into a synth. It was recorded in 2019, and in 2022 it finally happened. This is the demo that A.Liv made with it, everything except the supersaw and drums are the resulting tables.
-
@x0 @Scott @BorrisInABox ah no way that's really cool! You can totally hear the samples in there

-
@Tamasg @Scott @BorrisInABox Now feed that into a vocoder and have this gigantic radio voice going "search and destroy" and then a killer dubstep drop, it would be perfect.