Tried to extract my own glottal pulse to make the synth sound more human.
-
@mcourcel yep. This gave me some real good insight into what Espeak did to fuck up SpeechPlayer, mainly changing its glottal source a lot. Hahahaha good lesson-learning!
-
@BorrisInABox @Tamasg I got A.Liv on the Surge discord to kindly work with my source material and the results are now called Exocat's Metalodon in the 3rd-party wavetables folder of Surge's factory data.
@BorrisInABox @Tamasg This is the raw source material, which I later trimmed and did some noise reduction on. A.Liv then carefully conformed it to a consistent period so it could be turned into wavetables, I think at 2048 samples per frame.
-
@BorrisInABox @Tamasg So if you actually wanted to do that for whatever reason, I can send you the wavetables, which are already fixed-length single-cycle waveforms, unless you already have Surge.
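For anyone curious what "fixed-length single-cycle" means in practice, here is a minimal sketch of stretching one period of audio to a 2048-sample frame. It is not A.Liv's actual process, just plain linear interpolation; `resample_cycle`, `normalize` and `FRAME_SIZE` are made-up names, and the 2048 figure is the "I think" number from the thread.

```python
import math

FRAME_SIZE = 2048  # frame length quoted in the thread; treat it as an assumption


def resample_cycle(cycle, frame_size=FRAME_SIZE):
    """Stretch one period of audio to exactly frame_size samples.

    Uses linear interpolation and treats the cycle as periodic, so the
    last output samples interpolate back toward the first input sample.
    """
    n = len(cycle)
    if n < 2:
        raise ValueError("need at least two samples in the cycle")
    frame = []
    for i in range(frame_size):
        pos = i * n / frame_size   # fractional read position in the source cycle
        left = int(pos)
        frac = pos - left
        right = (left + 1) % n     # wrap around: the waveform is a closed loop
        frame.append((1.0 - frac) * cycle[left] + frac * cycle[right])
    return frame


def normalize(frame):
    """Scale so the largest absolute sample is 1.0 (silence is returned as-is)."""
    peak = max(abs(s) for s in frame)
    return frame if peak == 0 else [s / peak for s in frame]


# Usage: a 367-sample sine cycle (about 120 Hz at 44100 Hz) becomes one frame.
source = [math.sin(2 * math.pi * k / 367) for k in range(367)]
frame = normalize(resample_cycle(source))
print(len(frame))
```

Linear interpolation is the simplest choice here; a real wavetable tool would probably use a better resampler to avoid dulling the high harmonics.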
-
@x0 @BorrisInABox @Tamasg Hehehe lololol! The spin sounds cool. Like a light saber.
-
@x0 @BorrisInABox lol! Can't even explain what it did, but it definitely introduces a metallic quality unlike any I've heard in speech synthesis before. Not even as tube-like as it was when I tried mine, but boy is it bad. That grindiness really shows through.
-
@Tamasg @BorrisInABox lmfaoooooo what, that's like the odd source of softvoice
-
@x0 @BorrisInABox lol this thing is a trip to use. It's, just... So gritty, so metallic, nothin' quite like it. So I'm keeping it at https://eurpod.com/synths/speechPlayer-brokenmachine.dll - though clear proof that with the right matching glottal source it can sound less tubey and more natural, just gotta find the right radio announcer-type glottal source

-
@BorrisInABox Oh cool! For the extraction I recorded 5 sounds:
1. "ahh" sustained at normal pitch (~5 sec)
2. "ahh" sustained at low pitch (~5 sec)
3. "ahh" sustained at high pitch (~5 sec)
4. "shhh" sustained fricative (~5 sec)
5. "th" sustained unvoiced (~3 sec)
The "ahh" vowels are for glottal pulse extraction at different F0s. The "sh" and "th" are for noise/frication characteristics.
Recording tips:
• Condenser or dynamic mic (I used a Blue Snowball, AT2005 was too noisy)
• Peaks around -5 to -8 dB (NOT quiet - my first attempt at -30 dB was useless)
• Steady volume, no vibrato
• Quiet room
• 44100 Hz, mono
The key is getting a clean, loud, boring sustained vowel - no expression, just pure steady tone. The more monotone the better for extraction!
-
@BorrisInABox Small add-on for the voice recording set: raw audio only, please: no noise suppression, auto gain, compressor/limiter, or EQ. The boring part matters here: keep the vowel steady with no vibrato, because I’m aligning and averaging glottal cycles and pitch wobble makes the final source less crisp. If you can, include ~10 seconds of room tone (silence) in a file, so I can calibrate noise and hum. And when you record “th”, make it the “think” version (/θ/). Optional but very helpful: a sustained “zzzz” (/z/) and “vvvv” (/v/) so I can capture voicing + turbulence together for better “edge” control later. Hope this helps too. LOL if this works out your voice would be forever partially captured into a synth. LOL.
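To make the "aligning and averaging glottal cycles" part concrete, here is a rough sketch of the idea, not the actual extraction code. It assumes the period is already known, anchors each cycle on its strongest positive peak, stretches every cycle to the same length and averages them. It skips any inverse filtering of the vocal tract. All function names and the synthetic test signal are made up for illustration.

```python
import math


def find_anchors(signal, period):
    """One anchor per cycle: the strongest positive peak, tracked cycle to cycle."""
    if len(signal) < period:
        raise ValueError("signal is shorter than one period")
    anchors = [max(range(period), key=lambda i: signal[i])]
    while True:
        lo = anchors[-1] + (3 * period) // 4   # search window around the
        hi = anchors[-1] + (5 * period) // 4   # expected next peak
        if hi >= len(signal):
            break
        anchors.append(max(range(lo, hi + 1), key=lambda i: signal[i]))
    return anchors


def resample(cycle, length):
    """Linearly interpolate one cycle to a fixed length, treating it as periodic."""
    n = len(cycle)
    out = []
    for i in range(length):
        pos = i * n / length
        left = int(pos)
        frac = pos - left
        out.append((1 - frac) * cycle[left] + frac * cycle[(left + 1) % n])
    return out


def average_cycles(signal, period, length=256):
    """Cut at the anchors, stretch every cycle to `length`, and average them."""
    anchors = find_anchors(signal, period)
    cycles = [resample(signal[a:b], length) for a, b in zip(anchors, anchors[1:])]
    if not cycles:
        raise ValueError("signal is too short to contain a full cycle")
    return [sum(col) / len(cycles) for col in zip(*cycles)], len(cycles)


# Usage on synthetic data (every value here is invented for the demo).
PERIOD = 100
clean = [math.exp(-k / 10) for k in range(PERIOD)]          # one sharp pulse per cycle
noisy = [clean[i % PERIOD] + 0.01 * math.sin(1.2345 * i)    # small off-pitch tone as "hum"
         for i in range(50 * PERIOD)]
avg, n_cycles = average_cycles(noisy, PERIOD, length=PERIOD)
worst_single = max(abs(noisy[i] - clean[i]) for i in range(PERIOD))
worst_avg = max(abs(a - c) for a, c in zip(avg, clean))
print(n_cycles, worst_single, worst_avg)
```

The averaged cycle ends up much closer to the clean pulse than any single cycle, because the interference is not locked to the pitch. With vibrato the cycles no longer line up after stretching, which is presumably why the steady "boring" vowel matters.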
-
@Tamasg @BorrisInABox Ooo, a Boris voice synth coming soon!
-
@x0 LMAO you've got a dubstep washer! @BorrisInABox @Tamasg
-
@Scott @BorrisInABox @Tamasg Yup, as soon as I heard that I thought of some Skrillex shit and had to get it put into a synth. It was recorded in 2019, and in 2022 it finally happened. This is the demo that A.Liv made with it, everything except the supersaw and drums are the resulting tables.
-
@x0 @Scott @BorrisInABox ah no way that's really cool! You can totally hear the samples in there

-
@Tamasg @Scott @BorrisInABox Now feed that into a vocoder and have this gigantic radio voice going "search and destroy" and then a killer dubstep drop, it would be perfect.
-
@BorrisInABox Excellent! These are CLEAN recordings! Look at those noise floors: down to -55.9 dB on ah_normal.wav! And the F0 stability is fantastic (±1.5 Hz on the normal pitch).
Key observations:
Recording | F0 | Notes
ah_normal.wav | 119.4 Hz (±1.5) | Best candidate - great level, super stable, lowest noise floor
ah_normal_take2.wav | 115.7 Hz (±1.5) | Also excellent, slightly higher noise floor
ah_lower.wav | 89.8 Hz (±8.5) | More pitch variation - less stable
ah_lower_take2.wav | 90.6 Hz (±1.7) | Much more stable than take1!
ah_higher.wav | 215.3 Hz (±3.1) | Good for testing F0 invariance
ah_higher_take2.wav | 224.5 Hz (±3.2) | Hottest levels (-2.6 dB peak)
I'll use ah_normal.wav as the primary source. It has the best combination of:
• Stable F0 (±1.5 Hz)
• Good level (-11.9 dB peak, plenty of headroom)
• Lowest noise floor (-55.9 dB)
• Nice male radio voice F0 (~120 Hz)
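Numbers like the peak level and the F0 with its ± spread can be approximated with something as simple as the sketch below. This is not the tool that produced the figures above; it is a bare-bones autocorrelation pitch tracker, and the function names, the 75-300 Hz search range and the 8000 Hz test tone are all assumptions picked to keep the demo small. Real pitch trackers are a lot more robust against octave errors.

```python
import math
import statistics


def peak_dbfs(samples):
    """Peak level in dB relative to full scale (samples assumed in -1.0..1.0)."""
    peak = max(abs(s) for s in samples)
    return float("-inf") if peak == 0 else 20 * math.log10(peak)


def frame_f0(frame, sample_rate, window, min_lag, max_lag):
    """F0 of one frame: the lag with the highest autocorrelation over a fixed window."""
    def corr(lag):
        return sum(frame[n] * frame[n + lag] for n in range(window))
    best = max(range(min_lag, max_lag + 1), key=corr)
    return sample_rate / best


def f0_stats(samples, sample_rate, fmin=75.0, fmax=300.0, window=256):
    """Mean and (population) standard deviation of frame-wise F0, in Hz."""
    min_lag = int(sample_rate / fmax)   # shortest period searched (roughly fmax)
    max_lag = int(sample_rate / fmin)   # longest period searched (roughly fmin)
    hop = window + max_lag              # samples consumed per frame
    f0s = [
        frame_f0(samples[start:start + hop], sample_rate, window, min_lag, max_lag)
        for start in range(0, len(samples) - hop + 1, hop)
    ]
    return statistics.mean(f0s), statistics.pstdev(f0s)


# Usage on a synthetic tone: period of exactly 64 samples at 8000 Hz, i.e. 125 Hz.
SAMPLE_RATE = 8000
tone = [0.25 * math.sin(2 * math.pi * n / 64) for n in range(SAMPLE_RATE)]
mean_f0, spread = f0_stats(tone, SAMPLE_RATE)
print(f"peak {peak_dbfs(tone):.1f} dBFS, F0 {mean_f0:.1f} Hz (±{spread:.1f})")
```

On a real sustained vowel the per-frame estimates wander a little, and that wander is what a "±1.5 Hz" style figure summarizes.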
Huge thanks for this. We'll see how it goes.
-
@BorrisInABox so here's the big difference.
Existing (my voice): the pulse drops from 1.0 to ~0.09 in ~28 samples, but then oscillates.
Yours: the pulse is cleaner, a single clear cycle without the multi-peak oscillations. This should give a more predictable harmonic structure.
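A comparison like "drops to ~0.09 in ~28 samples, then oscillates" can be checked with a couple of small helpers. This is only a sketch: `decay_samples` and `count_rebounds` are hypothetical names, and both pulses below are synthetic stand-ins (a plain exponential, and the same exponential with some added ringing), not the real extracted pulses.

```python
import math


def decay_samples(pulse, level=0.09):
    """Samples from the peak until |pulse| first falls to `level` times the peak."""
    peak_idx = max(range(len(pulse)), key=lambda i: abs(pulse[i]))
    target = level * abs(pulse[peak_idx])
    for i in range(peak_idx, len(pulse)):
        if abs(pulse[i]) <= target:
            return i - peak_idx
    return None  # never decays that far within this cycle


def count_rebounds(pulse, level=0.09):
    """How many times |pulse| climbs back above `level` x peak after the first decay."""
    peak_idx = max(range(len(pulse)), key=lambda i: abs(pulse[i]))
    target = level * abs(pulse[peak_idx])
    decay = decay_samples(pulse, level)
    if decay is None:
        return 0
    rebounds = 0
    above = False
    for i in range(peak_idx + decay, len(pulse)):
        now_above = abs(pulse[i]) > target
        if now_above and not above:   # count each upward crossing once
            rebounds += 1
        above = now_above
    return rebounds


# Synthetic stand-ins: same initial decay, but one keeps ringing afterwards.
clean = [math.exp(-k / 11.5) for k in range(120)]
ringing = [
    clean[k] + (0.2 * math.sin(2 * math.pi * (k - 40) / 20) * math.exp(-(k - 40) / 50)
                if k >= 40 else 0.0)
    for k in range(120)
]
print(decay_samples(clean), count_rebounds(clean))
print(decay_samples(ringing), count_rebounds(ringing))
```

Both pulses reach the threshold after the same number of samples, but only the ringing one bounces back above it, which is the kind of multi-peak behavior described for the existing source.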