Tried to extract my own glottal pulse to make the synth sound more human.

Joshua

@Tamasg @mcourcel oh god that sounds so bad

Tamas G

@mcourcel yep. This gave me some real good insight into what Espeak did to fuck up SpeechPlayer, mainly changing its glottal source a lot. Hahahaha good lesson-learning!

x0

@BorrisInABox @Tamasg This is the raw source material, which I later trimmed and did some noise reduction on. Then A.Liv carefully turned it into something with a consistent period so it could be made into wavetables, I think at 2048 samples per frame.
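
For anyone curious what "a consistent period at 2048 samples per frame" could look like in practice, here is a minimal sketch, not A.Liv's actual process. It assumes you have already cut out exactly one pitch period, and it uses plain linear interpolation from the Python standard library. The function name `resample_cycle` and the 400-sample sine input are made up for illustration.

```python
import math

def resample_cycle(cycle, frame_len=2048):
    """Stretch one pitch period to frame_len samples with linear interpolation.

    The cycle is treated as periodic, so the last output samples
    interpolate back toward cycle[0].
    """
    n = len(cycle)
    out = []
    for i in range(frame_len):
        pos = i * n / frame_len      # fractional read position in the source
        i0 = int(pos)
        frac = pos - i0
        a = cycle[i0]
        b = cycle[(i0 + 1) % n]      # wrap around: single-cycle waveforms loop
        out.append(a + (b - a) * frac)
    return out

# Hypothetical input: one 400-sample period (stand-in for a real voice cycle)
period = [math.sin(2 * math.pi * k / 400) for k in range(400)]
frame = resample_cycle(period)
print(len(frame))
```

A real wavetable workflow would probably use a better interpolator (windowed sinc or FFT resampling) to avoid dulling the highs, but the idea is the same: every frame ends up the same length regardless of the original pitch.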

x0

@BorrisInABox @Tamasg So if you actually wanted to do that for whatever reason, I can send you the wavetables, which are already fixed-length single-cycle waveforms, unless you already have Surge.

Alex Chapman

@J3317 @Tamasg @mcourcel Lmfao that should be an extra voice added for the lols

Martin

@x0 @BorrisInABox @Tamasg Hehehe lololol! The spin sounds cool. Like a light saber.

Tamas G

@x0 @BorrisInABox lol! Can't even explain what it did, but it definitely introduces a metallic quality unlike any I've heard in speech synthesis before. Not even as tube-like as mine was when I tried it, but boy is it bad. That grindiness really shows through.

x0

@Tamasg @BorrisInABox lmfaoooooo what, that's like the odd source of softvoice

Tamas G

@x0 @BorrisInABox lol this thing is a trip to use. It's, just... So gritty, so metallic, nothin' quite like it. So I'm keeping it at https://eurpod.com/synths/speechPlayer-brokenmachine.dll - though clear proof that with the right matching glottal source it can sound less tubey and more natural, just gotta find the right radio announcer-type glottal source

Tamas G

@BorrisInABox Small add-on for the voice recording set: raw audio only, please — no noise suppression, auto gain, compressor/limiter, or EQ. The boring part matters here: keep the vowel steady with no vibrato, because I’m aligning and averaging glottal cycles and pitch wobble makes the final source less crisp. If you can, include ~10 seconds of room tone (silence) in a file, so I can calibrate noise and hum. And when you record “th”, make it the “think” version (/θ/). Optional but very helpful: a sustained “zzzz” (/z/) and “vvvv” (/v/) so I can capture voicing + turbulence together for better “edge” control later. Hope this helps too. LOL if this works out your voice would be forever partially captured into a synth. LOL.
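
A rough sketch of the "aligning and averaging glottal cycles" idea, to show why pitch wobble matters. This is a simplified illustration under stated assumptions, not the actual extraction code: cycle boundaries are found with upward zero crossings (real recordings would likely need something sturdier), each cycle is stretched to a common length, and the stretched cycles are averaged. All function names and the synthetic test signal are hypothetical.

```python
import math

def find_cycle_starts(signal, min_gap):
    """Upward zero crossings at least min_gap samples apart (rough cycle boundaries)."""
    starts = []
    for i in range(1, len(signal)):
        rising = signal[i - 1] < 0.0 <= signal[i]
        if rising and (not starts or i - starts[-1] >= min_gap):
            starts.append(i)
    return starts

def to_fixed_length(cycle, length):
    """Linearly interpolate one cycle to a fixed length, treating it as periodic."""
    n = len(cycle)
    out = []
    for i in range(length):
        pos = i * n / length
        i0 = int(pos)
        frac = pos - i0
        a, b = cycle[i0], cycle[(i0 + 1) % n]
        out.append(a + (b - a) * frac)
    return out

def average_cycles(signal, starts, length=256):
    """Average all complete cycles between consecutive start points."""
    cycles = [to_fixed_length(signal[a:b], length) for a, b in zip(starts, starts[1:])]
    return [sum(column) / len(cycles) for column in zip(*cycles)]

# Synthetic stand-in for a sustained vowel: periods wobble between 98 and 102 samples
periods = [98, 100, 102, 100, 98, 102]
signal = [math.sin(2 * math.pi * k / p) for p in periods for k in range(p)]
starts = find_cycle_starts(signal, min_gap=50)
avg = average_cycles(signal, starts)
print(len(starts), len(avg))
```

The more the period wobbles, the more the stretching step has to warp each cycle before averaging, and any misalignment smears sharp features like the closure edge. That is the reason for asking for a steady vowel with no vibrato.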

Martin

@Tamasg @BorrisInABox Ooo, a Boris voice synth coming soon!

Borris

@mcourcel @Tamasg It's fake news.

Scott

@x0 LMAO you've got a dubstep washer! @BorrisInABox @Tamasg

x0

@Scott @BorrisInABox @Tamasg Yup, as soon as I heard that I thought of some Skrillex shit and had to get it put into a synth. It was recorded in 2019, and in 2022 it finally happened. This is the demo that A.Liv made with it, everything except the supersaw and drums are the resulting tables.

Tamas G

@x0 @Scott @BorrisInABox ah no way that's really cool! You can totally hear the samples in there

x0

@Tamasg @Scott @BorrisInABox Now feed that into a vocoder and have this gigantic radio voice going "search and destroy" and then a killer dubstep drop, it would be perfect.

Borris

@Tamasg Have a thing.
https://www.dropbox.com/scl/fi/06xusmq45tjvddimav861/glottles.wav?rlkey=opxqxp3ruhb80qdgwva5eoyzl&dl=1

Tamas G

@BorrisInABox Excellent! These are CLEAN recordings! Look at those noise floors: down to -55.9 dB on ah_normal.wav! And the F0 stability is fantastic (±1.5 Hz on the normal pitch).
Key observations:
Recording | F0 | Notes
ah_normal.wav | 119.4 Hz (±1.5) | Best candidate - great level, super stable, lowest noise floor
ah_normal_take2.wav | 115.7 Hz (±1.5) | Also excellent, slightly higher noise floor
ah_lower.wav | 89.8 Hz (±8.5) | More pitch variation - less stable
ah_lower_take2.wav | 90.6 Hz (±1.7) | Much more stable than take1!
ah_higher.wav | 215.3 Hz (±3.1) | Good for testing F0 invariance
ah_higher_take2.wav | 224.5 Hz (±3.2) | Hottest levels (-2.6 dB peak)
I'll use ah_normal.wav as the primary source — it has the best combination of:
• Stable F0 (±1.5 Hz)
• Good level (-11.9 dB peak, plenty of headroom)
• Lowest noise floor (-55.9 dB)
• Nice male radio voice F0 (~120 Hz)
Huge thanks for this. We'll see how it goes.
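
For reference, numbers like these (peak level, noise floor, mean F0 and its spread) can be measured along the following lines. This is only a sketch and not necessarily how the figures above were produced. It assumes levels are in dB relative to full scale (1.0), uses a basic autocorrelation pitch picker, and runs on synthetic test signals rather than the real recordings. The function names, the 8 kHz sample rate, and the 75 to 400 Hz search range are all illustrative choices.

```python
import math

def peak_db(samples):
    """Peak level in dB relative to full scale (1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_db(samples):
    """RMS level in dB re full scale; on room tone this is a noise-floor estimate."""
    mean_sq = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_sq)

def estimate_f0(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Pick the lag with the highest mean autocorrelation and convert it to Hz."""
    lo = int(sample_rate / fmax)
    hi = int(sample_rate / fmin)
    best_lag, best_r = lo, float("-inf")
    for lag in range(lo, hi + 1):
        n = len(frame) - lag
        r = sum(frame[i] * frame[i + lag] for i in range(n)) / n
        if r > best_r:
            best_lag, best_r = lag, r
    return sample_rate / best_lag

def f0_stats(samples, sample_rate, frame_len=800):
    """Mean F0 and standard deviation across non-overlapping frames."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    f0s = [estimate_f0(f, sample_rate) for f in frames]
    mean = sum(f0s) / len(f0s)
    spread = math.sqrt(sum((f - mean) ** 2 for f in f0s) / len(f0s))
    return mean, spread

# Synthetic stand-ins: a steady 125 Hz tone and very quiet "room tone"
sr = 8000
voiced = [0.25 * math.sin(2 * math.pi * 125 * n / sr) for n in range(4000)]
room_tone = [0.001, -0.001] * 2000

mean_f0, spread = f0_stats(voiced, sr)
print(round(peak_db(voiced), 1), round(rms_db(room_tone), 1), mean_f0, spread)
```

Integer lags limit the pitch resolution at low sample rates, so a real analysis would normally interpolate around the autocorrelation peak or use a dedicated pitch tracker.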

Tamas G

@BorrisInABox so here's the big difference. Existing (my voice): the pulse drops from 1.0 to ~0.09 in ~28 samples, but then oscillates.
Your pulse is cleaner: a single clear cycle without the multi-peak oscillations. This should give a more predictable harmonic structure.
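
To illustrate the point about oscillations and harmonic structure, here is a toy comparison. Both pulses are synthetic and made up for the example (neither is the real extracted data): a plain exponential decay tuned to fall from 1.0 to about 0.09 in 28 samples, and the same decay with a ringing component added. The helper names and the ringing parameters are assumptions.

```python
import math

def samples_to_decay(pulse, threshold=0.09):
    """Count samples from the peak until |pulse| first drops to the threshold."""
    peak = max(range(len(pulse)), key=lambda i: abs(pulse[i]))
    for i in range(peak, len(pulse)):
        if abs(pulse[i]) <= threshold:
            return i - peak
    return None

def harmonic_levels(cycle, count=32):
    """Magnitudes of harmonics 1..count of a single-cycle waveform (plain DFT)."""
    n = len(cycle)
    levels = []
    for h in range(1, count + 1):
        re = sum(x * math.cos(2 * math.pi * h * k / n) for k, x in enumerate(cycle))
        im = sum(x * math.sin(2 * math.pi * h * k / n) for k, x in enumerate(cycle))
        levels.append(math.hypot(re, im))
    return levels

def rolls_off_smoothly(levels):
    """True if every harmonic is no louder than the one below it."""
    return all(b <= a for a, b in zip(levels, levels[1:]))

N = 256
# Synthetic "clean" pulse: a single exponential decay
clean = [math.exp(-k / 11.6) for k in range(N)]
# Synthetic "ringing" pulse: same decay plus a decaying oscillation at harmonic 16
ringing = [c + 0.3 * math.exp(-k / 40) * math.sin(2 * math.pi * 16 * k / N)
           for k, c in enumerate(clean)]

print(samples_to_decay(clean))                        # 28 for this synthetic pulse
print(rolls_off_smoothly(harmonic_levels(clean)))     # True
print(rolls_off_smoothly(harmonic_levels(ringing)))   # False
```

In this toy case the ringing shows up as a bump around the harmonic it rings at, instead of a smooth rolloff. A real pulse with several oscillation peaks would be messier, but the effect is the same kind of thing: extra resonance baked into the source before the formant filters even get involved.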