Sadness fills my soul.

ROMMIX

@Tamasg Why so quick to give up?

Yadiel Sotomayor

@Tamasg To my ears, and they are not the best ears, it's too muddy. I have a hard time teasing words apart. The sound is fine; I think it's the diction. But this is coming from the guy who uses Samantha Compact on everything except my work computer, where I can't install any speech engines besides Eloquence.

Tamas G

@rommix0 well, now I think the DSP isn't the issue, it's the phonemes and the passes. DECTalk knows things like "desktop" getting a 60% reduction in fricative burst compared to the word "start." That's what I'm missing. Coarticulation is great, but it just pulls vowels and consonants more naturally to their targets. These other engines had way more sophisticated rules than just "same fricative / affricate durations" passed to the DSP. I've understood a lot more about how pitch controls prosody, but there's this ingredient missing and I don't think I'll ever find it lol

ROMMIX

@Tamasg Have you considered adding formant target readjustment rules to your program? That's something DECTalk did, especially for back vowels after alveolar sounds.

Muchancho del subsuelo

@Tamasg It doesn't need to sound like Eloquence. I really like how it is becoming more intelligible with each version. It has the potential to be even better than Espeak! In its current state it is way more pleasant to hear than Espeak for sure, it just needs to improve pronunciation.

Tamas G

@rommix0 yeah, I think the CMUDict work is leading me towards this. It's shown me that part of my problem is just getting vowel stress and cluster targets from Espeak's IPA rather than from actual word breakdowns made by linguists who studied this deeply. So the data file is just all the words, broken down into IPA notation through a Python script, with Espeak tie-bars and such. Things like "'frisco ˈfɹɪskoʊ". Rewriting the rules like that first, and then not doing an overlay to "correct" for Espeak's quirks, is where this is heading, and along with that I think some more formant target passes can come. We have the EndCF1-3 and EndPF1-3 wired up per frame now, but obviously wiring it up isn't the same thing as using it right.
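
For readers following along, here is a minimal sketch of the kind of conversion described above: turning a CMUdict-style line (word followed by ARPABET phones with stress digits) into an IPA string. This is not the project's actual script. The function names, the partial phone table, and the stress-placement rule are all assumptions made for illustration. In particular, the stress mark is placed naively before every consonant since the previous vowel, which happens to work for "frisco" but will misplace the mark in words with medial clusters. Tie-bars are not handled.

```python
# Partial ARPABET -> IPA table (illustrative only, not the project's mapping).
ARPABET_TO_IPA = {
    "F": "f", "R": "ɹ", "S": "s", "K": "k", "T": "t", "D": "d", "P": "p",
    "IH": "ɪ", "OW": "oʊ", "AA": "ɑ", "EH": "ɛ",
}
VOWELS = {"IH", "OW", "AA", "EH"}
# CMUdict stress digits: 1 = primary, 2 = secondary, 0 = unstressed.
STRESS_MARKS = {"1": "ˈ", "2": "ˌ", "0": ""}


def arpabet_to_ipa(phones):
    """Convert a list of ARPABET phones (e.g. ["F", "R", "IH1"]) to IPA."""
    out = []
    onset_start = 0  # index in `out` where the current syllable's onset begins
    for phone in phones:
        base = phone.rstrip("012")   # strip the stress digit, if any
        stress = phone[len(base):]
        ipa = ARPABET_TO_IPA[base]
        if base in VOWELS:
            mark = STRESS_MARKS.get(stress, "")
            if mark:
                # Naive assumption: every consonant since the last vowel
                # belongs to this syllable's onset.
                out.insert(onset_start, mark)
            out.append(ipa)
            onset_start = len(out)
        else:
            out.append(ipa)
    return "".join(out)


def convert_line(line):
    """Split a 'WORD  PH1 PH2 ...' line into (lowercase word, IPA string)."""
    word, *phones = line.split()
    return word.lower(), arpabet_to_ipa(phones)


print(convert_line("FRISCO  F R IH1 S K OW0"))
```

A real version would need the full phone table and a proper syllabifier for stress placement.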

ROMMIX

@Tamasg You might want to check this file too. It's got some good stuff on target adjustment.
https://github.com/dectalk/DECtalkMini/blob/dectalk-develop/include/p_us_st0.c

Alex Chapman

@muchanchoasado @Tamasg Exactly what I was thinking, over time this is getting better.

Tamas G

@rommix0 so it looks like DECTalk used a 3-layer approach to modifying this. I have Layer 2 well defined, but not the first layer or the third one. Layer 1 is specific, large Hz offsets for known phoneme pairs, not computed from a formula but hard-coded. Then Layer 3 is the forward and backward rules. Very helpful to know.
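
To make the layering concrete, here is a rough sketch of how three such passes over a single F2 target could be stacked. It is not taken from the DECtalk source. Every phoneme pair, every Hz value, the form of Layer 2 (assumed here to be a generic pull toward the previous target), and the order of the layers are invented for illustration. The one grounded idea is the forward rule for back vowels after alveolar sounds mentioned earlier in the thread.

```python
ALVEOLARS = {"t", "d", "n", "s", "z"}
BACK_VOWELS = {"u", "ʊ", "oʊ", "ɔ", "ɑ"}

# Layer 1: large, hard-coded Hz offsets for specific (previous, current) pairs.
# The pair and the value here are hypothetical.
PAIR_F2_OFFSETS_HZ = {("j", "u"): 400.0}


def layer1_pair_offset(prev, cur, f2):
    """Table lookup, no formula: add a fixed offset for known pairs."""
    return f2 + PAIR_F2_OFFSETS_HZ.get((prev, cur), 0.0)


def layer2_coarticulate(f2, neighbor_f2, pull=0.25):
    """Assumed form of Layer 2: move a fraction of the way to a neighbour."""
    if neighbor_f2 is None:
        return f2
    return f2 + pull * (neighbor_f2 - f2)


def layer3_context_rules(prev, cur, nxt, f2):
    """Class-based rules that look one phone back or one phone ahead."""
    # Forward rule: the previous phone changes the current target
    # (back vowel after an alveolar; the 150 Hz figure is made up).
    if prev in ALVEOLARS and cur in BACK_VOWELS:
        f2 += 150.0
    # Backward rule: the next phone changes the current target
    # (hypothetical example: back vowel before "l").
    if nxt == "l" and cur in BACK_VOWELS:
        f2 -= 100.0
    return f2


def adjust_f2(prev, cur, nxt, f2, prev_f2=None):
    """Run the three layers in sequence on one F2 target (order assumed)."""
    f2 = layer1_pair_offset(prev, cur, f2)
    f2 = layer2_coarticulate(f2, prev_f2)
    return layer3_context_rules(prev, cur, nxt, f2)


print(adjust_f2("t", "u", "l", 900.0, prev_f2=1700.0))
```

Keeping the layers as separate functions makes it easy to switch one off and listen to what each pass contributes.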

Muchancho del subsuelo

@Tamasg @alexchapman Yeah, it is really exciting to follow the development of a new formant synth after a long time.

ROMMIX

@Tamasg Yeah it's good stuff. It's a good place to start.

Tamas G

@rommix0 ah, now that filename suddenly makes sense from earlier, nice handle change