Sadness fills my soul.

21 Posts 8 Posters 2 Views
  • Tamas G

    @FreakyFwoof lol but Eloquence is the gold standard, the one that so many can listen to without their ears getting exhausted. I think in the community DECTalk is probably a close second; I usually find people in either group, and then the more niche groups who like harsher things like Espeak or the hybrid formant-concatenated stuff. For many, SpeechBox just falls too far below DECTalk: above Espeak, but not comfortably enough to become their daily driver, because RHVoice or the other options are "good enough" for their ears and formant stuff is too robotic. Fair point in that way.

    Andre Louis
    wrote last edited by
    #5

    @Tamasg It's the opposite of my gold standard because I didn't grow up hearing it. I have NV Speech Player, but not the one you're talking about, unless it's the same thing? Always happy to have options.

    • Tamas G

      Sadness fills my soul. SpeechBox is never gonna sound like Eloquence. Might as well abandon it. People still find it too sharp and sibilant. Better synths will come along, perhaps something neural anyway. It's just a waste of time in the end. Sadness.

      Matt Campbell
      wrote last edited by
      #6

      @Tamasg It's still worthwhile to implement a synthesizer that runs with the same efficiency that Eloquence had. There are some appliances, medical devices, etc. that implement a GUI on a microcontroller. If sighted people can have a UI on a device with limited computing power, then we should be able to have a speech interface on that same class of device. That's why I don't like the idea of giving up on formant TTS and resigning ourselves to neural TTS being *the* future.

        Tamas G
        wrote last edited by
        #7

        @FreakyFwoof ha, SpeechBox (mostly) sounds the same. I've reduced a lot of the clickiness the old Sawtooth engine had, especially on the "D" phoneme and "T" endings. Very clicky. I am glad at least that I could accomplish that in the SpeechBox version, but I still couldn't move the needle on it sounding "more like Eloquence from the 90s than Eloquence from the 2000s" as someone commented to me the other day.

          Matt Campbell
          wrote last edited by
          #8

          @Tamasg Of course, it's up to you to decide whether you still find the project worthwhile.

            Andre Louis
            wrote last edited by
            #9

            @Tamasg Good. A new generation of blind kids can grow up with a new sound. Perfectly acceptable thing to do. Moving the needle forward not backward is very much acceptable.

              ROMMIX
              wrote last edited by
              #10

              @Tamasg Why so quick to give up?

                Yadiel Sotomayor
                wrote last edited by
                #11

                @Tamasg To my ears, and they are not the best ears, it's too muddy. I have a hard time teasing words apart. The sound is fine; I think it's the diction. But this is coming from the guy that uses Samantha Compact on everything except my work computer, where I can't install any speech engines besides Eloquence.

                  Tamas G
                  wrote last edited by
                  #12

                  @rommix0 well, now I think the DSP isn't the issue, it's the phonemes and the passes. DECTalk knows things like "desktop" getting a 60% reduction in fricative burst compared to the word "start." That's what I'm missing. Coarticulation is great, but it just pulls vowels and consonants more naturally to their targets. These other engines had way more sophisticated rules than just "same fricative / affricate durations" passed to the DSP. I've understood a lot more about how pitch controls prosody, but there's this ingredient missing and I don't think I'll ever find it lol
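To make the "desktop" vs. "start" point concrete, here is a minimal Python sketch of a context-dependent frication rule. Only the 60% figure comes from the post above. The condition used here (a fricative that is not word-initial and sits right before a stop), the phoneme sets, and the names `Segment`, `make_word`, and `apply_frication_rules` are assumptions made up for illustration, not DECTalk's actual rule.

```python
from dataclasses import dataclass

# Illustrative phoneme classes (assumed, not taken from any engine).
FRICATIVES = {"s", "z", "f", "v", "ʃ", "ʒ", "θ", "ð", "h"}
STOPS = {"p", "t", "k", "b", "d", "g"}


@dataclass
class Segment:
    phoneme: str
    word_initial: bool          # True for the first phoneme of a word
    duration_ms: float
    frication_gain: float = 1.0  # 1.0 means "full strength"


def make_word(phonemes, duration_ms=80.0):
    """Build segments for one word; every phoneme gets the same duration."""
    return [Segment(p, i == 0, duration_ms) for i, p in enumerate(phonemes)]


def apply_frication_rules(segments):
    """Return new segments with a context-dependent frication gain.

    Hypothetical rule: a fricative that is not word-initial and is
    followed by a stop keeps only 40% of its gain (a 60% reduction).
    """
    out = []
    for i, seg in enumerate(segments):
        nxt = segments[i + 1].phoneme if i + 1 < len(segments) else None
        gain = seg.frication_gain
        if seg.phoneme in FRICATIVES and nxt in STOPS and not seg.word_initial:
            gain *= 0.4
        out.append(Segment(seg.phoneme, seg.word_initial, seg.duration_ms, gain))
    return out


desktop = apply_frication_rules(make_word(["d", "ɛ", "s", "k", "t", "ɑ", "p"]))
start = apply_frication_rules(make_word(["s", "t", "ɑ", "ɹ", "t"]))
print(desktop[2].frication_gain, start[0].frication_gain)
```

Under this assumed rule the /s/ in "desktop" is scaled down while the word-initial /s/ in "start" stays at full gain. The point is only that the value handed to the DSP depends on the neighbours, instead of being one fixed number per phoneme.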

                    ROMMIX
                    wrote last edited by
                    #13

                    @Tamasg Have you considered adding formant target readjustment rules to your program? That's something DECTalk did, especially for back vowels after alveolar sounds.

                      Muchancho del subsuelo
                      wrote last edited by
                      #14

                      @Tamasg It doesn't need to sound like Eloquence. I really like how it is becoming more intelligible with each version. It has the potential to be even better than Espeak! In its current state it's way more pleasant to hear than Espeak for sure, it just needs to improve pronunciation.

                        Tamas G
                        wrote last edited by
                        #15

                        @rommix0 yeah, I think the CMUDict work is leading me towards this. It's shown me that part of my problem is just getting vowel stress and cluster targets from Espeak's IPA rather than from actual word breakdowns made by linguists who studied this deeply. So the data file is just all the words, converted by a Python script into IPA notation with Espeak-style tie-bars and such. Things like "'frisco ˈfɹɪskoʊ". Rewriting the rules like that first, and then not doing an overlay to "correct" for Espeak's quirks, is where that's heading, and along with that I think can come some more formant target passes. We have the EndCF1-3 and EndPF1-3 wired up per frame now, but obviously wiring it up isn't the same thing as using it right.
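The conversion script itself isn't shown in the thread, so the following is only a sketch of what a CMUDict-to-IPA step could look like. The mapping table, the function names `arpabet_to_ipa` and `parse_cmudict_line`, and the stress-placement shortcut (the mark goes before every consonant since the previous vowel) are assumptions for illustration. The input format is the usual CMUDict one: a word followed by ARPABET phones, with a stress digit 0, 1, or 2 on each vowel.

```python
# ARPABET -> IPA; affricates carry a tie-bar (U+0361).
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "EH": "ɛ", "ER": "ɝ", "EY": "eɪ", "IH": "ɪ", "IY": "i", "OW": "oʊ",
    "OY": "ɔɪ", "UH": "ʊ", "UW": "u",
    "B": "b", "CH": "t\u0361ʃ", "D": "d", "DH": "ð", "F": "f", "G": "ɡ",
    "HH": "h", "JH": "d\u0361ʒ", "K": "k", "L": "l", "M": "m", "N": "n",
    "NG": "ŋ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ", "T": "t",
    "TH": "θ", "V": "v", "W": "w", "Y": "j", "Z": "z", "ZH": "ʒ",
}
# Unstressed variants of two vowels.
REDUCED = {"AH": "ə", "ER": "ɚ"}
STRESS_MARKS = {"1": "ˈ", "2": "ˌ"}


def arpabet_to_ipa(phones):
    """Convert a list of ARPABET phones (e.g. ['F', 'R', 'IH1']) to IPA."""
    out = []          # IPA symbols emitted so far
    onset_start = 0   # index in `out` where the current syllable onset begins
    for ph in phones:
        if not ph[-1].isdigit():            # consonant: no stress digit
            out.append(ARPABET_TO_IPA[ph])
            continue
        base, stress = ph[:-1], ph[-1]      # vowel with stress digit
        if stress == "0" and base in REDUCED:
            ipa = REDUCED[base]
        else:
            ipa = ARPABET_TO_IPA[base]
        mark = STRESS_MARKS.get(stress)
        if mark:
            # Naive assumption: the onset is everything since the last vowel.
            out.insert(onset_start, mark)
        out.append(ipa)
        onset_start = len(out)
    return "".join(out)


def parse_cmudict_line(line):
    """Split one CMUDict entry into (lowercased word, IPA string)."""
    word, *phones = line.split()
    return word.lower(), arpabet_to_ipa(phones)


print(parse_cmudict_line("'FRISCO  F R IH1 S K OW0"))
```

For the `'FRISCO` entry this reproduces the `ˈfɹɪskoʊ` form quoted above. The stress placement is deliberately crude: every consonant between two vowels is treated as the onset of the following syllable, which is wrong for many medial clusters, so a real script would need proper syllabification.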

                          ROMMIX
                          wrote last edited by
                          #16

                          @Tamasg You might want to check this file too. It's got some good stuff on target adjustment.
                          https://github.com/dectalk/DECtalkMini/blob/dectalk-develop/include/p_us_st0.c

                            Alex Chapman
                            wrote last edited by
                            #17

                            @muchanchoasado @Tamasg Exactly what I was thinking, over time this is getting better.

                              Tamas G
                              wrote last edited by
                              #18

                              @rommix0 so it looks like DECTalk used a 3-layer approach to modifying this. I have Layer 2 well defined, but not the first layer or the third one. Layer 1 is specific, large Hz offsets for known phoneme pairs, not computed from a formula but hard-coded. Then Layer 3 is the forward and backward rules. Very helpful to know.
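A rough Python sketch of how three such layers might be chained is below. It is not a port of `p_us_st0.c`. Every number is a placeholder, and the reading of each layer is an assumption: Layer 1 as a hard-coded lookup keyed on (previous, current) phoneme pairs, Layer 2 as a formula-based pull of a vowel's F2 toward the preceding consonant (the thread does not say what Layer 2 contains), and Layer 3 as rules keyed on the previous phoneme ("forward") or the next phoneme ("backward"). All names (`Target`, `layer1_pair_offsets`, and so on) are hypothetical.

```python
from dataclasses import dataclass, replace

VOWELS = {"i", "ɪ", "ɛ", "æ", "ɑ", "ɔ", "ʊ", "u", "ʌ", "ə"}
NO_SHIFT = (0.0, 0.0, 0.0)

# Layer 1 table: hard-coded (F1, F2, F3) offsets in Hz per phoneme pair.
# Placeholder values; e.g. a back vowel after an alveolar gets a raised F2.
PAIR_OFFSETS_HZ = {("t", "u"): (0.0, 300.0, 0.0), ("d", "u"): (0.0, 300.0, 0.0)}
# Layer 3 tables: offsets keyed on the previous / next phoneme (placeholders).
FORWARD_RULES_HZ = {"w": (0.0, -100.0, 0.0)}
BACKWARD_RULES_HZ = {"l": (0.0, -150.0, 0.0)}


@dataclass(frozen=True)
class Target:
    phoneme: str
    f1: float
    f2: float
    f3: float


def shift(t, offsets):
    """Return a copy of target `t` with per-formant Hz offsets added."""
    d1, d2, d3 = offsets
    return replace(t, f1=t.f1 + d1, f2=t.f2 + d2, f3=t.f3 + d3)


def layer1_pair_offsets(targets):
    """Apply hard-coded offsets for known (previous, current) pairs."""
    out = []
    for i, t in enumerate(targets):
        prev = targets[i - 1].phoneme if i > 0 else None
        out.append(shift(t, PAIR_OFFSETS_HZ.get((prev, t.phoneme), NO_SHIFT)))
    return out


def layer2_coarticulation(targets, pull=0.2):
    """Pull each vowel's F2 a fraction of the way toward a preceding consonant."""
    out = []
    for i, t in enumerate(targets):
        if i > 0 and t.phoneme in VOWELS and targets[i - 1].phoneme not in VOWELS:
            t = replace(t, f2=t.f2 + pull * (targets[i - 1].f2 - t.f2))
        out.append(t)
    return out


def layer3_forward_backward(targets):
    """Adjust vowels based on the previous (forward) and next (backward) phoneme."""
    out = []
    for i, t in enumerate(targets):
        if t.phoneme in VOWELS:
            if i > 0:
                t = shift(t, FORWARD_RULES_HZ.get(targets[i - 1].phoneme, NO_SHIFT))
            if i + 1 < len(targets):
                t = shift(t, BACKWARD_RULES_HZ.get(targets[i + 1].phoneme, NO_SHIFT))
        out.append(t)
    return out


def adjust_targets(targets):
    """Run the three layers in order."""
    return layer3_forward_backward(layer2_coarticulation(layer1_pair_offsets(targets)))


seq = [Target("t", 400.0, 1800.0, 2600.0),
       Target("u", 300.0, 900.0, 2300.0),
       Target("l", 350.0, 1000.0, 2800.0)]
for t in adjust_targets(seq):
    print(t)
```

Keeping each layer as its own pass over the target list makes it easy to switch one off and listen to what it contributes, which seems useful while the Layer 1 and Layer 3 tables are still being filled in.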

                                Muchancho del subsuelo
                                wrote last edited by
                                #19

                                @Tamasg @alexchapman Yeah, it is really exciting to follow the development of a new formant synth after a long time.

                                  ROMMIX
                                  wrote last edited by
                                  #20

                                  @Tamasg Yeah it's good stuff. It's a good place to start.

                                    Tamas G
                                    wrote last edited by
                                    #21

                                    @rommix0 ah, now that filename suddenly makes sense from earlier, nice handle change 😄
