


auto-correct a speech-to-text output and relate to of the words based on syllables


On Fri, 02 Feb 2018 08:14:03 +0100, dieter wrote:

>> The user speaks "Light". The system translates it as "Bright" The user
>> speaks "White" The system translates it as "Bright"
> 
> As those words are phonetically quite apart (they have very different
> first consonants), some step in your processing chain does something
> seriously wrong.

I disagree: Light, Bright and White sound very similar. They're identical 
except for the first consonant:

/laɪt/
/bɹaɪt/
/waɪt/

and even those consonants sound very similar. Human beings can easily 
mishear or fail to distinguish between those words, e.g.:

https://www.wordnik.com/words/we%20tripped%20a%20light%20fan%20dangle

https://duckduckgo.com/?q=%22brighter+shade+of+pale%22+mondegreen


(the name of the song is *Whiter* Shade of Pale, not "Lighter" or 
"Brighter"). We should not assume that the first consonant is always 
correct.

Of course we would hope that a speech-to-text system would correctly
match Light/Bright/White/Fright/etc., but given the vagaries of human
accents and pronunciation, we shouldn't be surprised if it sometimes gets
them wrong.
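
To make the "they only differ at the start" point concrete, here is a rough
sketch that measures edit distance over phoneme sequences instead of
letters. The phoneme lists are typed in by hand for illustration (a real
system would presumably look them up in a pronunciation dictionary), and
the names PHONEMES, phoneme_distance and candidates are made up for this
example, not taken from any library.

```python
# Hand-written phoneme lists, for illustration only. A real system would
# take these from a pronunciation dictionary rather than hard-coding them.
PHONEMES = {
    "light": ["l", "aɪ", "t"],
    "bright": ["b", "ɹ", "aɪ", "t"],
    "white": ["w", "aɪ", "t"],
    "fright": ["f", "ɹ", "aɪ", "t"],
    "table": ["t", "eɪ", "b", "əl"],
}


def phoneme_distance(a, b):
    """Levenshtein distance between two sequences of phonemes."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # drop a phoneme from a
                            curr[j - 1] + 1,      # add a phoneme from b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def candidates(heard, max_distance=2):
    """Words within max_distance phoneme edits of the recognised word."""
    target = PHONEMES[heard]
    scored = [(phoneme_distance(target, phones), word)
              for word, phones in PHONEMES.items() if word != heard]
    return sorted((d, w) for d, w in scored if d <= max_distance)


if __name__ == "__main__":
    # If the recogniser returned "bright", which words are plausible
    # alternatives for what the user actually said?
    for distance, word in candidates("bright"):
        print(distance, word)
```

Treating every substitution as cost 1 is a simplification: it ignores that
some consonants are easier to confuse than others. A more careful version
would weight substitutions by how alike the two sounds are.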



-- 
Steve