
Amazon trains 980 million parameter LLM with ‘emergent abilities’

Researchers at Amazon have trained a new large language model (LLM) for text-to-speech that they claim exhibits “emergent” abilities. 

The 980 million parameter model, called BASE TTS, is the largest text-to-speech model yet created. The researchers trained models of various sizes on up to 100,000 hours of public domain speech data to see if they would observe the same performance leaps that occur in natural language processing models once they grow past a certain scale. 

They found that their medium-sized 400 million parameter model – trained on 10,000 hours of audio – showed a marked improvement in versatility and robustness on tricky test sentences.

The test sentences contained complex lexical, syntactic, and paralinguistic features like compound nouns, emotions, foreign words, and punctuation that normally trip up text-to-speech systems. While BASE TTS did not handle them perfectly, it made significantly fewer errors in stress, intonation, and pronunciation than existing models.

“These sentences are designed to contain challenging tasks—none of which BASE TTS is explicitly trained to perform,” explained the researchers. 

The largest 980 million parameter version of the model – trained on 100,000 hours of audio – did not demonstrate further abilities beyond the 400 million parameter version.

While still an experimental process, the work suggests that text-to-speech models can develop new abilities once model size and training data pass a certain threshold.
