I applied Chatterbox TTS's open-source voice cloning in a loop!
At iteration i, Chatterbox tries to reproduce the content, rhythm, and prosody of `output[i-2]` in the vocal style of `output[i-1]`.
The output gets quite bad after around ten minutes and is unrecognizable within about 20. By the middle, it starts sounding like some new human language; by the end, it is pure glossolalia.
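The loop described above can be sketched roughly as follows. This is a minimal illustration of the recurrence only, not the actual script: `feedback_loop` and the `convert(content, voice)` callable are hypothetical names, with `convert` standing in for whatever Chatterbox call performs the voice conversion, and the two seed clips are assumed to be supplied by the caller.

```python
from typing import Callable, List, TypeVar

Clip = TypeVar("Clip")  # any audio representation: a file path, a tensor, ...


def feedback_loop(
    seed_a: Clip,
    seed_b: Clip,
    convert: Callable[[Clip, Clip], Clip],
    steps: int,
) -> List[Clip]:
    """Run the two-term recurrence output[i] = convert(output[i-2], output[i-1]).

    `convert(content, voice)` is a hypothetical stand-in for the real
    voice-conversion call: it should return `content` re-voiced in the
    style of `voice`.
    """
    outputs = [seed_a, seed_b]  # output[0] and output[1]
    for _ in range(steps):
        content = outputs[-2]  # output[i-2]: supplies content, rhythm, prosody
        voice = outputs[-1]    # output[i-1]: supplies the vocal style
        outputs.append(convert(content, voice))
    return outputs
```

Because each step consumes the two previous outputs, any artifact the model introduces gets fed back in both as content and as target voice, which is presumably why the degradation compounds instead of levelling off.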