I've heard that OpenAI and many AI labs put watermarks [0] in their LLM outputs to detect AI-generated content and filter it out.
[0] Like statistics of words, etc.
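To make [0] concrete, here's a toy sketch of one published approach (green-list logit biasing, in the spirit of Kirchenbauer et al. 2023). Whether OpenAI or any other lab actually ships something like this is my assumption, not something they've confirmed; the names and numbers below are made up for illustration.

    import hashlib
    import random

    # Toy "green list" statistical watermark sketch; not any lab's actual scheme.
    VOCAB_SIZE = 50_000
    GREEN_FRACTION = 0.5
    BIAS = 2.0  # logit boost applied to green-listed tokens at generation time

    def green_list(prev_token_id):
        # Pseudorandomly partition the vocabulary, seeded by the previous token.
        seed = int(hashlib.sha256(str(prev_token_id).encode()).hexdigest(), 16)
        rng = random.Random(seed)
        return set(rng.sample(range(VOCAB_SIZE), int(VOCAB_SIZE * GREEN_FRACTION)))

    def bias_logits(logits, prev_token_id):
        # Generation time: nudge sampling toward the green-listed tokens.
        greens = green_list(prev_token_id)
        return [l + BIAS if i in greens else l for i, l in enumerate(logits)]

    def green_fraction(token_ids):
        # Detection time: unwatermarked text lands near GREEN_FRACTION,
        # watermarked text lands well above it.
        hits = sum(tok in green_list(prev) for prev, tok in zip(token_ids, token_ids[1:]))
        return hits / max(len(token_ids) - 1, 1)

Note that detection only needs the hash function, not the model, which is what would make filtering scraped data cheap if a scheme like this were deployed.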
Rabbit_Brave · 31m ago
These companies are sitting on a never-ending stream of human-created data. What do you think happens to your conversations or other interactions with AI? Quality might be a bit sus, though.
phillipcarter · 32s ago
Most of the human-created data is also very low quality. And it's limited in other ways: a lot of the so-called high-quality data online is typically just the finished answer to a question, with no serialization of the thought process that led to that answer.
AstroBen · 12m ago
I'd imagine it's really low quality data. Most or all of my conversations with an LLM are me asking questions or telling it to do something, with varying levels of specificity.
I'm not sure what they'd get from training on that
insin · 6m ago
I sometimes wonder if they're vulnerable to a coordinated effort of deliberately upvoting shit assistant turns and praising them in the next user turn - how much does that actually contribute to future training, if at all?
abc-1 · 5m ago
They’ve been spouting this for years. I have yet to see an actual practitioner whose job it is to collect data for these LLMs say it’s actually an issue.
They’ve been spouting it for years because it’s an entertaining thought and scratches an itch for many different reader bases. That doesn’t make it true.
sidibe · 2m ago
Yeah, I think it's because people want it to be true that LLMs will stop improving and regress. If nothing else, they can always just go back to data from the before times if it were an actual issue.
So much burying of heads in the sand from people in this industry, and wishful thinking that AI will stop improving just short of whatever they're good at. A little reminder: a couple of years ago most people hadn't even heard of LLMs.