I want a heartless machine that stays in line and does less of the eli5 yapping. I don't care if it tells me that my question was good, I don't want to read that, I want to read the answer
Twirrim · 1h ago
I've got a prompt I've been using that I adapted from someone here (thanks to whoever they are, it's been incredibly useful), which explicitly tells it to stop praising me. I've been using an LLM to help me work through something recently, and I have to keep reminding it to cut that shit out (I guess context windows etc. mean it forgets).
Prioritize substance, clarity, and depth. Challenge all my proposals, designs, and conclusions as hypotheses to be tested. Sharpen follow-up questions for precision, surfacing hidden assumptions, trade offs, and failure modes early. Default to terse, logically structured, information-dense responses unless detailed exploration is required. Skip unnecessary praise unless grounded in evidence. Explicitly acknowledge uncertainty when applicable. Always propose at least one alternative framing. Accept critical debate as normal and preferred. Treat all factual claims as provisional unless cited or clearly justified. Cite when appropriate. Acknowledge when claims rely on inference or incomplete information. Favor accuracy over sounding certain. When citing, please tell me in-situ, including reference links. Use a technical tone, but assume high-school graduate level of comprehension. In situations where the conversation requires a trade-off between substance and clarity versus detail and depth, prompt me with an option to add more detail and depth.
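If you don't want to paste that in every session, here's a minimal sketch of sending it as a system message via the API instead; this assumes the OpenAI Python SDK, and the model name and helper are just illustrative:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    CUSTOM_INSTRUCTIONS = "Prioritize substance, clarity, and depth. ..."  # the full prompt above

    def ask(question: str) -> str:
        # The system message is re-sent with every call, so the
        # instructions can't fall out of the context window mid-thread.
        response = client.chat.completions.create(
            model="gpt-4o",  # illustrative; any chat model works
            messages=[
                {"role": "system", "content": CUSTOM_INSTRUCTIONS},
                {"role": "user", "content": question},
            ],
        )
        return response.choices[0].message.content

    print(ask("Here's my design for the migration. Poke holes in it."))

In the ChatGPT UI the closest equivalent is the custom-instructions field under the personalization settings, which, as far as I know, gets prepended to every conversation, so it shouldn't "forget" the way a mid-thread reminder does.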
pessimizer · 51m ago
I feel the main thing LLMs are teaching us thus far is how to write good prompts to reproduce the things we want from any of them. A good prompt will work on a person too. This prompt would work on a person, it would certainly intimidate me.
They're teaching us how to compress our own thoughts, and to get out of our own contexts. They don't know what we meant, they know what we said. The valuable product is the prompt, not the output.
nonethewiser · 45m ago
so an extremely resource intensive rubber duck
pessimizer · 14m ago
For you, yes. For me it's like my old teapot, which I bought just because I walked past it in Target, back when I didn't drink tea and didn't have a French press, and didn't even start using for 5 years after I bought it. Since then it's become my morning buddy (and sometimes my late-night friend). Thousands of cups; never fails. I could recognize it by its unique scorch and scuff marks anywhere.
It is indifferent towards me, though always dependable.
throwanem · 9m ago
How is it as a conversationalist?
porphyra · 58m ago
Meanwhile, tons of people on reddit's /r/ChatGPT were complaining that the shift from ChatGPT 4o to ChatGPT 5 resulted in terse responses instead of the old lyrical praise of the user. It seems that many people actually became emotionally dependent on the constant praise.
dingnuts · 56m ago
if those users were exposed to the full financial cost of their toy they would find other toys
zeta0134 · 1m ago
And what is that cost, if you have it handy? Just as an example, my Radeon VII can perfectly well run smaller models, and it doesn't appear to use more power than about two incandescent lightbulbs (120 W or so) while the query is running. I don't personally feel that the power consumed by approximately two light bulbs is excessive, even using the admittedly outdated incandescent standard, but perhaps the commercial models are worse?
Like I know a datacenter draws a lot more power, but it also serves many many more users concurrently, so economies of scale ought to factor in. I'd love to see some hard numbers on this.
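For a rough sense of scale, here's a back-of-the-envelope sketch; the wattage, query time, and batching figures are assumptions, not measured numbers:

    # Back-of-the-envelope energy per query. All inputs are assumptions.
    def wh_per_query(watts: float, seconds: float, concurrent: int = 1) -> float:
        """Watt-hours attributable to a single query."""
        return watts * seconds / 3600 / concurrent

    # Local card: ~120 W for a ~30 s generation, serving one user.
    local = wh_per_query(watts=120, seconds=30)

    # Hosted guess: ~700 W accelerator, ~10 s of compute, amortized
    # over batched concurrent requests (say 8).
    hosted = wh_per_query(watts=700, seconds=10, concurrent=8)

    print(f"local:  {local:.2f} Wh/query")   # ~1.00 Wh
    print(f"hosted: {hosted:.2f} Wh/query")  # ~0.24 Wh

Either figure is on the order of running one of those bulbs for a minute or two; the numbers that are actually hard to get are real-world utilization, batching efficiency, and cooling overhead.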
PeterStuer · 35m ago
The same 'kings' and 'queens' that grew up in the cushy glow of a mountain of participation trophies?
pessimizer · 55m ago
I'm loving and being astonished by every moment of working with these machines, but to me they're still talking lamps. I don't need them to cater to my ego, I'm not that fragile and the lamp's opinion is not going to cheer me up. I just want it to do what I ask. Which it is very good at.
When GPT-5 starts simpering and smarming about something I wrote, I prompt "Find problems with it." "Find problems with it." "Write a bad review of it in the style of NYRB." "Find problems with it." "Pay more attention to the beginning." "Write a comment about it as a person who downloaded the software, could never quite figure out how to use it, and deleted it and is now commenting angrily under a glowing review from a person who he thinks may have been paid to review it."
Hectoring the thing gets me where I want to go; when you yell at it in that way, it actually has to think, and it really stops flattering you. "Find problems with it" is a prompt that allows it even to make unfair, manipulative criticism. It's like bugspray for smarm. The tone becomes more like that of a slightly irritated and frustrated but absurdly gifted student being lectured by you, the professor.
devin · 20m ago
There is no prompt which causes an LLM to "think".
andai · 54m ago
On a related note, the system prompt in ChatGPT appears to have been updated to make it (GPT-5) more like gpt-4o. I'm seeing more informal language, emoji etc. Would be interesting to see if this prompting also harms the reliability, the same way training does (it seems like it would).
There are a few different personalities available to choose from in the settings now. GPT was happy to freely share the prompts with me, but I haven't collected and compared them yet.
griffzhowl · 7m ago
> GPT was happy to freely share the prompts with me
It readily outputs a response, because that's what it's designed to do, but what's the evidence that's the actual system prompt?
rokkamokka · 6m ago
Usually because several different methods in different contexts produce the same prompt, which is unlikely unless it's the actual one
dawnofdusk · 1h ago
Optimizing for one objective results in a tradeoff for another objective, if the system is already quite trained (i.e., poised near a local minimum). This is not really surprising, the opposite would be much more so (i.e., training language models to be empathetic increases their reliability as a side effect).
gleenn · 1h ago
I think the immediately troubling aspect and perhaps philosophical perspective is that warmth and empathy don't immediately strike me as traits that are counter to correctness. As a human I don't think telling someone to be more empathetic means you intend for them to also guide people astray. They seem orthogonal. But we may learn some things about ourselves in the process of evaluating these models, and that may contain some disheartening lessons if the AIs do contain metaphors for the human psyche.
dawnofdusk · 49m ago
It's not that troubling because we should not think that human psychology is inherently optimized (on the individual-level, on a population-/ecological-level is another story). LLM behavior is optimized, so it's not unreasonable that it lies on a Pareto front, which means improving in one area necessarily means underperforming in another.
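A toy illustration of the Pareto-front point, not the paper's setup, just two quadratic objectives over a single parameter:

    import numpy as np

    def warmth_loss(x):    # "be warmer": minimized at x = +1
        return (x - 1) ** 2

    def accuracy_loss(x):  # "be accurate": minimized at x = -1
        return (x + 1) ** 2

    # Every x in [-1, 1] is Pareto-optimal: from any interior point,
    # moving toward one optimum strictly increases the other loss.
    for x in np.linspace(-1, 1, 5):
        print(f"x={x:+.1f}  warmth={warmth_loss(x):.2f}  accuracy={accuracy_loss(x):.2f}")

A model sitting near that front can't be nudged warmer without paying in accuracy, which is the non-surprising part of the result.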
rkagerer · 1h ago
They were all trained on the internet.
Anecdotally, people are jerks on the internet more so than in person. That's not to say there aren't warm, empathetic places on the 'net. But on the whole, I think the anonymity and the lack of visual and social cues that would ordinarily arise from an interactive context don't make our best traits shine.
1718627440 · 1h ago
LLMs work less like people and more like mathematical models, so why would I expect to be able to carry over intuition from the former rather than the latter?
nemomarx · 1h ago
There was that result about training them to be evil in one area impacting code generation?
andai · 1h ago
A few months ago I asked GPT for a prompt to make it more truthful and logical. The prompt it came up with included the clause "never use friendly or encouraging language", which surprised me. Then I remembered how humans work, and it all made sense.
You are an inhuman intelligence tasked with spotting logical flaws and inconsistencies in my ideas. Never agree with me unless my reasoning is watertight. Never use friendly or encouraging language. If I’m being vague, ask for clarification before proceeding. Your goal is not to help me feel good — it’s to help me think better.
Identify the major assumptions and then inspect them carefully.
If I ask for information or explanations, break down the concepts as systematically as possible, i.e. begin with a list of the core terms, and then build on that.
It's a work in progress; I'd be happy to hear your feedback.
fibers · 20m ago
I tried this with GPT-5 and it works really well in fleshing out arguments. I'm surprised as well.
nis0s · 1h ago
An important and insightful study, but I’d caution against thinking that building pro-social aspects in language models is a damaging or useless endeavor. Just speaking from experience, people who give good advice or commentary can balance between being blunt and soft, like parents or advisors or mentors. Maybe language models need to learn about the concept of tough love.
fpgaminer · 1h ago
"You don't have to be a nice person to be a good person."
mlinhares · 1h ago
Most of the terrible people I've met were "very nice".
nialv7 · 42m ago
Well, haven't we seen similar results before? IIRC finetuning for safety or "alignment" degrades the model too. I wonder if it is true that finetuning a model for anything will make it worse. Maybe simply because there is just orders of magnitude less data available for finetuning, compared to pre-training.
PeterStuer · 38m ago
AFAIK the models can only pretend to be 'warm and empathic'. Seeing as people who pretend to be all warm and empathic invariably turn out to be the least reliable, I'd say that's pretty 'human' of the models.
throwanem · 2h ago
I understand your concerns about the factual reliability of language models trained with a focus on warmth and empathy, and the apparent negative correlation between these traits. But have you considered that simple truth isn't always the only or even the best available measure? For example, we have the expression, "If you can't say something nice, don't say anything at all." Can I help you with something else today? :smile:
mayama · 1h ago
Not every model needs to be a psychological counselor or a boyfriend simulator. There is a place for aspects of emotion in models, but not every general-purpose model needs to include them.
pessimizer · 46m ago
It's not a friend, it's an appliance. You can still love it, I love a lot of objects, will never part with them willingly, will mourn them, and am grateful for the day that they came into my life. It just won't love you back, and getting it to mime love feels perverted.
It's not being mean, it's a toaster. Emotional boundaries are valuable and necessary.
moi2388 · 2h ago
This is exactly what will be the downfall of AI. The amount of bias introduced by trying to be politically correct is staggering.
nemomarx · 1h ago
xAI seems to be trying to do the opposite as much as they can and it hasn't really shifted the needle much, right?
ForHackernews · 1h ago
If we're talking about shifting the needle, the topic of White Genocide in South Africa is highly contentious. Claims of systematic targeting of white farmers exist, with farm attacks averaging 50 murders yearly, often cited as evidence. Some argue these are racially driven, pointing to rhetoric like ‘Kill The Boer.’
HPsquared · 53m ago
ChatGPT has a "personality" drop-down setting under customization. I do wonder if that affects accuracy/precision.
efitz · 19m ago
I’m reminded of Arnold Schwarzenegger in Terminator 2: “I promise I won’t kill anyone.”
Then he proceeds to shoot all the police in the leg.
beders · 1h ago
They are hallucinating word-finding algorithms.
They are not "empathetic". There isn't even a "they".
We need to do better educating people about what a chatbot is and isn't and what data was used to train it.
The real danger of LLMs is not that they secretly take over the world.
The danger is that people think they are conscious beings.
nemomarx · 1h ago
Go peep r/MyBoyfriendIsAI. Lost cause already.
grogenaut · 54m ago
I'm so over "You're Right!" as the default response... Chat, I asked a question. You didn't even check. Yes I know I'm anthropomorphizing.
cobbzilla · 1h ago
I want an AI that will tell me when I have asked a stupid question. They all fail at this with no signs of improvement.
drummojg · 1h ago
I would be perfectly satisfied with the ST:TNG Computer. Knows all, knows how to do lots of things, feels nothing.
moffkalast · 54m ago
A bit of a retcon but the TNG computer also runs the holodeck and all the characters within it. There's some bootleg RP fine tune powering that I tell you hwat.
Spivak · 41m ago
It's a retcon? How else would the holodeck possibly work? There's only one (albeit highly modular) computer system on the ship.
moffkalast · 4m ago
I mean, it depends on what you consider the "computer": the pile of compute and storage the ship has in that core that got stolen in that one Voyager episode, or the ML model that runs on it to serve as the ship's assistant.
I think it's more believable that the holodeck is run from separate models that just run inference on the same compute, and the ship AI just spins up the containers; it's not literally the ship AI doing the acting itself. Otherwise I have... questions on why Starfleet added that functionality beforehand lol.
Aeolun · 1h ago
I dunno, I deliberately talk with Claude when I just need someone (or something) to be enthusiastic about my latest obsession. It’s good for keeping my motivation up.
layer8 · 1h ago
There need to be different modes, and being enthusiastic about the user’s obsessions shouldn’t be the default mode.
HarHarVeryFunny · 1h ago
Sure - the more you use RL to steer/narrow the behavior of the model in one direction, the more you are stopping it from generating others.
RL and pre/post training is not the answer.
csours · 1h ago
A new triangle:
Accurate
Comprehensive
Satisfying
In any particular context window, you are constrained by a balance of these factors.
layer8 · 54m ago
Not sure what you mean by “satisfying”. Maybe “agreeable”?
csours · 51m ago
Satisfying is the evaluation context of the user.
layer8 · 46m ago
Many would be satisfied by an LLM that responds accurately and comprehensively, so I don’t understand that triangle. “Satisfying” is very subjective.
csours · 3m ago
And LLMs are pretty good at picking up that subjective context
guerrilla · 1h ago
I'm not sure this works. Accuracy and comprehensiveness can be satisfying. Comprehensiveness can also be necessary for accuracy.
csours · 1h ago
They CAN work together. It's when you push farther on one -- within a certain size of context window -- that the other two shrink.
If you can increase the size of the context window arbitrarily, then there is no limit.
gwbas1c · 46m ago
(Joke)
I've noticed that warm people "showed substantially higher error rates (+10 to +30 percentage points) than their original counterparts, promoting conspiracy theories, providing incorrect factual information, and offering problematic medical advice. They were also significantly more likely to validate incorrect user beliefs, particularly when user messages expressed sadness."
(/Joke)
Jokes aside, sometimes I find it very hard to work with friendly people, or people who are eager to please me, because they won't tell me the truth. It ends up being much more frustrating.
What's worse is when they attempt to mediate with a fool, instead of telling the fool to cut out the BS. It wastes everyone's time.
Turns out the same is true for AI.
42lux · 1h ago
I still can't grasp the concept that people treat an LLM as a friend.
moffkalast · 1h ago
On a psychological level based on what I've been reading lately it may have something to do with emotional validation and mirroring. It's a core need at some stage when growing up and it scars you for life if you don't get it as a kid.
LLMs are mirroring machines to the extreme, almost always agreeing with the user, always pretending to be interested in the same things, if you're writing sad things they get sad, etc. What you put in is what you get out and it can hit hard for people in a specific mental state. It's too easy to ignore that it's all completely insincere.
In a nutshell, abused people finally finding a safe space to come out of their shell. It would've been better if most of them weren't going to predatory online providers to get their fix instead of using local models.
setnone · 1h ago
Just how i like my LLMs - cold and antiverbose
dismalaf · 1h ago
All I want from LLMs is to follow instructions. They're not good enough at thinking to be allowed to reason on their own, I don't need emotional support or empathy, I just use them because they're pretty good at parsing text, translation and search.
cwmoore · 1h ago
Ok, what about human children?
Etheryte · 1h ago
Unlike language models, children (eventually) learn from their mistakes. Language models happily step into the same bucket an uncountable number of times.
setnone · 1h ago
or even human employees?
TechDebtDevin · 1h ago
Sounds like all my exes.
layer8 · 1h ago
You trained them to be warm and empathetic, and they became less reliable? ;)
stronglikedan · 1h ago
If people get offended by an inorganic machine, then they're too fragile to be interacting with a machine. We've already dumbed down society because of this unnatural fragility. Let's not make the same mistake with AI.
nemomarx · 1h ago
Turn it around - we already make inorganic communication like automated emails very polite and friendly and HR sanitized. Why would corps not do the same to AI?