'It cannot provide nuance': UK experts warn AI therapy chatbots are not safe

94 distalx 102 5/10/2025, 3:35:24 PM theguardian.com ↗

Comments (102)

sheepscreek · 47m ago
That’s fair but there’s another nuance that they can’t solve for. Cost and availability.

AI is not a substitute for traditional therapy, but it offers an 80% benefit at a fraction of the cost. It could be used to supplement therapy, for the periods between sessions.

The biggest risk is with privacy. Meta could not be trusted knowing what you’re going to wear or eat. Now imagine them knowing your deepest darkest secrets. The advertising business model does not gel well with providing mental health support. Subscription (with privacy guarantees) is the way to go.

sarchertech · 27m ago
Does it offer 80% of the benefit? An AI could match what a human therapist would say 80% (or 99%) of the time and still provide negative benefit.

Therapy seems like the last place an LLM would be beneficial because it’s very hard to keep an LLM from telling you what you want to hear. I can’t see any way you could guarantee that a chatbot won’t cause severe damage to a vulnerable patient by supporting their neurosis.

We’re not anywhere close to an LLM which is trained to be supportive and understanding in tone but will never affirm your irrational fears, insecurities, and delusions.

pitched · 17m ago
Sometimes, the process of gathering our thoughts enough to articulate them into a prompt is where the benefit is. AI as the rubber duck has a lot of value. Understanding that this is what’s needed vs. something deeper is beyond the scope of what AI can handle.
sarchertech · 11m ago
And that’s fine as long as the person using it has a sophisticated understanding of the technology and a company isn’t selling it as a “therapist”.

When an AI therapist from a health startup confirms that a mentally disturbed person is indeed hearing voices from God, or an insecure teenager uses Meta AI as a therapist because Mark Zuckerberg said they should and it agrees with them that yes, they are unlovable, then we have a problem.

rsynnott · 27m ago
> AI is not a substitute for traditional therapy, but it offers an 80% benefit at a fraction of the cost.

That... seems optimistic. See, for instance, https://www.rollingstone.com/culture/culture-features/ai-spi...

No psychologist will attempt to convince you that you are the messiah. In at least some cases, our robot overlords are doing _serious active harm_ which the subject would be unlikely to suffer in their absence. LLM therapists are rather likely to be worse than nothing, particularly given their tendency to be overly agreeable.

lurk2 · 5h ago
I tried Replika years ago after reading a Guardian article about it. The story passed it off as an AI model that had been adapted from one a woman had programmed to remember her deceased friend using text messages he had sent her. It ended up being a gamified version of Smarter Child with a slightly longer memory span (4 messages instead of 2) that constantly harangued the user to divulge preferences that were then no-doubt used for marketing purposes. I thought I must be doing something wrong, because people on the replika subreddit were constantly talking about how their replika agent was developing its own personality (I saw no evidence at any point that it had the capacity to do this).

Almost all of these people were openly in (romantic) love with these agents. This was in 2017 or thereabouts, so only a few years after Spike Jonze’s Her came out.

From what I understand the app is now primarily pornographic (a trajectory that a naiver, younger me never saw coming).

I mostly use Copilot for writing Python scripts, but I have had conversations with it. If the model were running locally on your own machine, I can see how it would be effective for people experiencing some sort of emotional crisis. Anyone using a Meta AI for therapy is going to learn the same hard lesson that the people who trusted 23andMe are currently learning.

mrbombastic · 4h ago
“I thought I must be doing something wrong, because people on the replika subreddit were constantly talking about how their replika agent was developing its own personality (I saw no evidence at any point that it had the capacity to do this).”

People really like to anthropomorphize any object with even the most basic communication capabilities, and most people have no concept of the distance between parroting phrases and a full-on human consciousness. In the 90s, Furbys were a popular toy that started off speaking Furbish and then eventually spoke some (maybe 20?) human phrases. Many people were absolutely convinced you could teach them to talk and learn like a human and that they had essentially bought a very intelligent pet. The NSA even banned them for a time because they thought they were recording and learning from surroundings, despite that being completely untrue. Point being, this is going to get much worse now that LLMs have gotten a whole lot better at mimicking human conversations and there is incentive for companies to overstate capabilities.

trod1234 · 4h ago
This actually isn't that surprising.

There are psychological blindspots that we all have as human beings, and when stimulus is structured in specific ways people lose their grip on reality, or rather more accurately, people have their grip on objective reality ripped away from them without them realizing it because these things operate on us subliminally (to a lesser or greater degree depending on the individual), and it mostly happens pre-perception with the victim none the wiser. They then effectively become slaves to the loudest monster, which is the AI speaking in their ear more than anyone else, and by extension to the slave master who programmed the AI.

One such blindspot is the consistency blindspot, where someone may induce you to say something indicating agreement with something similar first, and then ask the question they really want to ask. Once you say something that's in agreement, and by extension something similar is asked, there is bleedover and you fight your own psychology later if you didn't have defenses to short circuit this fixed action pattern (i.e. didn't already know about it), and that's just a surface level blindspot that car salesmen use all the time; there are much more subtle ones like distorted reflected appraisal which are used by cults, and nation states for thought reform.

To remain internally consistent, with distorted reflected appraisal, your psychology warps itself, and you as a person unravel. These things have been used in torture, but almost no one today is taught what the elements of torture are so they can recognize it, or know how it works. You would be surprised to find that these things are everywhere today, even in K12 education and that's not an accident.

Everyone has reflected appraisal because this is how we adopt the cultural identity we have as people from our parents while we are children.

All that's needed for torture to break someone down are the elements, structuring, and clustering.

Those elements are isolation, cognitive dissonance, coercion with perceived or real loss, and lack of agency to remove them. With these you break in a series of steps: rational thought receding, involuntary hypnosis, and then psychological break (dissociation or a special semi-lucid psychosis capable of planning), with time and exposure.

Structuring uses diabolical structures to turn the psyche back on itself in a trauma loop, and clustering includes any multiples of these elements or structures within a short time period, as well as events that increase susceptibility such as narco-analysis/synthesis based in dopamine spikes triggered by associative priming (operant conditioning). Drug use makes one more susceptible, as they found in the early 30s with barbiturates, and it's since been improved so you can induce this in almost anyone with a phone.

No AI will ever be able to create and maintain a consistent reflected appraisal for the people they are interacting with, but because the harmful effects aren't seen immediately, people today have blinded themselves and discount the harms that naturally result. The harms from the unnatural loss of objective reality.

lurk2 · 4h ago
Very interesting. Could you recommend any further reading?
trod1234 · 2h ago
Robert Cialdini's book on Influence is probably the lightest read and covers most of the different blindspots we have, except distorted reflected appraisal. He provides the principles but leaves most of the structure up to the person's imagination.

The coursework in an introduction to communication class may provide some foundational details (depending on the instructor), Sapir-Whorf has basis in blindspots.

Robert Lifton touches on the detailed case studies of torture from the 1950s (under Mao), in his book "Thought Reform and the Psychology of Totalism", and I've heard in later books he creates a framework that classifies cultures as Protean (self-direction, growth, self-determination/agency), or Totalism (towards control which eventually fails Darwin's fitness).

I haven't actually read his later books yet though his earlier books were quite detailed. I believe the internet archive has a copy of this available for reading as a pdf but be warned this is quite dark.

Joost Meerloo, in his "Rape of the Mind", gives an overview that touches on how Totalitarianism grows in the setting of WW2 and some Mao, though it takes a Freudian look at things (dating certain aspects which we know to be untrue now).

From there it branches out depending on your interest. The modern material itself while based on these earlier works often has the origins obscured following a separation of objectionable concerns.

There are congressional reports on COINTELPRO, and you may notice it has modern iterations (touching on protest/activist activity harassment), as well as the history of the East German Stasi and Zersetzung, where governments use this to repress the population.

There are aspects in the Octalysis Framework (gamification/game design).

Paulo Freire used some of this material in developing his critical pedagogy which was used in the 70s to replace teaching method from a reduction of first principles (based in rome and the greeks) to what's commonly known as rote-based teaching, and later called "Lying to Children", which takes the reversal of that approach following more closely to gnosticism.

The approach is basically you give a flawed useless model which includes both true and false things. Students learn to competence, then are given a new model that's less flawed, where you have to learn and unlearn things already learned. You never actually unlearn anything and it induces frustration and torture destroying minds in the process. Each step towards gnosis becomes more useful but only the most compliant and blind make it to the end with few exceptions. Structures that burn bridges induce failure in math, and the effect is this acts as a filter to gatekeep the technical fields.

The water pipe analogy of voltage in electronics is an example of the latter, instead of the first-principles approach using diffusion, which is more correct.

Disney and Dreamworks use distorted reflected appraisal tailored towards destructive interference of identity, which some employees have blown the whistle on (for the latter), aimed at children and sneaking things past their adult guardians. There's quite a lot if you look around, but it's not under any single name; it's scattered. Hopefully that helps.

The Dreamworks whistleblower interview can be found here: https://www.youtube.com/watch?v=vvNZRUtqqa8

All indexed references of it seem to now have been removed from search. I'm glad now that I kept a reference link in a text file.

Update: Dreamworks isn't Pixar, I misremembered; they are owned by Universal Studios, whereas Disney owns Pixar. Pixar and Disney appear to do the same things.

lurk2 · 2h ago
This is all very interesting. The pedagogy you mentioned tracks with how I can remember a lot of my schooling, but it’s also how I would teach. The pedagogical term is “scaffolding,” I think; you assess the student’s current understanding and then use (necessarily imperfect) metaphors to cement the knowledge. It sounds like you’re pointing to something more nefarious (“Do this because I said so.” - authoritarian parenting rather than authoritative, diplomatic, or permissive parenting styles).

I’m not sure I understand how this relates to gnosticism, however. Are you comparing the “Lying to Children” model to gnostic initiation, and asserting that this model selects for the compliant? What is your proposed alternative here?

Particularly,

> Structures that burn bridges induce failure in math, and the effect is this acts as a filter to gatekeep the technical fields.

Sounds compelling, but it strikes me more as a limitation of demand for good math teachers outstripping their supply. I’ve seen this in English language learning a lot; even if the money was there (and it’s not), there are simply far more people with a desire to learn English than there are people qualified to teach it.

trod1234 · 1h ago
You are right, scaffolding seems like a better descriptor.

> It sounds like you're pointing to something more nefarious.

Well, the structure itself is quite nefarious in a way. You have to constantly fight against it to progress and don't really have a choice at the beginning, which often leads to learned helplessness and PTSD in the dropouts. As a teacher you also have to constantly fight against this because any shortfall of effort on your part leaves your students behind in one of those pitfalls, and it's largely dependent on the student's ability to overcome the torture. You generally aren't given sufficient resources to do this because there's no way out; only through. This is why the structure is nefarious and at the root of the problem.

The unlearning process after learning to competence is imperfect and induces what amounts to self-torture sessions. The imposition of psychological stress (torture) actually lowers the ability for rational thought, and may permanently warp people at vulnerable stages of their lives. Children tend to have a period where they try on various personas after which their identity crystallizes which they carry forward. Adopting learned helplessness at this point makes them a resource drain on everyone. You see these effects in the youth today where they can't even read in many cases.

The sequences in math, for example, rely on an undisclosed change in grading criteria resulting from this path, a gimmick if you will. There is the sequence, Algebra->Geometry->Trigonometry. Algebra is graded based on correct process, whereas Trig is graded based on correct process and correct answer. When the process differs between classes because the process taught was a flawed version, and you pass Geometry, you can't go back. It's outside the scope of the Trig teacher to reteach two classes prior, and they'll just say: "If you are having trouble with this material you should choose a career that doesn't require this", and leave it up to them. This was actually pushed for adoption by the NEA in the 90s, where they were going to strike if the administration didn't cave.

There are similar structures used in weed-out classes in college as well. Physics used to use a non-standard significant figure calculation when the questions were related by a property of causality (1st answer is used for the 2nd, and the 2nd for the 3rd; 2 tests; you can only get 1 question wrong on one test to pass, and it must be either of the last two on either test). Using a correct method to reduce propagation of error would cause you to fail, and the right answer was passed around to only the professor's favorites, hence very similar to gnosticism, where only the experts determine who may receive the secret knowledge.

An excellent teacher that constantly bucks the norm will naturally sidestep many of the pitfalls, but an average teacher who is overburdened from lack of resources, and ground down who has sunk to the lowest common denominator of work production won't provide a bridge over the pitfall and these things happen through simple lack of action as a consequence of the adopted structure.

When people speak of nefarious and maliciousness there's often an assumed intent, and in a way negligence can be intent but while some could argue these type of plans conform to this based on things our nation's enemies have said, its probably equally if not more a result of degradation and corruption from within as a result of the flaws inherent in centralized systems.

The history about how this came about is particularly muddied. To give some context, Sputnik in 1957 shocked the US, and they wrote a blank check for Academia towards more engineers and math alumni. It was a problem you can't fix using money, though, and when that was noticed the hiring standards, which were quite high in the 1960s, were lowered. Whether the lower standards caused this, or subversives snuck in as an attack on the next generation, no one will know. The effect, though, is that by 1978 there is a marked difference in the academic material published prior and after, with lower quality resources being available after which conform to the mentioned flawed pedagogy.

The proposed alternative is to go back to the classical pedagogical approach. Use real systems, teach the process of reducing those systems to first principles (in guided fashion), creating models, and then predicting the future behavior of those systems, identifying the limitations. Some professors still do this, but they are in such a minority that you may only see one or two in a local geography (driving range/county) across all areas of study.

> Sounds compelling but it strikes me more as a limitation of demand for good math teachers.

I've known quite a lot of extremely intelligent people who have been hobbled because they couldn't get through the education; the few that have are often unable to apply the knowledge outside a very limited scope. It's a bit of a chicken-and-egg problem: you need the chicken first.

The hiring standards were never raised back up and remain low, and the materials used to teach those have degraded, there is also no incentive towards improvement of teachers. Basic performance metrics are eschewed from collection. You see this particularly in colleges where they may collect pass rates but won't differentiate a person who has taken the class in the past from a new student.

There are also other incentives which are covered quite plainly in the documentary "Waiting for Superman" in the Lemon walk. If you don't fire your lowest performers, and they are effectively guaranteed wages without the appropriate level of work, they end up driving the higher performers out through social coercion, harassment, and corruption. The higher performers make the lower performers look bad.

Xcelerate · 33m ago
I have two lines of thought on this:

1) Chatbots are never going to be perceived as safe or effective as humans by default, primarily due to human fiat. Professionals like counselors (and lawyers, doctors, software engineers, etc.) will always claim that an LLM cannot do their job, namely because acknowledging such threatens their livelihood. Determining whether LLMs genuinely provide therapeutic value to humans would require rigorous, carefully controlled experiments conducted over many years.

2) Chatbots definitely cannot replace human therapists in their current state. That much seems quite obvious to me for various reasons already argued well by others on here. But I had to highlight point #1 as devil's advocate, because adopting the mindset that "humans are inherently better by default" due to some magical or scientifically unjustifiable reason will prevent forward progress. The goal is to eliminate the (quite reasonable) fear people have of eventually losing their job to AI by enacting societal change now rather than denying into perpetuity that chatbots are necessarily inferior, at which point everyone will in fact lose their jobs because we had no plan in place.

hy555 · 6h ago
Throwaway account. My ex partner was involved in a study which said these things were not ok. They were paid not to publish by an undisclosed party. That's how bad it has got.

Edit: the study compared therapist outcomes to AI outcomes to placebo outcomes. Therapists in this field performed slightly better than placebo, which is pretty terrible. The AI outcomes performed much worse than placebo which is very terrible.

neilv · 5h ago
Sounds like suppressing research, at the cost of public health/safety.

Some people knew what the tobacco companies were secretly doing, yet they kept quiet, and let countless family tragedies happen.

What are best channels for people with info to help halt the corruption, this time?

(The channels might be different than usual right now, with much of US federal being disrupted.)

hy555 · 5h ago
Start digging into psychotherapy research and tearing their papers apart. Then the SPR. Whole thing is corrupt to the core. A lot of papers drive public health policy outside the field as it's so vague and easy to cite but the research is only fit for retraction watch.
neilv · 5h ago
Being paid to suppress research on health/safety is potentially a different problem than, say, a high rate of irreproducible results.

And if the alleged payer is outside the field, this might also be relevant to the public interest in other regards. (For example, if they're trying to suppress this, what else are they trying to do. Even if it turns out the research is invalid.)

hy555 · 4h ago
Both are a problem. I should not conflate the two.

I agree. Asking questions which are normal in my own field resulted in stonewalling and obvious distress. The worst thing being this leading to the end of what was a good relationship.

neilv · 4h ago
If the allegation is true, hopefully your friend speaks up.

If not, you might consider whether you have actionable information yourself, any professional obligations you have (e.g., if you work in science/health/safety yourself), any societal obligations, whether reporting the allegation would be betraying a trust, and what the calculus is there.

cjbgkagh · 4h ago
I figured it would be related in that it's a form of p-hacking. Do 20 studies, one gives you the 'statistically significant' results you want, suppress the other 19. Then 100% of published studies support what you want. Could be combined with p-hacking within the studies to compound the effect.
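
To put rough numbers on the selective-publication scenario described above, here is a minimal sketch. It assumes 20 independent studies of a treatment with no real effect, each tested at the conventional 0.05 significance threshold; both figures are illustrative assumptions, not details of the study discussed in this thread.

```python
# Illustrative assumptions only: 20 independent studies, no true effect,
# each tested at a significance threshold (alpha) of 0.05.
ALPHA = 0.05
N_STUDIES = 20


def prob_at_least_one_false_positive(n_studies: int, alpha: float) -> float:
    """Chance that at least one of n independent null studies looks 'significant'."""
    return 1.0 - (1.0 - alpha) ** n_studies


def expected_false_positives(n_studies: int, alpha: float) -> float:
    """Average number of null studies that cross the threshold by chance."""
    return n_studies * alpha


p_any = prob_at_least_one_false_positive(N_STUDIES, ALPHA)
expected = expected_false_positives(N_STUDIES, ALPHA)

print(f"P(at least one false positive) = {p_any:.3f}")
print(f"Expected false positives       = {expected:.1f}")
```

Under those assumptions there is roughly a 64% chance of getting at least one publishable "hit" from pure noise, which is why suppressing the remaining studies can make the published record look unanimous.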
rsynnott · 24m ago
I'm quite curious how the placebo in a study like this works.
ilaksh · 2h ago
Which model exactly? What type of therapy/prompt? Was it a completely dated model, like in the article where they talk about a model from two years ago? We have had massive progress in two years.
raverbashing · 2h ago
Honestly none of the companies are tuning their model to be better at therapy.

Also it is not expected that the training material for the model deals with the actual practical aspects of therapy, only some of the theoretical aspects are probably in that material

jdietrich · 43m ago
>none of the companies are tuning their model to be better at therapy

BrickLabs have developed an expert-fine-tuned model specifically to provide psychotherapy. Their model has shown modestly positive results in a reasonably large preregistered RCT.

https://trytherabot.com/

https://ai.nejm.org/doi/full/10.1056/AIoa2400802

ilaksh · 1h ago
The leading edge models are trainable via instructions. That's why agents are possible. Many online therapy or therapy companies are training or instructing their agents in this domain.
sorenjan · 5h ago
What did they use for placebo? Talking to somebody without education, or not talking to anybody at all?
hy555 · 5h ago
Not talking to anyone at all.
zargon · 5h ago
What did they do then? If they didn't do anything, how can it be considered a placebo?
phren0logy · 4h ago
It's called a "waitlist" control group, and it's not intended to represent placebo. Or at least, it shouldn't be billed that way. It's not an ideal study design, but it's common enough that you could use it to compare one therapy to another based on their results vs a waitlist control. Placebo control for psychotherapy is tricky and more expensive, and can be hard to get the funding to do it properly.

risyachka · 5h ago
Does it matter? The point is AI made it worse.
trod1234 · 4h ago
That seems like a very poor control group.
hy555 · 4h ago
That is one of my concerns.
cube00 · 5h ago
The amount of free money sloshing around the AI space is ridiculous at the moment.
scotty79 · 3h ago
I've heard of some more modern research with LLMs that had a result that an AI therapist was straight up better than human therapists across all measures.
jdietrich · 31m ago
In the UK (and many other jurisdictions outside the US), psychotherapy is completely unregulated. Literally anyone can advertise their services as a psychotherapist or counsellor, regardless of qualifications, experience or their suitability to work with potentially vulnerable people.

Compared to that status quo, I'm not sure that LLMs are meaningfully more risky - unlike a human, at least it can't physically assault you.

https://www.bacp.co.uk/news/news-from-bacp/2020/6-march-gove...

https://www.theguardian.com/society/2024/oct/19/psychotherap...

ilaksh · 2h ago
"Prof Dame Til Wykes, the head of mental health and psychological sciences at King’s College London, cites the example of an eating disorder chatbot that was pulled in 2023 after giving dangerous advice"

2023 is ancient history in the LLM space. That person is totally out of touch with it.

Also, like most things, especially when they are starting out, the actual details of the implementation matter. For example, for the first few years that SSDs came out, there were a lot of models that were completely unreliable. I had someone tell me they would never trust enterprise data to run on an SSD. At the time, there were a few more expensive models like one of the Intel Extreme something that were robust, but most were not. However, since I had been using that reliable model, he was wrong to insist on going back to a mechanical hard drive. Things change fast, and details matter.

Leading LLMs in 2025 can absolutely do certain core aspects of cognitive behavioral therapy very effectively given the right prompts and framework and things like journaling tools for the user. CBT is actually very practical and logical.

If you take a random inexpensive chatbot with a medium to low parameter count and middling intelligence and a weak prompt that was not written by a subject matter expert, then even with the advances in 2025, you will not get good advice. But if you implement it effectively with a very strong model etc., it will be able to do it.

simplyinfinity · 2h ago
Even today, leading LLMs such as Claude 3.7 and ChatGPT 4 take your questions as "you've made a mistake, fix it" instead of answering the question. People consider a much broader context of the situation, your body language, facial expressions, and can come up with unusual solutions to specific situations and can explore vastly more things than an LLM.

And the thing when it comes to therapy is, a real therapist doesn't have to be prompted and can auto adjust to you without your explicit say so. They're not overly affirming, can stop you from doing things and say no to you. LLMs are the opposite of that.

Also, as a lay person how do i know the right prompts for <llm of the week> to work correctly?

Don't get me wrong, i would love for AI to be on par or better than a real life therapist, but we're not there yet, and i would advise everyone against using AI for therapy.

sho_hn · 1h ago
Even if the tech was there, for appropriate medical use those models would also have to be strenuously tested and certified, so that a known-good version is in use. Cf. the recent "personality" changes in a ChatGPT upgrade. Right now, none of these tools is regulated sufficiently to set safe standards there.
ilaksh · 1h ago
I am not talking about a layperson building their own therapist agent from scratch. I'm talking about an expert AI engineer and therapist working together and taking their time to create them. Claude 3.7 will not act in a default way given appropriate instructions. Claude 3.7 can absolutely come up with unusual solutions. Claude 3.7 can absolutely tell you "no".
creata · 1h ago
Have you seen this scenario ("an expert AI engineer and therapist working together" to create a good therapy bot) actually happen, or are you just confident that it's doable?
ilaksh · 1h ago
I've built a therapy agent running my own agent framework with Claude 3.7 based on research into CBT (research aided by my agent). I have verified that the core definition and operation of therapy sessions matches descriptions of CBT that I have been able to find online.

I am very experienced with creating prompts and agents, and good at research, and I believe that my agent along with the journaling tool would be more effective than many "average" human therapists.

It seems effective in dealing with my own issues.

Obviously I am biased.

simplyinfinity · 24m ago
You're verifying your own claims. That's not good enough.

> research aided by my agent

Also not good enough.

As an example: Yesterday i asked Claude and ChatGPT to design circuitry that monitors pulses from an S0 power meter interface. It designed a circuit that didn't have any external power to the circuit. When asked, it said "ah yes, let me add that" and proceeded to confuse itself and add stuff that is not needed, but is explained and sounds reasonable if you don't know anything. After numerous attempts it didn't produce any working design.

So how can you verify that the therapist agent you've built will work with something as complex as humans, when it can't even do basic circuitry with known laws of physics and spec & data sheets of no more than 10 components?

sho_hn · 1h ago
I assume you realize you're not the first person to self-medicate while conveniently professing to be an expert on medicine.
sho_hn · 1h ago
> Leading LLMs in 2025 can absolutely do certain core aspects of cognitive behavioral therapy very effectively given the right prompts and framework and things like journaling tools for the user.

What makes you qualified to assert this?

(Now, I dislike arguments from authority, but as an engineer in the area of life/safety-critical systems I've also learned the importance of humility.)

ilaksh · 1h ago
If they are an average person who wants to talk something out and get practical advice about issues, it is generally not safety critical, and LLMs can help them.

If they are mentally ill, LLMs cannot help them.

andy99 · 1h ago
The failure modes from 2023 are identical to those today. I agree with the now deleted post that there has been essentially no progress. Benchmark scores (if you think they are a relevant proxy for anything) obviously have increased, but (for example) from 50% to 90% (probably less drastically), not the 99% to 99.999% you'd need for real assurance a widely used system won't make mistakes.

Like in 2023, everything is still a demo, there's nothing that could be considered reliable.

thih9 · 1h ago
> Leading LLMs in 2025 can absolutely do certain core aspects of cognitive behavioral therapy very effectively given the right prompts and framework and things like journaling tools for the user.

But when the situation gets more complex or simply a bit unexpected, would that model reliably recognize it lacks knowledge and escalate to a specialist? Or would it still hallucinate instead?

ilaksh · 1h ago
SOTA models can actually handle complexity. Most of the discussions I have had with my therapy agent do have a lot of layers. What they can't handle is someone who is mentally ill and may need medication or direct supervision. But they can absolutely recognize mental illness if it is evident in the text entered by the user and insist the user find a medical professional or help them search for one.
timewizard · 2h ago
> 2023 is ancient history in the LLM space.

Okay, what specifically has improved in that time, which would allay the doctors' specific concerns?

> do certain core aspects

And not others? Is there a delineated list of such failings in the current set of products?

> given the right prompts and framework

A flamethrower is perfectly safe given the right training and support. In the wrong hands it's likely to be a complete and total disaster in record short time.

> a weak prompt that was not written by a subject matter expert

So how do end users ever get to use a tool like this?

ilaksh · 1h ago
The biggest thing that has improved is the intelligence of the models. The leading models are much more intelligent and robust. Still brittle in some ways, but totally capable of giving CBT advice.

The same way end users ever get to use a tool. Open source or an online service, for example.

kbelder · 6h ago
I think a lot of human therapists are unsafe.

We may just need to start comparing success rates and liability concerns. It's kind of like deciding when unassisted driving is 'good enough'.

th0ma5 · 2h ago
That reasoning doesn't exactly follow for LLMs ... In automation studies, things are most dangerous just before full automation, due to bias. Why tap the brakes when surely the car will do it on its own, even though that isn't a guarantee?
timewizard · 2h ago
The therapist controls the extent of the relationship which determines profits. A disinterested third party should be involved.
pavel_lishin · 3h ago
A recent Garbage Day newsletter spoke about this as well, worth reading: https://www.garbageday.email/p/this-is-what-chatgpt-is-actua...
drdunce · 5h ago
As with many things in relation to technology, perhaps we simply need informed user choice and responsible deployment. We could start by not using "Artificial Intelligence" - that makes it sound like some infallible omniscient being with endless compassion and wisdom that can always be trusted. It's not intelligent, it's a large language model, a convoluted next word prediction machine. It's a fun trick, but shouldn't be trusted with Python code, let alone life advice. Armed with that simple bit of information, the user is free to choose how they use it for help, whether it be medical, legal, work etc.
trial3 · 5h ago
> simply need informed user choice and responsible deployment

the problem is that "responsible deployment" feels extremely at odds with, say, needing to justify a $300B valuation

EA-3167 · 5h ago
What we need is the same thing we've needed for a long time now, ethical standards applied across the whole industry in the same way that many other professions are regulated. If civil engineers acted the way that software engineers routinely do, they'd never work again, and rightly so.
HPsquared · 6h ago
Sometimes an "unsafe" option is better than the alternative of nothing at all.
tredre3 · 5h ago
Sometimes an "unsafe" option is not better than the alternative of nothing at all.
Y_Y · 5h ago
Sounds like we need more information than safe/not safe to make a sensible decision!

This is something that bugs me about medical ethics, that it's more important not to cause any harm than it is to prevent any.

jrapdx3 · 2h ago
Actually, concern about doing harm is central to current concepts of medical ethics. The idea may be ancient but still highly relevant. Ethics declare a primary obligation of healers is "above all do no harm".

That of course doesn't exclude doing good, being helpful, using skills and technologies to produce favorable outcomes. It does mean that healers must exercise due vigilance for unintended adverse consequences of therapies, let alone knowingly providing services that cause harm.

The problem with "safe/not safe" designation is simply that these states are more often than not indistinct. Or put another way, it depends on subtle contextual attributes that are hard to discern. Furthermore individual differences can make it difficult to predict safety of applying a procedure.

As a result healers should be cautious in approaching problems. Definitely prevention is better than cure, it's simply that relatively little is known about preventing burdensome conditions. Exercising what is known is a high priority.

bildung · 5h ago
If you look at the horrible things that happened in medical history, e.g. https://en.wikipedia.org/wiki/Tuskegee_Syphilis_Study it's pretty clear why the ethics care more about not causing harm...

bigmattystyles · 6h ago
The problem is they are cheap and immediately available.
distalx · 6h ago
It just feels a bit uncertain trusting our feelings to AI we don't truly understand.
jobigoud · 5h ago
You don't truly understand the human therapist either.
codr7 · 2h ago
You do however have a hell of a lot more in common with them than with a profit driven algorithm that even its creators have no clue how it really works.
AaronAPU · 1h ago
The thing about all these arguments is they all apply to humans. We are all an opaque mess of conflicts of interests, inconsistencies and bias.

Not sure if people aren’t thinking that through or if they’re vastly overestimating the trustworthiness and transparency of your average professional human.

squigz · 2h ago
> even its creators have no clue how it really works.

What does this mean?

codr7 · 2h ago
Not having that discussion, go argue with someone else.
52-6F-62 · 3h ago
They aren’t truly cheap
codr7 · 2h ago
Not even close, it's the most expensive waste of resources I can think of atm.

We used to worry about Bitcoin, now Google is funding nuclear plants.

deadbabe · 2h ago
I used ChatGPT for therapy and it seems fine, I feel like it helped, and I have plenty of things fucked up about myself. Can’t be much worse than other forms of “therapy” that people chase.
j45 · 3h ago
Where the experts are the ones whose incomes would be threatened, there is likely some merit in what they're saying, but also some digital literacy skills.

I don't know that AI "advisory" chatbots can replace humans.

Could they help an individual organize their thoughts for more productive time with professionals? Probably.

Could such tech help individuals learn about different terminology, their usage and how to think about it? Probably.

Could there be... a net result of spending fewer hours (and less cost, if that's the case) for the same progress? And be able to make it further with advice into improvement?

Maybe the baseline of advisory expertise in any field exists more around the beginner stage than not.

codr7 · 2h ago
You see the same thing with coding. People with actual experience and enough of a perspective to see the problems are ignored because obviously they're just afraid to lose their jobs. Which is not true, it's not even on my list of things that I should be aware of.

Experience matters, that's something we seem to be forgetting fast.

rdm_blackhole · 4h ago
I think the core of the problem here is that the people who turn to chat bots for therapy sometimes have no choice as getting access to a human therapist is simply not possible without spending a lot of money or waiting 6 months before a spot becomes available.

Which begs the question, why do so many people currently need therapy? Is it social media? Economic despair? Or a combination of factors?

HaZeust · 4h ago
I always liked the theory that we're living in an age where all of our needs can be reasonably met, and we now have enough time to think - in general. We're not working 12 hour days on a field, we're not stalking prey for 5 miles, we have adequate time in our day-to-day to think about things - and ponder - and reflect; and the ability to do so leads to thoughts and epiphanies in people that therapy helps with. We also have more information at our disposal than ever, and can see new perspectives and ideas to combat and cope with - that one previously didn't need to consider or encounter.

We've also stigmatized a lot of the things that folks previously used to cope (tobacco, alcohol), and have loosened our stigma on mental health and the management thereof.

mrweasel · 4h ago
> we have adequate time in our day-to-day to think about things - and ponder - and reflect;

I'd disagree. If you worked in the fields, you had plenty of time to think. We fill out every waking hour of our day, leaving no time to ponder or reflect. Many can't even find time to work out, and if they do they listen to a podcast during their workout. That's why so many ideas come to us in the shower, it's the only place left where we don't fill our minds with impressions.

52-6F-62 · 3h ago
Indeed. I had way more time to think working a factory line than I have had in any other white collar role.
squigz · 2h ago
I think GP means more that we generally don't have to worry about survival on a day to day (or seasonal) basis anymore, so we have more time to think about bigger issues, like politics or social issues - which I agree with, personally.
layer8 · 34m ago
How do you figure that it’s “currently”, and the need hasn’t always been there more or less?
mrweasel · 4h ago
Probably a combination of things, I wouldn't pretend to know, but I have my theories. For men, one half-baked thought I've been having revolves around social circles, friends and places outside work or home. I'm a member in a "men only" sports club (we have a few exceptions due to a special program, but mostly it's men only). One of the older gentlemen, probably in his early 80s, made the comment: "It's important for men to socialise with other men, without women. Young and old men have a lot in common, and have a lot to talk about. An 18 year old woman, and an 80 year old man have very little in the way of shared interests or concerns."

What I notice is that the old members keep the younger members engaged socially, teach them skills and give them access to their extensive network of friends, family, previous (or current) co-workers, bosses, managers. They give advice, teach how to behave and so on. The younger members help out with moving, help with technology, call an ISP, drive others home, to the hospital and help maintain the facilities.

Regardless of age, there's always some dude you can talk to, or who knows who you need to talk to, and sometimes there's even someone who knows how to make your problems go away or take you in if need be.

A former colleague had something similar, a complete ready-to-go support network in his old-boys football team. Ready to support in any way they could, when he started his own software company.

The problem: This is something like 250 guys. What about the rest? Everyone needs a support network. If you're alone, or your family isn't the best, or you only have a few superficial friends, if any, then where do you go? Maybe the people around you aren't equipped to help you with your problems, not everyone is, some have their own issues. The safe spaces are mostly gone.

We can't even start up support networks, because the strongest have no reason to go, so we risk creating networks of people dragging each other down. The sports club works because members are from a wider part of society.

From the article:

> > Meta said its AIs carry a disclaimer that “indicates the responses are generated by AI to help people understand their limitations”.

That's a problem, because those most likely to turn to an LLM for mental support don't understand the limitations. They need strong people to support and guide them, and maybe tell them that talking to a probability engine isn't the smartest choice, and take them on a walk instead.

more_corn · 1h ago
But it’s probably better than no therapy at all.
James_K · 5h ago
Respectfully, no sh*t. I've talked to a few of these things, and they are feckless yes-men. It's honestly creepy, they sound like they want something from you. Which I suppose they do: continual use of their services. I know a few people who use these things for therapy (I think it is the most popular use now) and I'm downright horrified at the sort of stuff they say. I even know a person who uses the AI to date. They will paste conversations from apps into the AI and ask it how to respond. I've set a rule for myself; I will never speak to machines. Sure, right now it's obvious that they are trying to inflate my ego and keep me using the service, but one day they might get good enough to trick me. I already find social media algorithms quite addictive, and so I have minimised them in my life. I shudder to think what a trained agent like these may be capable of.
52-6F-62 · 3h ago
I’ve also experimented with them in that capacity. I like to know first hand. I play the skeptic but I tend to feed the beast a little blood in order to understand it, at least.

As a result, I agree with you.

It gives me pause when I stop to think about anyone without more context placing so much trust in these. And the developers engaged in the “industry” of it demanding blind faith and full payment.

bitwize · 1h ago
I dunno, man, M-x doctor made me take a real hard long look at my life.
Buttons840 · 5h ago
Interacting with a LLM (especially one running locally) can do something a therapist cannot--provide an honest interaction outside the capitalist framework. The AI has its limitations, but it is an entity just being itself doing the best it can, without expecting anything in return.
kurthr · 5h ago
The word "can" is doing a lot of work here. The idea that any of the current "open weights" LLMs are outside the capitalist framework stretches the bounds of credulity. Choose the least capitalist of: OpenAI, Google, Meta, Anthropic, DeepSeek, Alibaba.

You trust Anthropic that much?

Buttons840 · 5h ago
I said the interaction exists outside of any financial transaction.

Many dogs are produced by profit motive, but their owners can have interactions with the dog that are not about profit.

andy99 · 1h ago
Dogs aren't rlhf'd and fine tuned to enforce behaviors designed by companies.
trod1234 · 3h ago
With respect, I think you should probably re-examine the meaning of the words you use here. You use words in a way that doesn't meet their established definition.

It would meet objective definition if you replaced 'capitalist' with 'socialist', which may have been what you meant, but that's merely an observation I make, not what you actually say.

The entire paragraph is quite contradictory, and lacks truth, and by extension it is entirely unclear what you mean, and it appears like you are confused when you use words and make statements that can't meet their definition.

You may want to clarify what you mean.

In order for it to be 'capitalist' true to its definition, you need to be able to achieve profit with it in purchasing power, but the outcomes of the entire business lifecycle resulting from this, taken as a whole, instead destroy that ability for everyone.

The companies involved didn't start on their merits seeking profit, they were funded by non-reserve debt issuance or money-printing which is the state picking winners and losers.

If they were capitalist they wouldn't have released model weights to the public. The only reason you would free a resource like that is if your goal was something not profit-driven (i.e. contagion towards chaos to justify control or succinctly totalism).

rochav · 3h ago
I think operating under the assumption that AI is an entity being itself and comparing it to dogs is not really accurate. Entities (not as in legal, but in the general sense) are beings, living beings that are capable of emotion, of thought and will, are they not? Whether dogs are that could be up for debate (I think they are, personally), but whether language models are that just is not. The very notion that they could be any type of entity is directly tied to the value the companies that created it have, it is part of the hype and capitalist system and I, again personally, don't think anyone could ever turn that into something that somehow ends up against capitalism just because the AI can't directly want something in return from you. I understand the sentiment and the distrust of the mental health care apparatus, it is expensive, it is tied to capitalism, it depends on trusting someone that is being paid to influence your life in a very personal way, but it's still better than trusting it to the judgment of a conversational simulation that is incapable of it, incapable of knowing you and observing you (not just what is written, but how you physically react to situations or to the retelling, like tapping your foot or disengaging) and understanding nuance. Most people would be better served talking to friends (or doing their best trying to make friends they can trust if they don't have any), and I would argue that people supporting people struggling is one way of truly opposing capitalism.
Buttons840 · 1h ago
Feel free to substitute in whatever word you think matches my intent best then. You seem to understand my intent well enough--I'm not interested in discussing the definition of individual words though.
delichon · 5h ago
How is it possible for a statistical model calculated primarily from the market outputs of a capitalist society to provide an interaction outside of the capitalist framework? That's like claiming to have a mirror that does not reflect your flaws.
NitpickLawyer · 4h ago
If I understand what they're saying, the interactions you have with the model are not driven by "maximising eyeballs/time/purchases/etc". You get to role-play inside a context window, and if it went in a direction you don't like you reset and start over again. But during those interactions, you control whatever happens, not some 3rd party that may have ulterior motives.
Buttons840 · 5h ago
The same way an interaction with a purebred dog can be. The dog may have come from a capitalistic system (dogs are bred for money unfortunately), but your personal interactions with the dog are not about money.

I've never spoken to a therapist without paying $150 an hour up front. They were helpful, but they were never "in my life"--just a transaction--a worthwhile transaction, but still a transaction.

germinalphrase · 5h ago
It’s also very common for people to get therapy at free or minimal cost (<$50) when utilizing insurance. Long term relationships (off and on) are also quite common. Whether or not the therapist takes insurance is a choice, and it’s true that they almost always make more by requiring cash payment instead.
amanaplanacanal · 4h ago
The dog's intelligence and personality were bred long before our capitalist system existed, unlike whatever nonsense an LLM is trying to sell you.
tuyguntn · 5h ago
I think you are right. On one hand, we have human beings with their own emotions in life, and based on those emotions they might negatively impact others' emotions.

On the other hand, we have a probabilistic/non-deterministic model, which can give 5 different pieces of advice if you ask 5 times.

So who do you trust? Until the determinism of LLMs improves and we can debug/fix them while keeping their deterministic behavior intact with new fixes, I would rely on human therapists.

phreno · 3h ago
Life is the leading cause of death. Seems propagating the species is harmful to our health.

Guess we should stop?