Who does your assistant serve?

43 points by todsacerdoti | 13 comments | 8/17/2025, 3:14:02 PM | xeiaso.net ↗

Comments (13)

diggan · 3m ago
> Not to mention, industry consensus is that the "smallest good" models start out at 70-120 billion parameters. At a 64k token window, that easily gets into the 80+ gigabyte of video memory range, which is completely unsustainable for individuals to host themselves.

Worth a tiny addendum: GPT-OSS-120B (at mxfp4 with a 131,072-token context) lands at roughly 65GB of VRAM, which is still large but at least less than 80GB. With 2x 32GB GPUs (like the R9700, ~1300USD each) and a slightly smaller context (or KV cache quantization), I feel like you could fit it, which makes it a bit more obtainable for individuals. The 120B with reasoning_effort set to high is quite good as far as I've tested it, and blazing fast too.
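For anyone who wants to sanity-check these numbers, the footprint is roughly weights plus KV cache. A minimal back-of-envelope sketch, using assumed (not official) figures: ~4.25 bits/weight for mxfp4-style quantization, and GPT-OSS-120B-like attention dimensions (36 layers, 8 KV heads, head dim 64 are my assumptions here):

```python
# Back-of-envelope VRAM estimate: quantized weights + KV cache.
# All model figures are illustrative assumptions, not official specs.

def weights_gb(params_billion, bits_per_weight):
    """Memory for the quantized weights, in GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

def kv_cache_gb(layers, kv_heads, head_dim, context_tokens, kv_bits=16):
    """KV cache: K and V, per layer, per KV head, per token."""
    return 2 * layers * kv_heads * head_dim * context_tokens * kv_bits / 8 / 1e9

w = weights_gb(120, 4.25)                         # ~63.8 GB at mxfp4-like density
kv16 = kv_cache_gb(36, 8, 64, 131_072)            # ~9.7 GB at fp16 KV
kv8 = kv_cache_gb(36, 8, 64, 131_072, kv_bits=8)  # ~4.8 GB with 8-bit KV cache
print(f"weights {w:.1f} GB, KV fp16 {kv16:.1f} GB, KV int8 {kv8:.1f} GB")
```

Which is exactly why KV cache quantization or a smaller context is what buys you the last few gigabytes when squeezing into 2x 32GB.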

fleebee · 6m ago
What's worth noting is that the companies providing LLMs are also strongly pushing people into using their LLMs in unhealthy ways. Facebook has started shoving their conversational chatbots into people's faces.[1] That none of the big companies are condemning or blocking this kind of LLM usage -- but are in fact advocating for it -- is telling of their priorities. Evil is not a word I use lightly but I think we've reached that point.

[1]: https://www.reuters.com/investigates/special-report/meta-ai-...

diggan · 1m ago
> Evil is not a word I use lightly but I think we've reached that point.

The writing was on the wall as soon as Meta started posting publicly about AI personalities/profiles on Instagram, or however it started. If I recall correctly, they announced it more than two years ago?

Aurornis · 25m ago
> I feel like this should go without saying, but really, do not use an AI model as a replacement for therapy.

I know several people who rave about ChatGPT as a pseudo-therapist, but from the outside the results aren’t encouraging. They like the availability and openness they experience by talking to a non-human, but they also like the fact that they can get it to say what they want to hear. It’s less of a therapist and more of a personal validation machine.

Want to feel like the victim in every situation, have a virtual therapist tell you that everything is someone else’s fault, and have the choices you made validated? Spend a few hours with ChatGPT and you’ll learn how to get it to respond the way you want. If you really don’t like the direction a conversation is going, you delete it and start over, reshaping the inputs to steer it the way you want.

Any halfway decent therapist will spot these behaviors and at least not encourage them. LLM therapists seem to spot these behaviors and give the user what they want to hear.

Note that I’m not saying it’s all bad. They seem to help some people work through certain issues, rubber duck debugging style. The trap is seeing this success a few times and assuming it’s all good advice, without realizing it’s a mirror for your inputs.

Hackbraten · 41m ago
> At a 64k token window, that easily gets into the 80+ gigabyte of video memory range, which is completely unsustainable for individuals to host themselves.

A desktop computer in that performance tier (e.g. an AMD AI Max+ 395 with 128 GB of shared memory) is expensive but not prohibitively so. Depending on where you live, one year of therapy may cost more than that.

jchw · 31m ago
It seems like the Framework Desktop has become one of the best choices for local AI on the whole market. For a bit over $2,000 you can get a machine with, if I understand correctly, around 120 GiB of accessible VRAM and the seemingly brutal Radeon 8060S, whose iGPU performance appears to be challenged only by a fully loaded Apple M4 Max or, of course, a sufficiently big dGPU. The previous best options seemed to be from Apple, but for a similar amount of VRAM I can't find a similarly good deal. (The last time I could find an Apple Silicon device selling for ~$2,000 with that much RAM on eBay, it was an M1 Ultra.)

I am not really dying to run local AI workloads, but the prospect of being able to play with larger models is tempting. It's not $2,000 tempting, but tempting.

layer8 · 9m ago
There are a dozen or more (mostly Chinese) manufacturers coming out with mini PCs based on that Ryzen AI Max+ 395 platform, for example the Bosgame M5 AI Mini at just $1699 with 128GB. Just pointing out that this configuration is not a Framework exclusive.

Aurornis · 23m ago
FYI there are a number of Strix Halo boards and computers out in the market already. The Framework version looks to be high quality and have good support, but it’s not the only option in this space.

Also take a good hard look at the token output speeds before investing. If you’re expecting quality, context windows, and output speeds similar to the hosted providers you’re probably going to be disappointed. There are a lot of tradeoffs with a local machine.
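A rough way to reason about those output speeds: single-stream decoding is usually memory-bandwidth-bound, so each generated token costs roughly one pass over the active weights. A sketch with assumed figures (a ~256 GB/s Strix Halo-class memory system; the active-weight sizes per token are illustrative guesses, not benchmarks):

```python
# Back-of-envelope decode speed for a bandwidth-bound local LLM.
# tokens/sec ≈ memory bandwidth / bytes streamed per generated token.
# All hardware and model figures below are assumptions for illustration.

def tokens_per_sec(bandwidth_gb_s, active_gb_per_token):
    """Upper-bound decode rate when limited by memory bandwidth."""
    return bandwidth_gb_s / active_gb_per_token

# ~256 GB/s assumed for a Strix Halo-class shared-memory system.
dense_70b = tokens_per_sec(256, 35.0)  # dense 70B at 4-bit: ~35 GB/token
moe_5b = tokens_per_sec(256, 2.7)      # MoE with ~5B active params: ~2.7 GB/token
print(f"dense 70B: ~{dense_70b:.0f} tok/s, sparse MoE: ~{moe_5b:.0f} tok/s")
```

So a dense 70B model lands in the single-digit tok/s range on this class of hardware, while sparse MoE models can be an order of magnitude faster; prompt processing speed and long-context behavior are separate bottlenecks again.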

walterbell · 14m ago
HP Z2 Mini G1a with 128GB and Strix Halo is ~$5K, https://www.notebookcheck.net/Z2-Mini-G1a-HP-reveals-compara...

alistairSH · 57m ago
I can’t help but think we’re accelerating our way to a truly dystopian future. Like Blade Runner, but worse, maybe.

troupo · 42m ago
We're already in the early stages of Blade Runner.

bit1993 · 22m ago
We should not forget that LLMs simply replicate the data humans have put on the WWW. LLM tech could only have come out of Google search, which indexed and collected the entire WWW; the natural next step was to develop algorithms to understand that data and give better search results. This also shows the weakness of LLMs: they depend on human data, and as LLM companies keep trying to replace humans, humans will simply stop feeding LLMs their data. More and more data will go behind paywalls, and more code will become closed source; simple supply-and-demand economics. LLMs cannot make progress without new data, because world culture moves rapidly in real time.

walterbell · 18m ago
> LLMs cannot make progress without new data because the world-culture moves rapidly in real-time.

This helps services where users generate content: it reduces both the licensing cost and the latency of accessing external content.