LLMs solving problems OCR+NLP couldn't
18 universesquid 15 8/28/2025, 1:15:28 PM cloudsquid.substack.com ↗
While Gemini is nice, it would be great to have a pipeline that works locally on a Mac with a reasonable amount of unified memory or on a Framework AMD board.
[1] https://www.bbc.com/news/technology-23588202
[2] https://www.dkriesel.com/en/blog/2013/0810_xerox_investigati...
I think that's more because of the current state of the industry: a lot of those models are either internal, paywall-locked, or annoying to use. I don't want to waste effort signing up for a 4-week trial of X service to perform a one-off task.
Unfortunately, this post didn't really elucidate or dig into any interesting topic within this space.
I'm not expecting a research paper, but it would be great to get some stats, graphs, examples, and meat on the bones. I opened this up expecting some actual examples of problems within OCR & NLP and a demonstration of how X multimodal model solves them.
Current 80/20-rule-ignoring AI dogma in a nutshell.
And I assume the multimodal tools still use OCR for text extraction, or am I missing something?
My understanding is that they're still doing OCR+NLP, just differently than traditional approaches.