Did you investigate other search processes besides SGD? I'm thinking of those often termed "biologically plausible" (e.g. forward-forward, feedback alignment). Are their internal representations closer to the fractured or the unified representations?

goldemerald · 1h ago

This is an interesting line of research, but it is missing a key aspect: there are (almost) no references to the linear representation hypothesis. Much recent work on neural network interpretability has shown that individual neurons are polysemantic, and therefore practically useless for explainability. My hypothesis is that fitting linear probes (or a sparse autoencoder) would reveal linearly encoded semantic attributes.

It is unfortunate, because they briefly mention Neel Nanda's Othello experiments but not the wide array of other experiments, like the NeurIPS Oral "Linear Representation Hypothesis in Language Models" or even Golden Gate Claude.
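To make the linear-probe idea concrete, here is a minimal sketch on synthetic data. It is not the paper's setup or any experiment cited above: the activations, the dense `concept_direction`, and the difference-of-means probe are all assumptions chosen for illustration. The concept is spread across every neuron, so a probe along one direction should read it out more reliably than any single neuron.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean_vector(rows):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fit_mean_difference_probe(acts, labels):
    """Linear probe: direction = mean(positive) - mean(negative), threshold at the midpoint."""
    pos = [a for a, y in zip(acts, labels) if y == 1]
    neg = [a for a, y in zip(acts, labels) if y == 0]
    mu_pos, mu_neg = mean_vector(pos), mean_vector(neg)
    direction = [p - q for p, q in zip(mu_pos, mu_neg)]
    threshold = 0.5 * (dot(mu_pos, direction) + dot(mu_neg, direction))
    return direction, threshold


def probe_accuracy(acts, labels, direction, threshold):
    hits = sum((dot(a, direction) > threshold) == (y == 1) for a, y in zip(acts, labels))
    return hits / len(acts)


# Synthetic "activations" (assumption): the concept shifts all 8 neurons a little,
# so no single neuron carries it cleanly.
rng = random.Random(0)
dim = 8
concept_direction = [1.5 if i % 2 == 0 else -1.5 for i in range(dim)]
labels = [rng.randint(0, 1) for _ in range(400)]
acts = [[y * c + rng.gauss(0.0, 1.0) for c in concept_direction] for y in labels]

# Fit on the first 300 samples, evaluate on the held-out 100.
direction, threshold = fit_mean_difference_probe(acts[:300], labels[:300])
probe_acc = probe_accuracy(acts[300:], labels[300:], direction, threshold)

# Baseline: the same probe restricted to neuron 0 only.
acts0 = [[a[0]] for a in acts]
d0, t0 = fit_mean_difference_probe(acts0[:300], labels[:300])
single_neuron_acc = probe_accuracy(acts0[300:], labels[300:], d0, t0)

print(f"probe accuracy: {probe_acc:.2f}, single-neuron accuracy: {single_neuron_acc:.2f}")
```

A sparse autoencoder would be the unsupervised counterpart to this: instead of fitting one labeled direction, it learns many candidate directions at once.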

akarshkumar0101 · 34m ago

We mention this issue exactly in the fourth paragraph in Section 4 and in Appendix F!

goldemerald · 23m ago

That passage addresses the incomprehensibility of PCA and applying a transformation to the entire latent space. I've never found PCA to be meaningful for deep learning. As far as I can tell, the polysemanticity of neurons cannot be addressed with a single linear transformation. There is no sparse analysis (via linear probes or SAEs), and hence the issue remains unaddressed.

ipunchghosts · 50m ago

Does what you're saying imply that there is a rotation matrix you can apply to each activation output to make it less entangled?

goldemerald · 43m ago

Not quite. For an underlying semantic concept (e.g., smiling face), you can go from a basis vector [0,1,0,...,0] to the original latent space via a single rotation. You could then induce said concept by manipulating the original latent point by traversing along that linear direction.
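A small sketch of that picture, under loud assumptions: the 4-dimensional latent space, the Givens rotation, the step size `alpha`, and the example latent point are all hypothetical and not taken from any real model. The rotation maps the basis vector to a concept direction, and stepping along that direction changes only the concept coordinate when viewed in the rotated basis.

```python
import math


def givens_rotation(dim, i, j, theta):
    """Orthogonal dim x dim matrix that rotates by theta in the (i, j) coordinate plane."""
    rot = [[1.0 if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    rot[i][i] = math.cos(theta)
    rot[j][j] = math.cos(theta)
    rot[i][j] = -math.sin(theta)
    rot[j][i] = math.sin(theta)
    return rot


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


dim = 4
rotation = givens_rotation(dim, 1, 2, math.pi / 3)  # hypothetical rotation

# Basis vector [0, 1, 0, 0] stands for the concept in the "disentangled" basis.
basis_vector = [0.0, 1.0, 0.0, 0.0]
concept_direction = matvec(rotation, basis_vector)  # its image in the original latent space

# Induce the concept by moving a latent point along that direction.
latent = [0.3, -1.2, 0.8, 0.5]
alpha = 2.0
steered = [x + alpha * d for x, d in zip(latent, concept_direction)]

# Rotate both points back: only coordinate 1 should differ, by exactly alpha.
before = matvec(transpose(rotation), latent)
after = matvec(transpose(rotation), steered)
delta = [b - a for a, b in zip(before, after)]
print([round(x, 6) for x in delta])
```

Note that this only shows one direction for one concept. It says nothing about whether a single rotation can disentangle every concept at once, which is the point under dispute in this subthread.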

ipunchghosts · 38m ago

I think we are saying the same thing, but please correct me where I am wrong. You could look at the maps in some way, but instead of the basis being the one-hot dimensions (the standard basis), it could be rotated.

akarshkumar0101 · 33m ago

We mention this issue exactly in the fourth paragraph in Section 4 and in Appendix F!

ipunchghosts · 48m ago

I am glad they evaluated this hypothesis using weight decay, which is generally thought to induce structured representations. My first thought was that the entire paper would be useless if they hadn't done this experiment.

I find it rather interesting that the structured representations go from sparse to full to sparse as a function of layer depth. I have noticed that applying a weight decay penalty as an exponential function of layer depth gives improved results over using a global weight decay.
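One way a depth-dependent weight decay could be written down, as a sketch only: the schedule `base * exp(rate * layer)`, the default values of `base` and `rate`, and the toy weights are all assumptions. The comment above does not say which direction the exponential runs or what constants were used.

```python
import math


def layerwise_weight_decay(num_layers, base=1e-4, rate=0.5):
    """Hypothetical schedule: wd_l = base * exp(rate * l) for layer index l = 0..num_layers-1.

    A positive rate penalizes deeper layers more; a negative rate penalizes them less.
    """
    return [base * math.exp(rate * layer) for layer in range(num_layers)]


def l2_penalty(layer_weights, coefficients):
    """Sum over layers of wd_l * ||W_l||^2, with each layer given as a flat list of weights."""
    return sum(
        wd * sum(w * w for w in weights)
        for weights, wd in zip(layer_weights, coefficients)
    )


coefficients = layerwise_weight_decay(num_layers=3)
toy_weights = [[1.0, -2.0], [0.5, 0.5], [3.0]]  # made-up weights for three layers
penalty = l2_penalty(toy_weights, coefficients)

# The global-decay baseline uses the same coefficient for every layer.
global_penalty = l2_penalty(toy_weights, [1e-4] * 3)
print(coefficients, penalty, global_penalty)
```

In a framework like PyTorch, per-layer coefficients of this kind could be supplied as per-parameter-group `weight_decay` values rather than computed as an explicit penalty.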

timewizard · 1h ago

> Much of the excitement in modern AI is driven by the observation that scaling up existing systems leads to better performance.

Scaling up almost always leads to better performance. If you're only getting linear gains though then there is absolutely nothing to be excited about. You are in a dead end.

akarshkumar0101 · 3h ago

Tweet: https://x.com/kenneth0stanley/status/1924650124829196370

arXiv: https://arxiv.org/abs/2505.11581

pvg · 3h ago

Sounds like you're one of the co-authors? Probably worth mentioning if that's the case, so people know they can discuss the work with one of the work-doers.

akarshkumar0101 · 2h ago

I mentioned that in the original post, but I don't see that text here anymore (that's why I added links via comment)... I am new to Hacker News

messe · 1h ago

I believe they just mean that you should edit the comment where you added the links to mention that you are the author, to add that additional context.

pvg · 1h ago

I just meant 'it's good for people to know one of the authors is in the thread because it makes for more interesting conversation'. Clearly did not figure out how to do that without starting a bunch of meta!

macintux · 1h ago

I believe this could (or should) have been a Show HN, which would have allowed you to include explanatory text. See the top of this page for the rules.

https://news.ycombinator.com/show

Welcome to the site. There are a lot of features which are less obvious, which you’ll discover over time.

pvg · 1h ago

Reading material usually can't be a Show HN but you can just post your work without that and say you're involved.

macintux · 1h ago

The repo includes runnable code.

> Show HN is for something you've made that other people can play with… On topic: things people can run on their computers or hold in their hands

pvg · 45m ago

A lot of writing includes runnable code and isn't a Show HN. It's a comparatively narrow category.

ipunchghosts · 45m ago

I am interested in doing research like this. Is there any way I can be a part of it or a similar group? I have been fighting for funding from DoD for many years, but to no avail, so I largely have to do this research on my own time or solve my current grant's problems so that I can work on this. In my mind, this kind of research is the most interesting and important right now in the deep learning field. I am a hard worker and a high-throughput thinker... how can I get connected with others with a similar mindset?

The Fractured Entangled Representation Hypothesis

Comments (20)