It's nice to see that not everyone is optimising for LLMs and that work like this still gets done. If you look at HN alone, it sometimes feels like the hype could drown out everything else.
danielbln · 5h ago
There is massive hype, no doubt about it, but let's also not forget that LLMs have basically solved NLP, are a step change along many dimensions, and are disrupting and changing fields like software engineering like nothing before them.
So I hear you, but on the flip side we _should_ be reading a lot about LLMs here, as they have a direct impact on the work that most of us do.
That said, seeing other papers pop up that are not related to transformer based networks is appreciated.
larodi · 2h ago
Thank you, brother. Besides, not everything that lands on HN is strictly LLM-related; really dunno why the scare.
karanveer · 4h ago
I couldn't agree more.
fjfaase · 4h ago
Nice that they can do the processing in the GHz range, but judging from some pictures in the paper, the system has only 60 'cells', which is very low compared to the number of cells in the brains of animals that display complex behavior. To me this looks like an optimization in the wrong dimension.
_jab · 4h ago
I suspect practicality is not the goal here, but rather a proof of concept. Perhaps they saw speed as an important technical barrier to cross.
khalic · 1h ago
A lot of unrigorous claims for an abstract…
msgodel · 5h ago
It's just a single linear layer, and it's not clear to me that the technology is capable of anything more. If I'm reading it correctly, even running the model forward couldn't use the technology; they had to record the weights and do it the old-fashioned way.
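For context, "doing it the old-fashioned way" with recorded weights would just mean computing y = Wx + b in software. A minimal sketch (the weights and input here are made-up illustrative values, not taken from the paper):

```python
# Hypothetical example: inference through a single linear layer y = W x + b,
# computed in plain software from previously recorded weights.

def linear_layer(W, b, x):
    """Return y = W x + b for weight matrix W, bias vector b, input x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

W = [[0.5, -1.0], [2.0, 0.25]]  # recorded 2x2 weight matrix (made up)
b = [0.1, -0.2]                 # recorded biases (made up)
x = [1.0, 2.0]                  # input vector

y = linear_layer(W, b, x)
print(y)  # close to [-1.4, 2.3]
```

That's the entire forward pass for a single linear layer, which is why doing only this much doesn't say a lot about whether the hardware can scale to deeper networks.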
roflmaostc · 4h ago
Would you have dismissed early AI work because it could only train and compute a couple of weights?
This is about first prototypes, and scaling is often easier than getting the basic principle to work.
msgodel · 2h ago
Is this actually capable of propagating the gradient and training more complex layers, though?
A lot of these novel AI accelerators run into problems like that because they're not capable of general-purpose computing. A good example is the Boltzmann machines on D-Wave's hardware: it can do that, but only because the machine is limited to solving QUBO problems.
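To make the limitation concrete: QUBO just means minimizing x^T Q x over binary vectors x, which a few lines of brute force can illustrate (the Q matrix below is a made-up toy, not anything from D-Wave):

```python
from itertools import product

def qubo_min(Q):
    """Brute-force min over binary x of sum_ij Q[i][j] * x_i * x_j."""
    n = len(Q)
    best = None
    for bits in product((0, 1), repeat=n):
        val = sum(Q[i][j] * bits[i] * bits[j]
                  for i in range(n) for j in range(n))
        if best is None or val < best[0]:
            best = (val, bits)
    return best

# Toy upper-triangular Q: rewards each bit alone, penalizes adjacent pairs.
Q = [[-1, 2, 0],
     [0, -1, 2],
     [0, 0, -1]]
print(qubo_min(Q))  # (-2, (1, 0, 1))
```

Anything you want such a machine to do has to be squeezed into that one objective form, which is a long way from the general-purpose computation needed for backpropagation through arbitrary layers.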
roflmaostc · 1h ago
For inference we don't care about training, right?
But if we could make cheaper inference machines available, everyone would profit.
Isn't it the case that LLMs use more energy on inference than on training these days?