Tensor Manipulation Unit (TMU): Reconfigurable, Near-Memory, High-Throughput AI

27 points | by transpute | 3 comments | 6/23/2025, 1:43:11 AM | arxiv.org ↗

Comments (3)

WithinReason · 48m ago
Isn't this a software problem being solved in hardware? Ideally you would try to avoid going to memory in the first place by fusing the operations, which should be much faster than speeding up memory ops. E.g. you should never do an explicit im2col before a convolution; it should be fused. However, it's hard to argue with a 0.019 mm² area increase.
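
For readers unfamiliar with the fusion point above, here is a rough NumPy sketch (illustrative only, not from the paper) of the difference: an explicit im2col materializes a large patch matrix in memory before the matmul, while a direct ("fused") convolution reads the input in place and never writes that intermediate.

```python
import numpy as np

def conv2d_im2col(x, w):
    """Explicit im2col: builds a (Ho*Wo, k*k) patch matrix before the matmul.
    The patch matrix is pure extra memory traffic. Single-channel for brevity."""
    H, W = x.shape
    k = w.shape[0]
    Ho, Wo = H - k + 1, W - k + 1
    cols = np.empty((Ho * Wo, k * k), dtype=x.dtype)
    for i in range(Ho):
        for j in range(Wo):
            cols[i * Wo + j] = x[i:i + k, j:j + k].ravel()
    return (cols @ w.ravel()).reshape(Ho, Wo)

def conv2d_direct(x, w):
    """'Fused' direct convolution: no intermediate patch matrix is written."""
    H, W = x.shape
    k = w.shape[0]
    Ho, Wo = H - k + 1, W - k + 1
    out = np.zeros((Ho, Wo), dtype=x.dtype)
    for di in range(k):
        for dj in range(k):
            out += w[di, dj] * x[di:di + Ho, dj:dj + Wo]
    return out

x = np.random.rand(64, 64).astype(np.float32)
w = np.random.rand(3, 3).astype(np.float32)
assert np.allclose(conv2d_im2col(x, w), conv2d_direct(x, w), atol=1e-4)
```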
mikewarot · 31m ago
The only memory involved should be at the input and output of a pipeline stage that does an entire layer of an LLM. I'm of the opinion that we'll end up with effectively massive FPGAs with some stages of pipelining that have NO memory access internally, so that you get one token per clock cycle.

100 million tokens per second is currently worth about $130,000,000/day. (Or so ChatGPT 4.1 told me a few days ago)

I'd like to drop that by a factor of at least 1000:1
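
For what it's worth, the figure roughly checks out if one assumes a price of about $15 per million tokens; that price is an assumption for the arithmetic below, not something stated in the thread.

```python
# Back-of-the-envelope check of the "$130M/day" figure.
# Assumed (hypothetical) price: ~$15 per million tokens served.
tokens_per_second = 100e6
price_per_token = 15 / 1e6        # $15 per million tokens (assumed)
seconds_per_day = 86_400

revenue_per_day = tokens_per_second * price_per_token * seconds_per_day
print(f"${revenue_per_day:,.0f}/day")   # ~$129,600,000/day
```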

KnuthIsGod · 4h ago
Cutting-edge and innovative AI hardware research from China.

Looks like American sanctions are driving a new wave of innovation in China.

" This work addresses that gap by introducing the Ten- sor Manipulation Unit (TMU): a reconfigurable, near-memory hardware block designed to execute data-movement-intensive (DMI) operators efficiently. TMU manipulates long datastreams in a memory-to-memory fashion using a RISC-inspired execution model and a unified addressing abstraction, enabling broad support for both coarse- and fine-grained tensor transformations.

The proposed architecture integrates TMU alongside a TPU within a high-throughput AI SoC, leveraging double buffering and output forwarding to improve pipeline utilization. Fabricated in SMIC 40 nm technology, the TMU occupies only 0.019 mm² while supporting over 10 representative TM operators. Benchmarking shows that TMU alone achieves up to 1413.43× and 8.54× operator-level latency reduction over ARM A72 and NVIDIA Jetson TX2, respectively.

When integrated with the in-house TPU, the complete system achieves a 34.6% reduction in end-to-end inference latency, demonstrating the effectiveness and scalability of reconfigurable tensor manipulation in modern AI SoCs."
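
The abstract does not spell out what the "unified addressing abstraction" looks like, but as a purely illustrative software sketch (hypothetical, not the TMU's actual microarchitecture), many tensor-manipulation operators reduce to a memory-to-memory copy driven by per-dimension base offsets and strides for the source and destination streams:

```python
import numpy as np

def copy_with_affine_addressing(src, dst, shape, src_strides, dst_strides,
                                src_base=0, dst_base=0):
    """Illustrative memory-to-memory move driven by per-dimension strides.

    Operators such as transpose, reshape-with-copy, slicing, concat placement,
    and im2col-style gathers can all be expressed by choosing suitable base
    offsets and strides for the source and destination buffers.
    """
    for idx in np.ndindex(*shape):
        s = src_base + sum(i * st for i, st in zip(idx, src_strides))
        d = dst_base + sum(i * st for i, st in zip(idx, dst_strides))
        dst[d] = src[s]

# Example: 2D transpose of a 3x4 row-major tensor expressed purely as strides.
rows, cols = 3, 4
src = np.arange(rows * cols, dtype=np.float32)   # flat source buffer
dst = np.zeros(rows * cols, dtype=np.float32)    # flat destination buffer
copy_with_affine_addressing(src, dst, (rows, cols),
                            src_strides=(cols, 1),   # read row-major
                            dst_strides=(1, rows))   # write column-major
assert np.array_equal(dst.reshape(cols, rows), src.reshape(rows, cols).T)
```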