They mention promising results on Apple Silicon GPUs and even cite the contributions from Vello, but I don't see a Metal implementation in there and the benchmark only shows results from an RTX 2080. Is it safe to assume that they're referring to the WGPU version when talking about M-series chips?

m-schuetz · 1h ago

That and https://github.com/b0nes164/GPUSorting have been a tremendous help for me, since CUB does not work nicely with the CUDA Driver API. The author is doing amazing work.

coffeeaddict1 · 3h ago

Related paper by the authors: https://dl.acm.org/doi/10.1145/3694906.3743326

genpfault · 4h ago

https://en.wikipedia.org/wiki/Prefix_sum#Applications

almostgotcaught · 4h ago

this is missing the most important one (in today's world): extracting non-zero elements from a sparse vector/matrix

https://developer.nvidia.com/gpugems/gpugems3/part-vi-gpu-co...
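For readers unfamiliar with the technique, here is a minimal CPU-side sketch of how a prefix sum drives that kind of extraction (often called stream compaction). It is only an illustration in plain Python: the names `exclusive_scan` and `compact_nonzero` are made up for this example and do not come from GPUPrefixSums or the linked chapter. On a GPU, the scan would be the parallel primitive and the scatter loop would run one thread per element.

```python
from itertools import accumulate


def exclusive_scan(values):
    """Exclusive prefix sum: out[i] is the sum of values[0..i-1]."""
    if not values:
        return []
    return [0] + list(accumulate(values))[:-1]


def compact_nonzero(dense):
    """Return (indices, values) of the non-zero entries of `dense`.

    Hypothetical helper for illustration only.
    """
    # 1. Flag each element that should be kept.
    flags = [1 if x != 0 else 0 for x in dense]
    # 2. The exclusive scan of the flags gives each kept element its
    #    output slot, so no two elements write to the same position.
    offsets = exclusive_scan(flags)
    count = (offsets[-1] + flags[-1]) if dense else 0
    indices = [0] * count
    values = [0] * count
    # 3. Scatter. Every iteration is independent of the others, which is
    #    what makes this step map onto one GPU thread per element.
    for i, x in enumerate(dense):
        if flags[i]:
            indices[offsets[i]] = i
            values[offsets[i]] = x
    return indices, values


print(compact_nonzero([0, 3, 0, 0, 7, 1, 0]))
```

The point of the scan is step 2: without it, each element would need to know how many kept elements precede it, which is a serial dependency.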

merope14 · 3h ago

Not even close. The most important application (in today's world) is radix sort.

WJW · 1h ago

What specific application do you have in mind that radix sort is more important than matrix multiplication?

otherjason · 25m ago

I think they were trying to say “radix sort is a more important application of prefix sum than extraction of values from a sparse matrix/vector is.”
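To make the connection concrete, here is a rough sketch of a least-significant-digit radix sort in plain Python, where each pass turns a digit histogram into bucket start offsets with an exclusive prefix sum. This is a serial illustration under simplifying assumptions (non-negative integer keys below `2**key_bits`; the function name and parameters are invented for this example), not how GPUSorting or any particular GPU library implements it.

```python
from itertools import accumulate


def radix_sort(keys, bits_per_pass=4, key_bits=32):
    """LSD radix sort for non-negative ints smaller than 2**key_bits.

    Hypothetical helper for illustration only.
    """
    radix = 1 << bits_per_pass
    mask = radix - 1
    keys = list(keys)
    for shift in range(0, key_bits, bits_per_pass):
        # Count how many keys fall into each digit bucket.
        histogram = [0] * radix
        for k in keys:
            histogram[(k >> shift) & mask] += 1
        # Exclusive prefix sum of the histogram: the start offset of each
        # bucket in the output array.
        offsets = [0] + list(accumulate(histogram))[:-1]
        # Stable scatter: keys keep their relative order within a bucket,
        # which is what lets later passes build on earlier ones.
        out = [0] * len(keys)
        for k in keys:
            digit = (k >> shift) & mask
            out[offsets[digit]] = k
            offsets[digit] += 1
        keys = out
    return keys


print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

In a GPU version the histogram and the scatter are done in parallel, and the prefix sum over bucket counts is the step that a fast scan primitive accelerates.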

woadwarrior01 · 25m ago

Top-K sampling comes to mind, although it's nowhere near as important as matmul.

almostgotcaught · 5m ago

ranking models benefit from gpu impls of sort but yup they're not nearly as common/important as spmm/spmv

m-schuetz · 20m ago

Is that relevant for 4x4 multiplications? Because at least for me, radix sort is way more important than multiplying matrices beyond 4x4. E.g. for Gaussian Splatting.

GPUPrefixSums – state of the art GPU prefix sum algorithms

Comments (11)