AI API Prices are 90% Subsidized

22 points by csoham | 11 comments | 6/22/2025, 8:17:32 PM | tinyml.substack.com

Comments (11)

python273 · 1h ago
A much better article on token prices: https://www.tensoreconomics.com/p/llm-inference-economics-fr...

There's not much incentive to subsidize prices for OpenRouter providers for example, and the prices are much lower than the $6.37/M estimate from the article.

https://openrouter.ai/meta-llama/llama-3.3-70b-instruct

avg $0.37/M input tokens, $0.73/M output tokens (21 providers)

Llama is not even a good example, as more recent models are better optimized, using Mixture of Experts and KV cache compression.
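A rough sanity check of the gap (a sketch only; the 3:1 input:output mix is an assumption, not from the article or OpenRouter):

```python
# Compare the article's cost estimate against the OpenRouter averages
# quoted above, blended at an assumed 3:1 input:output token ratio.

ARTICLE_ESTIMATE = 6.37   # USD per million tokens, from the article
INPUT_PRICE = 0.37        # avg USD per million input tokens (OpenRouter, 21 providers)
OUTPUT_PRICE = 0.73       # avg USD per million output tokens (OpenRouter, 21 providers)

def blended_price(input_frac: float) -> float:
    """Weighted average price per million tokens for a given input fraction."""
    return input_frac * INPUT_PRICE + (1 - input_frac) * OUTPUT_PRICE

market = blended_price(0.75)        # 0.46 USD/M at a 3:1 mix
gap = ARTICLE_ESTIMATE / market     # market price is roughly 14x below the estimate
```

Even at a 100% output-token mix, the gap is still nearly 9x.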

apsec112 · 3h ago
This ignores batching (token generation is much more efficient in batches), and I strongly suspect the article itself was written by AI, given the heavy use of bullets.
twoodfin · 1h ago
The “X—not Y” pattern is also a dead giveaway.
biophysboy · 2h ago
is it common for adjacent tokens to use the same weights in a memory cache?
PaulHoule · 4h ago
When the AI hype train left the station I said "we don't understand how these things work at all and they're going to get much cheaper to run" and that turned out to be... true.

Already, vendors of legacy models like ChatGPT-4 have to subsidize inference to keep up with new entrants built on a better foundation. It's likely that inference costs can be brought down by another factor of ten or so, so of course you have to subsidize these by 90% to get to where the industry will be in 2-3 years.

impure · 1h ago
I’ve been playing around with Gemma E4B and have gotten really good results. That’s a model you can run on a phone. So although prices have been going up recently I suspect they will start to fall again soon.
mrtksn · 3h ago
Subsidized is probably not the correct word here; it's more like a loss leader in a land-grab race.

It's like the early days of the internet when everything was amazing and all the people who put money into this thing were "losing" their money.

It's going to be like this until monopolization happens and the moat becomes defensible, and then they will enshittify the crap out of it and make their money back 10x, 100x, etc.

GaggiX · 3h ago
This calculation doesn't account for batching; it makes no sense.
BriggyDwiggs42 · 2h ago
On average how much does batching bring costs down?
GaggiX · 1h ago
It balances the compute and memory-bandwidth bottlenecks, so by a lot: with continuous batching you can easily see 10x, 20x, or more.
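The intuition can be sketched with a toy model (all numbers below are assumptions for illustration, and it ignores multi-GPU sharding, KV cache traffic, and the compute-bound ceiling): during decode, every step streams the full model weights from memory, so that cost is paid once per step regardless of how many sequences share it.

```python
# Toy model of why batching cuts per-token cost when decoding is
# memory-bandwidth bound: the weight read is amortized across the batch.

MODEL_BYTES = 140e9           # ~70B params in fp16, assumed
MEM_BANDWIDTH = 3.35e12       # bytes/s of accelerator memory, assumed
GPU_COST_PER_HOUR = 2.50      # USD, assumed rental price

def cost_per_million_tokens(batch_size: int) -> float:
    step_time = MODEL_BYTES / MEM_BANDWIDTH    # seconds per decode step
    tokens_per_sec = batch_size / step_time    # one token per sequence per step
    cost_per_sec = GPU_COST_PER_HOUR / 3600
    return cost_per_sec / tokens_per_sec * 1e6

# In this regime, per-token cost falls linearly with batch size:
# batch 32 is ~32x cheaper per token than batch 1.
```

Real serving stacks see less than the linear ideal once attention/KV traffic grows with batch size or the GPU becomes compute-bound, which is why 10x-20x is a realistic range rather than the batch size itself.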
revskill · 3h ago
No lol. The quality is mostly bad. Basically you need to prompt in detail, like writing a novel, for the LLM to understand. At that price, we want real AI that actually has common sense, not just an autocompletion tool.

Stop advertising LLMs as AI; instead, sell them as a superior copy & paste engine.

What's worst about LLMs is that the more you talk with one, the worse it becomes, to the point of being broken.