What people seem to keep missing is that they get interactive chat mode with all the models, including the best and newest (Gemini 2.5 Pro, 2.5 Flash, 2.5 Flash Lite and older), totally for free. I mean, when working from chat at https://aistudio.google.com/, the entire 1M context window is free of charge. You really get a very good AI for nothing.

https://i.imgur.com/pgfRrZY.png

7thpower · 12h ago

Funny you mention this, I literally just got done loading the context window of AI studio up for an hour doing some prototyping and then was frustrated when I couldn’t see where I was at from billing (knew it couldn’t be that much, but I still like to know).

I assumed that because I’m on paid tiers it would still cost beyond a certain usage amount, but I guess not.

cma · 16h ago

Can you opt out of them training on your data in that free tier?

relatedtitle · 16h ago

If you have cloud billing enabled you can still use it for free and they say they don't train on it. https://ai.google.dev/gemini-api/docs/billing#paid-api-ai-st...

matesz · 15h ago

Gemini's free tier allows maybe 5 messages on average, for 2.5 Pro at least, and this is not usable.

I’m using Claude Pro for daily driver and Gemini / ChatGPT free tiers.

rat9988 · 15h ago

> Gemini's free tier allows maybe 5 messages on average, for 2.5 Pro at least, and this is not usable.

Not on ai studio.

matesz · 7h ago

Oh my... I didn't know about Gemini Studio and didn't expect the possibility of it existing. Thanks for correcting!

HackerThemAll · 15h ago

You are clearly confirming my comment above.

thomastjeffery · 13h ago

How?

ratg13 · 13h ago

Read the text, click the links, let it sink in

thomastjeffery · 12h ago

I did that, and I assume GP did as well.

There is some information that you assume to have shared that we are not picking up on.

what_ever · 6h ago

Maybe ask your favorite AI what you are missing. Or maybe ask using AI Studio, as that won't rate limit you ;)

dang · 18h ago

Related ongoing thread:

Claude Sonnet 4 now supports 1M tokens of context - https://news.ycombinator.com/item?id=44878147 - Aug 2025 (160 comments)

irthomasthomas · 19h ago

So sonnet-4 is faster than gemini-2.5-flash at long context. That is surprising, especially since Gemini runs on those fast TPUs.

curl-up · 18h ago

Note that (in the first test, the only one where output length is reported), Gemini Pro returned more than 3x the amount of text, at less than 2x the amount of time. From my experience with Gemini, that time was probably mainly spent on thinking, length of which is not reported here. So looking at pure TPS of output, Gemini is faster, but without clear info on the thinking time/length, it's impossible to judge.
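To make the arithmetic behind that comparison explicit, here is a rough sketch. The 3x and 2x figures are the bounds quoted above, not measured values, and `relative_throughput` is a made-up helper name. It only covers visible output, since thinking length is not reported.

```python
def relative_throughput(output_ratio: float, time_ratio: float) -> float:
    """Ratio of output speeds, given how much more text was produced
    and how much more wall-clock time was taken."""
    return output_ratio / time_ratio

# Bounds from the comment: more than 3x the text in less than 2x the time.
OUTPUT_RATIO = 3.0  # lower bound on the text ratio
TIME_RATIO = 2.0    # upper bound on the time ratio

ratio = relative_throughput(OUTPUT_RATIO, TIME_RATIO)
print(f"Visible output throughput is at least {ratio:.1f}x")
```

Any hidden thinking tokens generated in the same time window would push the real tokens-per-second figure higher still, which is why the comparison can't be settled without that data.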

jbellis · 18h ago

if they left them both on defaults, flash is thinking-by-default and sonnet 4 is no-thinking-by-default

bitpush · 18h ago

> Claude’s overall response was consistently around 500 words—Flash and Pro delivered 3,372 and 1,591 words by contrast.

It isn't clear from the article whether the time they quote is time-to-first-token or time to completion. If it is the latter, then it makes sense why gemini* would take longer even with similar token throughput.

netdur · 14h ago

Output tokens must be generated in order (autoregressive decoding). Inputs don’t have that constraint, so prefill is parallel; with stronger kernels, KV-cache handling, and batching, Claude can outrun Gemini.

lugao · 18h ago

Anthropic also uses TPUs for inference.

irthomasthomas · 17h ago

Do they rent them from Google? Or are they a different brand?

ancientworldnow · 15h ago

Google provides them.

irthomasthomas · 4h ago

Ah cool I'll have to read up on that, I had thought that google was hoarding them.

ozbonus · 36m ago

Mess o youxwh to yt h!

arnaudsm · 19h ago

https://archive.is/sb7D5

thefourthchime · 19h ago

Does anyone else have trouble with the archive rendering of that? It seemed to also have the pop up.

sebastienbarre · 19h ago

You can delete the div with id=subscribe-popup from the dev tools for a better view.

skarz · 18h ago

Try one of these. They have the popup but you can dismiss it.

https://ghostarchive.org/archive/JlE5T

https://web.archive.org/web/20250812172455/https://every.to/...

akomtu · 17h ago

IMO, a good contest between LLMs would be data compression. Each LLM is given the same pile of text, and then asked to create compact notes that fit into N pages of text. Then the original text is replaced with their notes and they need to answer a bunch of questions about the original text using the notes alone.

rafaelmn · 6h ago

Summarization? I'm pretty sure there are benchmarks for this, because people used summarization to build search indexes (at least a few years ago when I was working on this they did, and there were benchmarks).

daft_pink · 18h ago

i’m really curious how well they perform with a long chat history. i find that gemini often gets confused when the context is long enough and starts responding to prior prompts, using the cli or its gem chat window.

XenophileJKO · 18h ago

From my experience, Gemini is REALLY bad about context blending. It can't keep track of what I said and what it said in a conversation under 200K tokens. It blends concepts and statements together, then refers to some fabricated hybrid fact or comment.

Gemini has done this in ways that I haven't seen in the recent or current generation models from OpenAI or Anthropic.

It really surprised me that Gemini performs so well in multi-turn benchmarks, given that tendency.

IanCal · 17h ago

I’ve not experimented with the recent models for this but older Gemini models were awful for this - they’d lie about what I’d said or what was in their system prompt even with short conversations.

koakuma-chan · 18h ago

I really doubt you can fit all Harry Potter books in 1M tokens.

PeterStuer · 18h ago

The series is 1,084,170 words. At let's say 1.4 tokens per word, this would not fit, but it is getting close.
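A quick check of that estimate, as a rough sketch. The 1.4 tokens-per-word ratio is the assumption stated above rather than a measured value (it varies by tokenizer), and `estimate_tokens` is a made-up helper name.

```python
def estimate_tokens(words: int, tokens_per_word: float) -> int:
    """Rough token estimate from a word count and an assumed ratio."""
    return round(words * tokens_per_word)

SERIES_WORDS = 1_084_170    # word count quoted above
TOKENS_PER_WORD = 1.4       # assumed ratio; depends on the tokenizer
CONTEXT_WINDOW = 1_000_000  # the 1M-token window under discussion

tokens = estimate_tokens(SERIES_WORDS, TOKENS_PER_WORD)
overshoot = tokens - CONTEXT_WINDOW

# Ratio at which the series would exactly fill the window.
break_even = CONTEXT_WINDOW / SERIES_WORDS

print(f"{tokens:,} estimated tokens, {overshoot:,} over the window")
print(f"Fits only below {break_even:.3f} tokens per word")
```

Under that assumption the series lands at roughly 1.5M tokens, so it would only fit if the tokenizer averaged under about 0.92 tokens per word.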

magicalhippo · 15h ago

How do they do if you test[1] them for attention deficit disorder?

[1]: https://www.imdb.com/title/tt0766092/quotes/?item=qt1440870

koakuma-chan · 17h ago

It's 2M tokens for Gemini.

chrismustcode · 15h ago

That was previous iterations, 2.5 is 1 million context window

https://ai.google.dev/gemini-api/docs/models (context window is detailed under the model variant section with + signs)

They were meant to crank 2.5 to 2 million at some point though, maybe waiting now till 3?

bredren · 12h ago

Maybe consuming the resources internally.

koakuma-chan · 13h ago

I mean the Harry Potter books are 2M tokens.

gcr · 18h ago

The entire HP series is about one million words.

koakuma-chan · 18h ago

Harry Potter and the Order of the Phoenix alone is 400K tokens.

kridsdale3 · 14h ago

And takes up a proportional width of everyone's bookshelves alongside the others.

llm_nerd · 12h ago

Curious, I found an epub, converted it to a txt, and dumped it into the Qwen3 tokenizer. It yielded 359,088 tokens, end to end.

Using the GPT-4 tokenizer (cl100k_base) yields 349,371 tokens.

Recent Google and Anthropic models do not have local tokenizers and ridiculously make you call their APIs to do it, so no idea about those.

Just thought that was interesting.

Claude vs. Gemini: Testing on 1M Tokens of Context

Comments (43)