Picking the Best AI Model for Cost and Freshness (Lesson from Building My Site)

Comments (1)

alexcolewrites · 18h ago

Hey HN,

When building the AI game suggestion feature for steamid.one (a side project for Steam users), I ran into the common, but often under-discussed, problem of choosing the "right" AI model. My needs were simple: smart, fast, cheap, and aware of somewhat recent data without complex RAG.

Here’s how I weighed the options based on approximate cost (USD per 1M tokens), knowledge cutoff, and performance for my use case (analyzing player profiles to suggest games and output structured JSON):

1. OpenAI (GPT):
* gpt-3.5-turbo (16K context, cutoff September 2021): ~$0.50 input / $1.50 output. Decent baseline, but its reasoning can miss nuance for creative suggestions.
* gpt-4o (128K context, cutoff late 2023): ~$5.00 input / $15.00 output. Powerful, but too pricey at the scale of a free tool.

2. Anthropic (Claude):
* Claude 3 Haiku (200K context, cutoff August 2023): ~$0.25 input / $1.25 output. Extremely competitive on price and very capable.

3. Google (Gemini):
* gemini-1.5-pro (128K context, cutoff early 2024): ~$3.50 input / $10.50 output. Solid, but more than I wanted to spend.
* gemini-1.5-flash (128K context, cutoff early 2024): ~$0.35 input / $1.05 output. This was the winner.
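To make the per-token prices concrete, here is a minimal sketch of the per-call arithmetic. The prices are copied from the list above; the dictionary keys are just labels (not necessarily the providers' exact API model IDs), and the token counts and call volume are hypothetical, not measurements from steamid.one.

```python
# Approximate prices in USD per 1M tokens, as (input, output), from the list above.
# The keys are labels for this sketch, not guaranteed to match provider API IDs.
PRICES = {
    "gpt-3.5-turbo": (0.50, 1.50),
    "gpt-4o": (5.00, 15.00),
    "claude-3-haiku": (0.25, 1.25),
    "gemini-1.5-pro": (3.50, 10.50),
    "gemini-1.5-flash": (0.35, 1.05),
}


def cost_per_call(model, input_tokens, output_tokens):
    """Estimate the USD cost of one request from its token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def monthly_costs(input_tokens, output_tokens, calls_per_month):
    """Estimate the monthly USD cost per model for a fixed request shape."""
    return {
        model: cost_per_call(model, input_tokens, output_tokens) * calls_per_month
        for model in PRICES
    }


if __name__ == "__main__":
    # Hypothetical workload: 1,000 input tokens, 1,000 output tokens, 10,000 calls/month.
    costs = monthly_costs(1_000, 1_000, 10_000)
    for model, cost in sorted(costs.items(), key=lambda kv: kv[1]):
        print(f"{model:18s} ${cost:8.2f}/month")
```

One thing the arithmetic makes visible: with these prices, the ranking of the two cheapest models depends on the input/output mix. Flash comes out cheaper than Haiku only when output tokens exceed half the input tokens, so it is worth plugging in your own request shape.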

Why Flash Stood Out for steamid.one:

Flash's performance on my specific JSON output needs (structuring game suggestions), combined with its low cost, was the deciding factor. For a free tool, every cent per call matters. I also found that for game suggestions based on user-provided data, the models' general knowledge cutoffs were less of a bottleneck than I expected. The AI's strength was reasoning over the data I fed it (player genres, owned games), not knowing every new release. That changed my prompt engineering strategy: I focused on the data in the prompt rather than on the model's built-in knowledge.
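As an illustration of that approach (feed the model the player's own data, ask for structured JSON, and validate what comes back), here is a rough sketch using only the Python standard library. The prompt wording, the `suggestions` / `title` / `reason` schema, and both function names are assumptions made for this example, not the actual steamid.one implementation. The call to the model itself is left out, since it depends on the provider's SDK.

```python
import json


def build_prompt(genres, owned_games, max_suggestions=3):
    """Build a prompt that supplies the player's data and asks for JSON only."""
    return (
        "You recommend Steam games.\n"
        f"Player's favorite genres: {', '.join(genres)}\n"
        f"Games already owned: {', '.join(owned_games)}\n"
        f"Suggest up to {max_suggestions} games the player does not own.\n"
        "Reply with JSON only, shaped like: "
        '{"suggestions": [{"title": "...", "reason": "..."}]}'
    )


def parse_suggestions(raw_reply, owned_games):
    """Validate the model's reply and return a list of {title, reason} dicts.

    Raises ValueError if the reply is not JSON or lacks a "suggestions" list.
    Malformed entries and games the player already owns are skipped.
    """
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or not isinstance(data.get("suggestions"), list):
        raise ValueError('reply is missing a "suggestions" list')

    owned = {game.casefold() for game in owned_games}
    cleaned = []
    for item in data["suggestions"]:
        if not isinstance(item, dict):
            continue
        title, reason = item.get("title"), item.get("reason")
        if not isinstance(title, str) or not isinstance(reason, str):
            continue
        if title.casefold() in owned:
            continue
        cleaned.append({"title": title, "reason": reason})
    return cleaned
```

Validating on the way out matters more with cheaper models, since a malformed reply can be retried or dropped instead of breaking the page.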

I'm curious:
* How do you balance AI model cost, capability, and knowledge cutoff for your projects?
* Any tips for cheap, reliable AI integrations?

Thanks for any feedback!
