Ask HN: What are you actually using LLMs for in production?
44 Satam 55 6/28/2025, 2:46:20 PM
Beyond the obvious chatbots and coding copilots, curious what people are actually shipping with LLMs. Internal tools? Customer-facing features? Any economically useful agents out there in the wild?
I did some really basic napkin math with some Rails logs. One request with some extra junk in it was about 400 tokens according to the OpenAI tokenizer[0]. 500M/400 = ~1.25 million log lines.
Paying linearly for logs at $20 per 1.25 million lines is not reasonable for mid-to-high scale tech environments.
I think this would be sufficient if a 'firehose of data' means a bunch of news/media/content feeds that need to be summarized/parsed/guessed at.
[0] https://platform.openai.com/tokenizer
$20 could cover half a billion tokens with those models! That's a lot of firehose.
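The napkin math above can be checked with a few lines of arithmetic (the 400-tokens-per-line and $20-per-500M-tokens figures come straight from the comments; both are rough estimates):

```python
# Napkin math: how many ~400-token log lines fit into a 500M-token budget,
# and what that works out to per million lines.
TOKENS_PER_LINE = 400          # one Rails log line + extra junk, per the OpenAI tokenizer
TOKEN_BUDGET = 500_000_000     # half a billion tokens for $20
BUDGET_USD = 20.0

lines = TOKEN_BUDGET // TOKENS_PER_LINE
cost_per_million_lines = BUDGET_USD / (lines / 1_000_000)

print(lines)                   # 1250000 log lines
print(cost_per_million_lines)  # 16.0 dollars per million lines
```

So the $20 buys about 1.25M lines, i.e. roughly $16 per million log lines, which is the figure that makes it unreasonable at mid-to-high log volume.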
We scrape job sites and use that prompt to create tags which are then searchable by users in our interface.
It was a bit surprising to see how Karpathy described software 3.0 in his recent presentation because that's exactly what we're doing with that prompt.
Software 2.0: We need to parse a bunch of different job ads. We'll have a rule engine, decide based on keywords what to return, do some filtering, maybe even semantic similarity to descriptions we know match with a certain position, and so on
Software 3.0: We need to parse a bunch of different job ads. Create a system prompt that says "You are a job description parser. Based on the user message, return a JSON structure with title, description, salary-range, company, position, experience-level" and so on, pass it the JSON schema of the structure you want, and you have a parser that is slow and sometimes incorrect but (most likely) covers a much broader range than your Software 2.0 parser.
Of course, this is wildly simplified and doesn't include everything, but that's the difference Karpathy is trying to highlight. Instead of programming those rules for the parser yourself, you "program" the LLM via prompts to do that thing.
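A hypothetical sketch of that Software 3.0 parser: the "program" is just the system prompt plus a JSON schema. The field names mirror the ones listed above; the actual model call (an OpenAI-style chat API with structured output) is left out, so this only shows the payload you'd send:

```python
# "Software 3.0" job-ad parser sketch: prompt + schema instead of a rule engine.
# The client call itself is omitted; build_request() just assembles the payload.

SYSTEM_PROMPT = (
    "You are a job description parser. Based on the user message, return a "
    "JSON structure with title, description, salary-range, company, "
    "position, experience-level."
)

JOB_AD_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "description": {"type": "string"},
        "salary-range": {"type": "string"},
        "company": {"type": "string"},
        "position": {"type": "string"},
        "experience-level": {"type": "string"},
    },
    "required": ["title", "company"],
}

def build_request(raw_job_ad: str) -> dict:
    """Assemble the chat payload; an OpenAI-style client would send this
    with a json_schema response format to force valid JSON back."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": raw_job_ad},
        ],
        "schema": JOB_AD_SCHEMA,
    }

req = build_request("Senior Rails developer at Acme, $120k-$150k, remote")
print(sorted(req["schema"]["properties"]))
```

The point is that the entire "parser" lives in those two data structures; swapping rules means editing prose, not code.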
It processes Steam game reviews and provides a one-page summary of what people think about the game. I've been gradually improving it and adding features from community feedback. It's been good fun.
What I found interesting with Vaporlens is that it surfaces the things people think about a game - and if you find games where you like all the positives and don't mind the largest negatives (because those are very often very subjective), you're in for a pretty good time.
It's also quite amusing to me that using fairly basic vector similarity on points text resulted in a pretty decent "similar games" section :D
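That "similar games" trick can be sketched in a few lines: cosine similarity between embedding vectors of each game's summarized review points. The vectors below are toy data; in practice they'd come from an embedding model:

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def most_similar(target, catalog):
    """Return catalog game names ranked by similarity to the target vector."""
    return sorted(catalog, key=lambda name: cosine_similarity(target, catalog[name]), reverse=True)

# Toy "review point" embeddings, purely illustrative.
games = {
    "roguelike_a": [0.9, 0.1, 0.0],
    "farming_sim": [0.1, 0.9, 0.2],
    "roguelike_b": [0.8, 0.2, 0.1],
}
print(most_similar([1.0, 0.0, 0.0], games))  # roguelikes rank above the farming sim
```

Nothing fancy needed - nearest neighbours over the summary text embeddings already give a decent "players who liked this..." section.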
ATM I use ChatGPT Plus for everything except coding inside my Jetbrains IDEs.
I'm starting to look around at other LLMs for non-coding purposes (brainstorming, docs, being a project manager, summarizing, learning new subjects, etc.).
People use it to generate meeting notes. I don't like it and don't use it.
You used to either budget for data entry or just graft directories in a really ugly way. The forest used to know about 12000 unique access roles and now there are only around 170.
We're delivering confusion, and thanks to LLMs we're 30% more efficient at doing it.
[1] https://www.fisherloop.com/en/
I have a js-to-video service (open source sdk, WIP) [1] with the classic "editor to the left - preview on the right" scenario.
To help write the template code I have a simple prompt input + api that takes the llms-full.txt [2] + code + instructions and gives me back updated code.
It's more "write this stuff for me" than vibe-coding, as it isn't conversational for now.
I've not been bullish on ai coding so far, but this "hybrid" solution is perfect for this particular use-case IMHO.
[1] https://js2video.com/play [2] https://js2video.com/llms-full.txt
If everyone is using it now, prompts aren't a good gauge.
If all you've built is RAG apps up to this point, I highly recommend playing with some LLM-in-a-loop-with-tools reasoning agents. Totally new playing field.
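The loop itself is tiny, which is part of the appeal. Here's a bare-bones sketch of the pattern with the model stubbed out; a real agent would replace fake_model() with a chat-completions call that can emit tool requests:

```python
# Minimal LLM-in-a-loop-with-tools skeleton. The "model" here is a stub
# so the loop structure is visible; everything else is the real pattern.

def calculator(expression: str) -> str:
    # A single tool the agent may invoke. eval is sandboxed for the demo.
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def fake_model(history):
    # Stand-in for the LLM: request the tool once, then give a final answer.
    if not any(m["role"] == "tool" for m in history):
        return {"tool": "calculator", "args": "6 * 7"}
    return {"answer": f"The result is {history[-1]['content']}"}

def run_agent(question, max_steps=5):
    history = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = fake_model(history)
        if "tool" in reply:                       # model asked for a tool call
            result = TOOLS[reply["tool"]](reply["args"])
            history.append({"role": "tool", "content": result})
        else:                                     # model produced a final answer
            return reply["answer"]

print(run_agent("What is 6 * 7?"))  # The result is 42
```

Compared to a RAG pipeline, the interesting difference is that the model decides the control flow: it keeps requesting tools until it's satisfied, and the harness just executes and feeds results back.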
That said, it requires the user to sign in with their real work email or the results are way off.
For example, I wrote a recent blog post on how I use LLMs to generate excel files with a prompt (less about the actual product and more about how to improve outcomes): https://maxirwin.com/articles/persona-enriched-prompting/
Getting those events onto a usable, sharable calendar is much easier now.
https://apps.apple.com/us/app/forceai-ai-workout-generator/i...
Pretty much 5-6 niche classification use cases.
Used it to get a deeper understanding of a complex code base, create system design architecture diagrams, and help onboard new engineers.
Summarizing large data dumps that users were frustrated with.
2. I build REPLs into any manual workflow that makes use of LLMs. Instead of just going "F@ck, it didn't work!", you can tell the LLM why it didn't work and help it get the right answer. Saves a ton of time.
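That feedback loop from point 2 can be sketched like so: run the generated code, and on failure feed the error back as context for a retry instead of discarding the attempt. The model is a stub here; a real version would call an LLM API with the error message appended:

```python
# "REPL in the loop" sketch: capture the failure and hand it back to the model.

def stub_llm(prompt, feedback=None):
    # Stand-in for the LLM: first attempt is buggy, fixed once it sees the error.
    if feedback is None:
        return "result = 10 / 0"
    return "result = 10 / 2"

def repl_loop(prompt, attempts=3):
    feedback = None
    for _ in range(attempts):
        code = stub_llm(prompt, feedback)
        scope = {}
        try:
            exec(code, scope)
            return scope["result"]
        except Exception as e:
            feedback = f"Your code raised: {e!r}"  # tell the LLM why it failed
    raise RuntimeError("LLM never produced working code")

print(repl_loop("divide 10 by something"))  # 5.0
```

The win is exactly what the comment says: one round of "here's the traceback" usually beats re-prompting from scratch.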
3. Coming up with color palettes, themes, and ideas for "content". LLMs are really good at pumping out good-looking input for whatever factory you have built.