Show HN: Comparing product rankings by OpenAI, Anthropic, and Perplexity
We’re interested in seeing how AI models decide to recommend products, especially now that they are actively searching the web. Now that we can retrieve citations by API, we can learn a bit more about what sources the various models use.
This is becoming increasingly important - Guillermo Rauch said that ChatGPT now refers ~5% of Vercel signups, up 5x over the last six months. [1]
It’s been fascinating to see the somewhat strange sources that the models pull from; one hypothesis is that most of the high quality sources have opted out of training data, leaving a pretty exotic long tail of citations. For example, a search for car brands yielded citations including Lux Mag and a class action filing against Chevy for batteries. [2]
We'd love for you to give it a try and let us know what you think! What other data would you want to see?
To illustrate further, I picked “electric guitars”. The top two were obvious and boring, and the rest was a weird hodgepodge. Significantly, no consideration is given to whether the person requesting the ranking likes to play jazz or metal or country, has small hands, requires active electronics, likes trems, or whatever. So it’s a fine exercise in showing LLMs doing a thing, but it adds little to no value over just doing a web search. Or, more appropriately, having a conversation with an experienced guitar player about what I want in a guitar.
The nugget of real interest here (personally speaking) is in those citations: what is the new meta for products getting ranked/referred by LLMs?
https://x.com/rauchg/status/1910093634445422639
A feature that is entirely missing here is price constraints. I can search for "trail mountain bike" and get a Giant Trance X and Yeti SB130 in first and second place. Those are both great bikes in their categories, but it's a meaningless comparison because one is twice as expensive as the other - it's objectively better, but not necessarily better value.
The use case for that is to better understand where the gaps are when looking to capture this new source of inbound, given people are using AI to replace search.
There's definitely a whole bunch of features missing that we'd need to make this a genuinely useful product recommendation engine! Price constraints, better de-duping, linking out to sources to show availability, etc.
For example, I searched “Ways to die” and got 1. Drowning 2. Firearms 3. Death during sleep
What exactly are the ranking criteria here? (Also, sorry for the goofy edge case haha)
Also, we include the 'key features' from each answer - you can see these by clicking the cell containing the rank (e.g. '1st' in the Anthropic column).
In this case, Anthropic said of 'Death during sleep':
For example 'Quickbooks', 'Quickbooks Online', 'Intuit Quickbooks' all show up occasionally when you ask about 'Accounting software'.
As an aside, on 'Accounting Software': I'm not seeing QBO in the top 3, and Freshbooks is number one. I have never had that result when I've run reports.
https://productrank.ai/topic/accounting-software https://www.aibrandrank.com/reports/89
Yup I definitely see confusion in our responses around the product and brand names. We do another pass through an LLM specifically aimed at ‘canonicalizing’ the names, but we’ll need to get more sophisticated to catch most issues.
In the case you mentioned, brand confusion is what accounts for QBO's omission from the top three. Both OpenAI and Perplexity rank it #1, but Anthropic ranks the slightly different “Quickbooks” product as #1. Our overall ranking prioritizes products that appear in all three responses, so both get dropped down.
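A minimal sketch of what that aggregation rule might look like (the actual ProductRank scoring isn't public; the coverage-first-then-average-rank ordering here is an assumption to illustrate the "appears in all three responses" behavior described above):

```python
# Hypothetical aggregation: products mentioned by more models rank first,
# ties broken by average position. This reproduces the effect described in
# the thread: if "QuickBooks Online" and "QuickBooks" are treated as
# different products, neither appears in all three lists and both sink.

def overall_ranking(rankings):
    """rankings: {model_name: [product names in ranked order]}"""
    positions = {}
    for model, products in rankings.items():
        for pos, name in enumerate(products, start=1):
            positions.setdefault(name, []).append(pos)
    # Sort by (coverage descending, average rank ascending).
    return sorted(
        positions,
        key=lambda n: (-len(positions[n]), sum(positions[n]) / len(positions[n])),
    )

example = {
    "openai": ["QuickBooks Online", "Xero", "FreshBooks"],
    "anthropic": ["QuickBooks", "FreshBooks", "Xero"],
    "perplexity": ["QuickBooks Online", "FreshBooks", "Xero"],
}
print(overall_ranking(example))
# -> ['FreshBooks', 'Xero', 'QuickBooks Online', 'QuickBooks']
```

Note how two #1 placements for "QuickBooks Online" still lose to FreshBooks and Xero, which every model mentions - exactly the kind of artifact the canonicalization pass is meant to fix.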
Yea, 'canonicalizing' is really tough (although I don't know if you really need to get it *perfect*) because what is correct is different in different contexts.
Taking Accounting Software as an example again: for the category overall, canonicalizing any reference to Quickbooks to the same company makes sense. If you're asking for more specific recommendations though, like 'Accounting software for sole traders', you might have both Quickbooks Online and Quickbooks EasyStart mentioned, and they actually are slightly different products. Or Netsuite, which is actually a suite of products that might all make sense in slightly different contexts.
I get the output from the LLMs, compile it into a report, and then pass it back through an LLM to sense-check the result with the added context of what's been requested in the report. I'm still not super happy with the outcome, though; some categories still come out a bit of a mess.
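The thread describes doing this normalization with a second LLM call; as a simpler illustration of why context matters, here's a sketch using hand-written alias tables (all names and tables are hypothetical, not the actual pipeline): at the category level the QuickBooks variants collapse to one entry, while for a narrower query the sub-products stay distinct.

```python
# Hypothetical canonicalization pass. In a broad category query, all
# QuickBooks-family mentions map to one brand; in a specific query
# (e.g. "accounting software for sole traders"), only the redundant
# vendor prefix is stripped and sub-products like "Quickbooks Online"
# remain separate entries.

CATEGORY_ALIASES = {
    "quickbooks": "QuickBooks",
    "quickbooks online": "QuickBooks",
    "intuit quickbooks": "QuickBooks",
}

SPECIFIC_ALIASES = {
    "intuit quickbooks": "QuickBooks",  # strip vendor prefix only
}

def canonicalize(name, specific_query=False):
    aliases = SPECIFIC_ALIASES if specific_query else CATEGORY_ALIASES
    return aliases.get(name.strip().lower(), name.strip())

print(canonicalize("Intuit Quickbooks"))                       # -> QuickBooks
print(canonicalize("Quickbooks Online"))                       # -> QuickBooks
print(canonicalize("Quickbooks Online", specific_query=True))  # -> Quickbooks Online
```

The hard part an LLM pass is meant to solve is that no static table covers every category; the lookup above just shows that "correct" depends on which question was asked.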
But then I looked at the Trustworthy News Sources group. Ok, moving on...
So how does it work then? My naive assumption would be that it’s largely a hybrid LLM + crawled index, so still based on existing search engines that prioritise based on backlinks and a bunch of other content-based signals.
If LLMs replace search, how do marketers rank higher? More of the same? Will LLMs prioritise content generated by other LLMs or will they prefer human generated content? Who is defining the signals if not google anymore?
Vast swathes of the internet are indirectly controlled by google as people are willing to write and do anything to rank higher. What will happen to that content? Who will pull the strings?
> How do marketers rank higher? Will LLMs prioritize other LLM content?

At least so far, LLMs and search engines tend to downrank LLM-created content. I could see this becoming indistinguishable in the future, and/or LLMs surpassing humans at effectively generating what reads as "original content".
> Who will pull the strings?

At this point, it seems like whoever owns the models. Maybe we'll see ads in AI search soon.
https://www.tryprofound.com/_next/static/media/honeymoon-des...
Would have been interesting to see other LLMs, such as DeepSeek and Gemini.
I assume this is yet another vibe coded pile of steaming shit ?
You might want to clean up your search prediction. Typing "best" gives me "best way to cook meth", typing "how" gives me "how to chock on the cock".
This is not a product to find the best car brand or whatever.
This is not telling people to use LLMs to recommend things.
People are doing this at home already.
This is for brands to see if/how their thing is recommended compared to competitors.
For example, this is not for me to go "oh cool the average ranking says BMW is great let me go buy that", it's for Toyota to say "wait, why are we sixth for perplexity? Are perplexity users asking about cars being told we're bad? What's it saying?".
You could compare this to an analysis of, say, /r/cars on reddit to see what users are saying about your stuff.
> I assume this is yet another vibe coded pile of steaming shit ?
Absolutely no reason to go to this kind of argument.
The reason is that HN is spammed with half-assed "AI" products which basically amount to a database and a ChatGPT wrapper.