Internet Search Is Not a Naive Information Retrieval Problem

7 deontology 2 5/17/2025, 3:10:03 AM gojiberries.io ↗

Comments (2)

anenefan · 19m ago
The last paragraph was the best lulz I've had all week -

>Real search engines don't primarily compete on finding relevant documents. They compete on resisting manipulation. The moment Google's algorithm became valuable, an entire industry emerged dedicated to gaming it. Every ranking factor becomes a target for optimization, spam, and abuse. Search engines spend enormous resources not just on relevance, but on detecting artificial link schemes, content farms, cloaked pages, and sophisticated manipulation tactics that evolve daily.

This certainly differed considerably from my reality as it ebbed towards the mid-2010s. Google back then was happy enough to provide 100 results per page, and I typically would hunt through around 10 pages of results when expanding each keyword query set to hunt down what a user wanted. For each angle of looking for the needle, the initial keyword query generally needed to be modified a number of times to trim away the should-be-easy-to-identify-as-BS sites, which Google seemed totally unable to filter out and which actually crowded out real results. No, I'd say that when I last used Google earnestly, it was all about generating revenue from clicks, but not in an entirely obvious manner.

A site getting Google's attention is probably even more critical now - it's been a long time since I've seen more than 10 pages of results from Google for a particular keyword query, and it's only willing to serve me 10 results per page, so fewer than 100 results in total is normal now - scary that 10 years ago, in a much smaller web, a great multitude of results from Google were available.

patrickhogan1 · 3h ago
I agree with you theoretically. But this is a situation where LLMs are far better at surfacing relevant results than Google. Perhaps due to perverse incentives. Google might fight spam but seems to have started losing that battle a few years ago when it optimized for search quantity over quality.