Meta as a company always acts like the most unprofessional idiots when they feel under pressure. They have all the time and resources to do it right, and they never do.
emot · 1d ago
Isn't this the crawler that generates link previews on Facebook? They have others for training and AI stuff. Oh well, one never knows with Meta...
——
The facebookexternalhit/1.1 user agent you're seeing in the logs is a Facebook crawler, specifically used by Facebook’s servers to fetch content (like Open Graph metadata) when:
Someone shares a link on Facebook or Messenger
Facebook needs to generate a preview (title, image, description) for that URL
N19PEDL2 · 1d ago
Genuine question: how do they know the bot is from Facebook, apart from what's written in the user agent?
extraduder_ire · 1d ago
They're cropped off to the side, but I assume the IPs making those requests are in a block owned by Facebook.
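For anyone who wants to check that themselves, here is a rough sketch of testing a log IP against a list of network blocks. The prefixes in `ASSUMED_META_NETWORKS` are an illustrative sample I'm assuming for the example, not an authoritative list; a real check should pull the current prefixes from a routing registry or from Meta's own crawler documentation. The function name is made up too.

```python
import ipaddress

# ASSUMPTION: illustrative sample prefixes only. Replace with a freshly
# fetched, authoritative list before relying on the result.
ASSUMED_META_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "66.220.144.0/20",
        "69.171.224.0/19",
        "173.252.64.0/18",
        "2a03:2880::/32",
    )
]


def is_in_assumed_meta_range(ip: str) -> bool:
    """Return True if `ip` falls inside one of the assumed prefixes."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Not a parseable IPv4/IPv6 address (e.g. a malformed log field).
        return False
    # Membership across IP versions is simply False, so mixing v4 and v6
    # networks in one list is safe.
    return any(addr in net for net in ASSUMED_META_NETWORKS)


if __name__ == "__main__":
    for candidate in ("173.252.64.1", "192.0.2.10", "2a03:2880::1"):
        print(candidate, is_in_assumed_meta_range(candidate))
```

A user agent string can be set to anything by any client, so a source-address check like this is the part that is actually hard to fake.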
jedisct1 · 1d ago
Interestingly, they rotate user agents.
A few days ago they were identifying as “FaceBot” (not "FacebookBot").
When that began to be blocked, they switched to reusing the “Facebookexternalhit” user agent they also use for redirects; one people are less likely to block.
dmitrygr · 1d ago
That’s not interesting. That’s borderline fraud. Once someone gets a huge bill they’ll have standing to sue, and should.
philipallstar · 1d ago
What's "borderline fraud"? Not fraud?
bediger4000 · 1d ago
Would a 404 or a 403 be more appropriate? What if you just want Meta crawlers to go away forever?
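On the status code: 403 says the server understood the request and refuses it, while 404 says there is nothing at that URL, so 403 is arguably the more honest answer for a deliberate block. Below is a minimal sketch of doing that as WSGI middleware, matching on the two user agent strings mentioned in this thread. The names `BLOCKED_UA_MARKERS`, `is_blocked_agent` and `block_meta_crawlers` are made up for the example, and whether a crawler actually stops after seeing 403s is not something this code can guarantee.

```python
# ASSUMPTION: substring markers taken from the user agents named in this
# thread ("facebookexternalhit", "FaceBot"); adjust to what your logs show.
BLOCKED_UA_MARKERS = ("facebookexternalhit", "facebot")


def is_blocked_agent(user_agent):
    """Case-insensitive substring match against the blocked markers."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in BLOCKED_UA_MARKERS)


def block_meta_crawlers(app):
    """Wrap a WSGI app so matching user agents get a 403 response."""

    def middleware(environ, start_response):
        if is_blocked_agent(environ.get("HTTP_USER_AGENT")):
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Everything else is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware
```

Wrapping an existing app is then `application = block_meta_crawlers(application)`. Since the user agent reportedly changes, matching on it alone is only a partial measure.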
https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/...