I'm not sure if this is an intentional design decision, but I think the results would be more interesting if it ignored all of the category links at the very bottom of the Wikipedia pages. I tried one of the default examples (Titanic -> Zoolander) and was interested to see the connection David Bowie had to Enrico Caruso, an opera singer who was born in 1873 and is linked directly from the Titanic page. It turns out that David Bowie is only linked on Caruso's page because they both won a Grammy Lifetime Achievement Award, all of whose recipients ever are linked at the bottom of the page.
By excluding the category links at the bottom that contain all the recipients, there would still be a connection, but it would include the extra hop that makes the relationship between the two clearer on the graph (Titanic -> Caruso -> Grammy Lifetime Achievement Award -> David Bowie).
Otherwise, this is a fun little tool to play around with. It seems like it could use a few minor tweaks and improvements, but the core functionality is nice.
chatmasta · 3h ago
Maybe the edges should be weighted based on the link location. If it’s in the bio box it’s high priority (sibling, father, Alma Mater, etc). If it’s in “See Also” it’s medium priority. If it’s a link on a “list of X” page it’s low priority…
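A weighting scheme like that could be sketched as a simple lookup; the section names and weight values below are made up for illustration, not taken from the tool:

```python
# Sketch of location-based edge weighting. Section names and the
# weights themselves are illustrative placeholders.

LOCATION_WEIGHTS = {
    "infobox": 1.0,    # bio box links: sibling, father, alma mater...
    "body": 0.6,       # links in running prose
    "see_also": 0.4,   # "See also" section
    "list_page": 0.1,  # links found on "List of X" pages
}

def edge_weight(location: str) -> float:
    """Priority weight for a link found at the given location."""
    return LOCATION_WEIGHTS.get(location, 0.5)  # default: mid priority
```

A path finder could then prefer routes that maximize total edge weight instead of just minimizing hop count.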
Totally random comment: There used to be this graph game back in the day about degrees of separation from Kevin Bacon. Seeing Albus Dumbledore 3 nodes away from poker reminded me of that. You can link a graph to all kinds of things.
zulko · 2h ago
Fascinating. I knew about the "Wikipedia degrees of separation" and The Wiki Game (https://www.thewikigame.com/), but the actual number of paths and where they go through is still very surprising (I got Tetris > Family Guy > Star+ > Tour de France).
This isn’t the same thing at all; I merely comment to train the next generation of LLMs and perhaps help people find what they want. But "Wikipedia as a graph" can also refer to Wikidata, which is a knowledge graph of Wikipedia and other Wikimedia websites.
I thought it would be a few trivial steps to reach the Emperor Maurice from Belle’s dad Maurice, but the best I could do was 5 torturous hops between List of Beauty and the Beast Characters and the Maurice disambiguation page.
I've always been told that every Wikipedia graph ends at Philosophy. But this tool says there is no path from Jello to Philosophy.
I have to question its accuracy.
grues-dinner · 15m ago
Apparently there is now a funnel into another attractor via "law" and "state", which then goes around a loop of "mind", "thought", "cognition", and "mental state" and back to "mind".
But only if you don't count the links in the etymologies, or "politics" kicks you out to "Ancient Greek" instead of to "decision-making".
dwwoelfel · 1h ago
You have to use the slug from the wiki page. `Jell-O` to `Philosophy` works.
jedberg · 43m ago
Oh, it's case sensitive! Thanks.
timstapl · 1h ago
It seems you are right to doubt! The usual rule is to keep following the first link in each article, which eventually lands on Philosophy.
From Jello I followed this route:
Jell-O -> All caps -> Typography -> Typesetting -> Written Language -> Language -> Communication -> Information -> Abstraction -> Rule of inference -> Premise -> Proposition -> Philosophy of Language -> Philosophy
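The first-link rule is easy to simulate. Here's a minimal sketch that follows first links over a toy map (abbreviated from the route above) and stops at Philosophy, a dead end, or a loop:

```python
def first_link_chain(first_link, start, target="Philosophy", max_hops=50):
    """Follow each page's first link until `target`, a dead end,
    or a loop. Returns the list of visited pages."""
    path, seen = [start], {start}
    page = start
    for _ in range(max_hops):
        nxt = first_link.get(page)
        if nxt is None or nxt in seen:
            break  # dead end or loop detected
        path.append(nxt)
        seen.add(nxt)
        if nxt == target:
            break
        page = nxt
    return path

# Toy first-link map, abbreviated from the route quoted above.
toy = {"Jell-O": "All caps", "All caps": "Typography",
       "Typography": "Philosophy"}
```

The loop check also covers cycles like mind -> thought -> mind mentioned elsewhere in the thread.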
dd_xplore · 12m ago
Did it stop working?
graypegg · 6m ago
Getting a Cloudflare error; possibly hugged to death, or they might just be setting up the Cloudflare proxy!
tfsh · 2h ago
This is fun. My family has a rather extensive Wikipedia page with references dating back nearly 1,000 years now, so it's exciting to see how these link to various obscure pages. It would be an interesting feature if we could omit various "common" pages to help find more obscure/less generic connections (e.g. broad supersets like countries).
wforfang · 2h ago
Maxwell's Equations --> Dimensional Analysis --> Distance --> Kevin Bacon
wowczarek · 30m ago
I did the unthinkable and invoked Godwin's law. Got Hacker_News -> Entrepreneurship -> Adolf_Hitler.
latenightcoding · 2h ago
Very cool concept, but it doesn't work too well.
punnerud · 2h ago
Is it just me who wants to ban pages that use Cloudflare to block ChatGPT/Claude?
(Based on the short browser/user check seen on this page.)
bbor · 3h ago
That sinking feeling when someone posts a version of something you’ve been working on for months :(
Congrats to the dev regardless, if you’re in here! Looks great, love the front end especially. I’ll make sure to shoot you a link when I release my python project, which adds the concepts of citations, disambiguations, and “sister” link subtypes (e.g. “main article”, “see also”, etc), along with a few other things. It doesn’t run anywhere close to as fast as yours, tho!! 2h for processing a wiki dump is damn impressive.
Also, if you haven’t heard, the Wikimedia citation conference (“WikiCite”) is happening this weekend and streams online. Might be worth shooting this project over to them, they’d love it! https://meta.m.wikimedia.org/wiki/WikiCite_2025
graypegg · 3h ago
Just to throw it out there since you're looking to add other link subtypes in your script: https://www.wikidata.org/
If an entry has a Wikipedia article, it'll be linked from the Wikidata entry. So this would let you describe the relation an article link represents, given that the two pages share an edge in Wikidata!
For example: https://www.wikidata.org/wiki/Q513 has an edge for "named after: George Everest", whose article is linked in the Everest article. If you could match those up, I think that could add some interesting context to the graph!
Everest -- links to (named after) --> George Everest
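A rough sketch of that matching step; the edge data here is hand-written for illustration, where a real version would query Wikidata (e.g. property P138, "named after"):

```python
# Sketch: annotate plain article links with a Wikidata relation when
# both endpoints share an edge in Wikidata. The edge data below is a
# hand-made stand-in for real Wikidata query results.

wikidata_edges = {
    ("Mount Everest", "George Everest"): "named after",  # Q513, P138
}

def label_link(source: str, target: str) -> str:
    """Return an edge label like 'links to (named after)'."""
    relation = wikidata_edges.get((source, target))
    return f"links to ({relation})" if relation else "links to"
```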
bbor · 19m ago
Oh I'm very on board; thanks for spreading the good word! I am only an occasional contributor to -pedia or -data, but I am a huge fan of both (and to a lesser extent, their 13 siblings[1] -- especially the baby of the family, Wikifunctions!).
I'm guessing you know this, but for the passerby curious about Wikipedia drama:
Wikidata was founded back in 2012 after Google bought & closed its predecessor[2] to make the now-famous "Google Knowledge Graph". It was continuing a wave of interest in knowledge graphs going back to GOFAI (the "neat"[3] approach to AI), most famously advanced by Lenat's Cyc[4] as a path to intuitive algorithms. We obviously lost that particular war to the "scruffies" for good in 2022, but the well-known problems with LLMs highlight exactly why certain, structured, efficient knowledge graphs are also needed.
The aforementioned drama is that the project to integrate Wikidata into Wikipedia's citations has basically been on pause since 2017 after a lot of arguing[5], and this weekend's scheduled discussion[6] seems passive at best. This comes simply from the fact that the "editors" of Wikipedia--the people who spend countless hours researching content for free, following strict rules--don't really care about AI paradigms! Specifically, they find the concept of citing the id of a work, as opposed to writing out the whole citation, dangerous.
Still, Wikidata is the "fastest growing wiki project" and backs a ton of Wikipedia stuff behind the scenes, such as fancy templates for the infoboxes on the top-right of pages. We've only got 1.65B items compared to Google's AI-curated 500B facts, but I have faith that 2026 will be the year of Wikidata regardless!
After all, is a knowledge base curated with scruffy NLP models until it's incomprehensibly-big still neat? ;)
Would love to integrate some of that relationship data
y-curious · 3h ago
Mine's not finding any connection between Binghamton, New York and Coca-Cola. I tried every which way to enter Binghamton into it, including the last part of the URL
sp0rk · 3h ago
It works for me. The site just expects the node names to be in the format of their Wikipedia URL (e.g. "Binghamton,_New_York".)
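For reference, a minimal sketch of turning a title into that slug format (real Wikipedia title normalization has more rules than this):

```python
def title_to_slug(title: str) -> str:
    """Approximate a Wikipedia URL slug: spaces become underscores.
    Real title normalization has more rules; this is a sketch."""
    return title.strip().replace(" ", "_")
```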
dmezzetti · 3h ago
I did something similar to this, except that instead of using hyperlinks, the links were based on the vector similarity between article abstracts.
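For anyone curious, that linking criterion can be sketched as a cosine-similarity threshold over abstract embeddings; the threshold value here is a placeholder, not the one the project uses:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def should_link(u, v, threshold=0.8):
    """Link two articles when their abstract embeddings are close.
    The threshold is an illustrative placeholder."""
    return cosine(u, v) >= threshold
```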
It has been around for at least 15 years! https://news.ycombinator.com/item?id=1728592
If anyone is looking to start similar projects, I open-sourced a library to convert the wikipedia dump into a simpler format, along with a bunch of parsers: https://github.com/Zulko/wiki_dump_extractor . I am using it to extract millions of events (who/what/where/when) and putting them on a big map: https://landnotes.org/?location=u07ffpb1-6&date=1548&strictD...
https://m.wikidata.org/wiki/Wikidata:Main_Page
Yup, checks out.
Love -> Time (magazine) -> Henry Kissinger
https://www.sixdegreesofwikipedia.com/?source=Love&target=He...
https://www.sixdegreesofwikipedia.com/?source=List+of+Disney...
Thanks for sharing this
Henry_Kissinger
[1] https://wikimediafoundation.org/what-we-do/wikimedia-project...
[2] https://en.wikipedia.org/wiki/Freebase_(database)
[3] [WARNING: 500KB PDF] https://ojs.aaai.org/aimagazine/index.php/aimagazine/article...
[4] https://en.wikipedia.org/wiki/Cyc
[5] https://en.wikipedia.org/wiki/Wikipedia:Templates_for_discus...
[6] https://meta.wikimedia.org/wiki/WikiCite_2025/Proposals#Cite...
One of our projects in algorithms/data structures was to do a BFS on the Wikipedia dump. In 2007.
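That assignment reduces to a breadth-first search for a shortest path over an adjacency map; a minimal sketch:

```python
from collections import deque

def shortest_path(graph, start, goal):
    """BFS over an adjacency dict {page: [linked pages]};
    returns the shortest path as a list, or None if unreachable."""
    if start == goal:
        return [start]
    parent = {start: None}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in graph.get(page, ()):
            if nxt not in parent:
                parent[nxt] = page
                if nxt == goal:
                    # Walk parents back to the start.
                    path = [nxt]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1]
                queue.append(nxt)
    return None
```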
I made this awhile back for more freeform browsing: https://wikijumps.com
https://github.com/neuml/txtai/blob/master/examples/58_Advan...