> Scaling the onboarding of new Wikipedia volunteers with guided mentorship.
Depending on what's meant by "mentorship": this is the bit I'm most keen to see. Much of the criticism lobbed at Wikipedia nowadays seems to come from how much of a pain it is to contribute (which perhaps leads to second-order criticisms about censorship, even if that's not the intention).
It's often been the case that the person most worthy to speak on a subject is not the most worthy steward of a Wikipedia page on it. Any attempt to make those two people one and the same seems welcome.
If this "mentorship" takes the form of UI hints, I think they'd go a long way. Having volunteers take an AI course, on the other hand, might be useful but might also be a complete waste of time.
gwbas1c · 3h ago
I was afraid that mentorship was the point where human-to-human interaction is needed most.
I suspect what you're getting at is that "mentorship" is really code for using AI to step in when people are making the wrong kind of changes to a Wikipedia page (i.e., introducing bias, promoting products, edit wars, etc.).
I'm curious to see how this plays out.
niam · 3h ago
Yeah that's exactly it.
I wonder too if it could be used to help the edit reviewing process, but I can imagine it runs a risk of becoming an accountability sink[^1] if reviewers can merely defer their judgement to what the bot says is OK to purge. It might have a chilling effect on edits if everyone, including procedure hawks, can rely on it in that way. I'm not enough of a contributor there to know.
[^1]: https://news.ycombinator.com/item?id=41891694
Wikipedia already has thousands of bots running around cleaning things up - seems like AI is not really a significant change here.
add-sub-mul-div · 3h ago
Automation is not AI and pre-LLM-era AI is not LLM AI. The arguments for why LLM AI will broadly worsen things are well known. You may not agree with them, but saying things won't change much is pretty empty.
animanoir · 3h ago
why would it make things worse?
cube00 · 3h ago
Unleashing a hallucinating LLM to make edits will create so many subtle problems, at such a scale, that it may not be possible to clean them up once other edits are made on top.
kelvinjps10 · 3h ago
I hope they don't use AI to automatically translate articles to other languages; the quality of the other Wikipedias would drop a lot.
Scoundreller · 3h ago
Not saying they should, but as a PSA, if you’re anglophone and looking something up about a place in the non-Anglo world, it’s always worth switching to that area’s language and auto-translating. Even the images won’t be shared. Structure might be all different. Perspective may not be the same.
E.g. a city in France’s BRT system:
Contrast: https://en.m.wikipedia.org/wiki/Nantes_Busway
With: https://fr.m.wikipedia.org/wiki/Busway_de_Nantes
Some random examples where the Wikipedia page in the native language is much more detailed:
https://en.wikipedia.org/wiki/Trams_in_Florence vs https://it.wikipedia.org/wiki/Rete_tranviaria_di_Firenze (and an entire second article for the historic system!)
https://en.wikipedia.org/wiki/La_D%C3%A9fense vs https://fr.wikipedia.org/wiki/La_Défense
https://en.wikipedia.org/wiki/Stasi vs https://de.wikipedia.org/wiki/Ministerium_f%C3%BCr_Staatssic...
https://en.wikipedia.org/wiki/Muslim_conquest_of_the_Iberian... vs https://es.wikipedia.org/wiki/Conquista_omeya_de_Hispania ... but https://ar.wikipedia.org/wiki/%D8%A7%D9%84%D9%81%D8%AA%D8%AD... is better than both of them
Does anyone else have good examples? Or are there counter-examples, where enwiki has a better article than the native language wiki?
bluGill · 3h ago
It would be fine as a translation option, for cases when the page is not found or the local-language article is lacking, so you want to translate from a different language.
It isn't a replacement for an expert writing the real page in the language, though. But there are a lot of languages with very little content. German has less than half as many articles as English (Cebuano is number 2 by number of articles, but that seems fishy; German is number 3). At the bottom, Cree has only 13 articles. There are also a small number of inactive Wikipedias, some of which may want a translation option as better than nothing.
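As a rough illustration of that "better than nothing" fallback, here is a minimal TypeScript sketch. The MediaWiki query endpoints used are real; `machineTranslate` is a hypothetical stand-in for whatever translation backend would be used, and its output would still need expert review before becoming a real page.

```typescript
// Sketch: borrow an article from another language's wiki when the local
// wiki has nothing, then machine-translate. Advisory only, per the thread.

async function fallbackArticle(
  title: string,
  wantLang: string,   // e.g. "cr" (Cree), where coverage is thin
  sourceLang = "en",  // language to borrow from when wantLang has nothing
): Promise<string | null> {
  // Does the source wiki record an interlanguage link to wantLang?
  // If so, a native article exists and should be preferred over MT.
  const llUrl =
    `https://${sourceLang}.wikipedia.org/w/api.php?action=query&prop=langlinks` +
    `&lllang=${wantLang}&titles=${encodeURIComponent(title)}` +
    `&format=json&formatversion=2&origin=*`;
  const llData = await (await fetch(llUrl)).json();
  if (llData.query?.pages?.[0]?.langlinks?.length) return null;

  // No native article: pull a plain-text extract of the source article.
  const exUrl =
    `https://${sourceLang}.wikipedia.org/w/api.php?action=query&prop=extracts` +
    `&explaintext=1&titles=${encodeURIComponent(title)}` +
    `&format=json&formatversion=2&origin=*`;
  const extract = (await (await fetch(exUrl)).json()).query?.pages?.[0]?.extract;
  if (!extract) return null;

  return machineTranslate(extract, sourceLang, wantLang);
}

// Hypothetical MT hook; the result should be clearly labeled as unreviewed.
declare function machineTranslate(text: string, from: string, to: string): string;
```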
thadt · 3h ago
Agreed. The bullet point "Helping editors share local perspectives or context by automating the translation and adaptation of common topics" could be unfortunate if not carefully reviewed - and hopefully it will be.
Having recently spent some time doing translations with the current models, my experience is that their output quality can have a rather high variance. That, along with a tendency toward hallucination, often makes them a "fuzzy guess" at best. When I'm the user asking for that translation, I can factor that fuzziness into my understanding of the output. If it's already translated for me, I'm depending on someone else to have done the work of making sure it is faithful to the original.
Aardwolf · 3h ago
I actually like that some Wikipedia articles are smaller in my mother language than in English; they get more straight to the point. So indeed, replacing these with auto-translations from English would be a loss.
nextaccountic · 3h ago
What if, rather than replacing, it were provided as an option?
_fat_santa · 3h ago
Man, these AI announcements really highlight CEO groupthink. Though I have to commend Wikimedia here: compared to the Shopify announcement, this one is much more sane and down to earth.
tim333 · 3h ago
>Scaling the onboarding of new Wikipedia volunteers
could be pretty helpful. I edit a bit and it's confusing in a number of ways, some of which I still haven't got the hang of. There's very little "thank you for your contribution, but it needs a better source - why not this?"; usually it's just your work reverted without thanks or much explanation.
westurner · 2h ago
Could AI sift through removals and score them as biased or as vandalism?
And then what to do about "original research" that should've been moved to a different or better platform (also with community review) instead of being deleted?
Wikipedia:No_original_research: https://en.wikipedia.org/wiki/Wikipedia:No_original_research#Using_sources
I'm guessing it could advise about that even if it didn't make decisions.
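A rough sketch of what "advise without deciding" could look like, assuming the real MediaWiki recentchanges and compare endpoints: `scoreRemoval` is a hypothetical LLM call, and its scores only flag edits for a human rather than triggering any action.

```typescript
// Sketch: pull recent edits via the MediaWiki API, score the diffs with a
// model, and surface suspicious ones to a human reviewer. Thresholds and
// the scoring function itself are placeholders, not an existing tool.

interface RemovalScore {
  bias: number;      // 0..1, model's guess that the edit introduces bias
  vandalism: number; // 0..1, model's guess that the edit is vandalism
  rationale: string; // shown to the human reviewer, never auto-applied
}

// Hypothetical LLM wrapper; advisory output only.
declare function scoreRemoval(diffHtml: string, editSummary: string): Promise<RemovalScore>;

async function adviseOnRecentEdits(limit = 10): Promise<void> {
  const rcUrl =
    "https://en.wikipedia.org/w/api.php?action=query&list=recentchanges" +
    `&rctype=edit&rcprop=ids|title|comment&rclimit=${limit}` +
    "&format=json&formatversion=2&origin=*";
  const rc = (await (await fetch(rcUrl)).json()).query.recentchanges;

  for (const change of rc) {
    // The compare endpoint returns an HTML diff between two revisions.
    const diffUrl =
      "https://en.wikipedia.org/w/api.php?action=compare" +
      `&fromrev=${change.old_revid}&torev=${change.revid}` +
      "&format=json&formatversion=2&origin=*";
    const diff = (await (await fetch(diffUrl)).json()).compare?.body ?? "";

    const score = await scoreRemoval(diff, change.comment ?? "");
    if (score.bias > 0.8 || score.vandalism > 0.8) {
      console.log(`[review?] ${change.title}: ${score.rationale}`);
    }
  }
}
```

Keeping the model's role to flagging, with the rationale attached, is one way to avoid the "accountability sink" problem raised earlier in the thread.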
film42 · 3h ago
I'm guessing any useful application of AI has already been adopted by some volunteers. Wikipedia might be able to build tools around the parts that work well, but the volunteers will end up footing the bill for the AI spend. Wikipedia will probably pivot to building an AI research product which they can sell to universities/B2C.
tempfile · 3h ago
> Wikipedia will probably pivot to building an AI research product which they can sell to universities/ b2c.
Why would they do this? All of Wikipedia is publicly available for any use. They literally do not have a competitive advantage (and don't seem interested in one, either).
film42 · 3h ago
Exactly. But using AI to summarize articles, stitch them together, etc. under the Wikipedia brand as a product is something they could easily sell. I can totally see a university buying WikiResearch™ for every student.
some_furry · 2h ago
I don't anticipate them selling anything, ever.
notorandit · 3h ago
Especially because they are free.
Havoc · 3h ago
I could see it being a good thing, with very careful guardrails and conservative use.
hexator · 3h ago
This is a great way of applying AI. I wish more companies followed suit.
nailer · 3h ago
As a moderator, I wish StackOverflow would. I'm getting tired of manually marking answers that should be comments; I'm about ready to make a browser extension for it.
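For flavor, a rough userscript-style sketch of what such an extension might do. The DOM selectors and the heuristic are assumptions, not Stack Overflow's actual markup or policy, and a real tool would defer to the moderator's judgment.

```typescript
// Sketch: highlight short, comment-smelling answers for a human moderator.
// Nothing here flags or deletes automatically.

const COMMENT_SMELLS = [
  /^thanks?\b/i,
  /^me too\b/i,
  /did you (ever )?(find|solve|fix)/i,
  /^same (problem|issue)/i,
];

function flagLikelyComments(): void {
  // ".answercell .js-post-body" is a guess at the answer-body selector.
  document.querySelectorAll<HTMLElement>(".answercell .js-post-body").forEach((el) => {
    const text = el.innerText.trim();
    const short = text.length < 200;
    const smells = COMMENT_SMELLS.some((re) => re.test(text));
    if (short && smells) {
      el.style.outline = "2px dashed orange"; // visual cue for the human mod
      el.title = "Possibly an answer that should be a comment";
    }
  });
}

flagLikelyComments();
```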
some_furry · 4h ago
I appreciate that they emphasize the importance of their human volunteers. Too many bullish-on-AI folks have misanthropic incentives, and it's refreshing to not see more of that. (It probably helps that Wikimedia is a non-profit.)
scudsworth · 3h ago
- Supporting Wikipedia’s moderators and patrollers with AI-assisted workflows that automate tedious tasks in support of knowledge integrity;
so, a chatbot on top of some tooling probably
- Giving Wikipedia’s editors time back by improving the discoverability of information on Wikipedia to leave more time for human deliberation, judgment, and consensus building;
extremely vague, but probably the "train a specialized ai on our own corpus to answer questions about it" style bot helper? these make stuff up left and right too
- Helping editors share local perspectives or context by automating the translation and adaptation of common topics;
automated translations would be a big step in the wrong direction
- Scaling the onboarding of new Wikipedia volunteers with guided mentorship.
you can't do "mentorship" with ai in any real way. all in all seems a box checking exercise for their ML officer.
NotAnOtter · 3h ago
Of course not.
They would never do that.
Industry has a good track record of this.
niam · 3h ago
Maybe you have a reason to be so uncharitable here, but it's unclear what that reason is since "industry" is a broad term.
ChrisArchitect · 3h ago
Title is: Our new AI strategy puts Wikipedia’s humans first
rvz · 3h ago
Narrator: They (eventually) will.
Just look for the keyword "streamlining" in any sentence and they will.
the_af · 4h ago
The tasks for which they are planning to use AI make sense. These are good use cases.
What must be avoided is Wikipedia becoming a repository of AI-generated slop (and possibly feeding the next generation of models, becoming a recursive loop of even more slop).
But this way? It makes sense. This won't create content for articles, it's just assistance for editors.
photochemsyn · 3h ago
Suggestion: use LLMs coupled to Playwright/Puppeteer etc. to detect broken links to supporting references, and also use LLMs to analyze whether the supporting reference really does back up the claim being made in the Wikipedia article.
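A minimal sketch of that pipeline, assuming Playwright on Node: the Playwright calls are real API, while `supportsClaim` is a hypothetical LLM wrapper whose verdict should stay advisory, given the hallucination risk discussed above.

```typescript
// Sketch: Playwright checks whether a citation URL still resolves, and a
// model judges whether the fetched page supports the claimed statement.

import { chromium } from "playwright";

// Hypothetical LLM wrapper; verdicts are advisory, not authoritative.
declare function supportsClaim(claim: string, pageText: string): Promise<boolean>;

async function checkCitation(claim: string, refUrl: string): Promise<string> {
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    const response = await page.goto(refUrl, { timeout: 15_000 });

    if (!response || response.status() >= 400) {
      return "broken-link"; // dead reference: flag for {{dead link}} review
    }

    // Paywalled or consent-walled pages will yield junk text here; a real
    // tool would need to detect that case rather than judge on the stub.
    const text = await page.innerText("body");
    return (await supportsClaim(claim, text)) ? "supported" : "unverified";
  } finally {
    await browser.close();
  }
}
```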
This raises the thorny issue of what constitutes an 'authoritative reference'. Given that many if not most such references are hidden behind various paywalls, and that the editors and volunteers are anonymous actors with unknown special interests and biases, the conclusion is that Wikipedia is not a reliable source of information, and a Wikipedia citation is essentially useless and should never be allowed in any reputable publication.
card_zero · 3h ago
It's been said since the beginning, by Wikipedia, that the idea is not to cite the page itself.
https://en.wikipedia.org/wiki/Wikipedia:Citing_Wikipedia
> Normal academic usage of Wikipedia is for getting the general facts of a problem and to gather keywords, references and bibliographical pointers, but not as a source in itself.
There is always an issue of a claim not being in the source. Often this is a matter of perception and has to be decided by an argument about what the source means, or what a conventional reading of the source is.