Wikipedia WikiProject AI Cleanup

Maggie Harrison Dupre, Wikipedia Declares War on AI Slop, The Byte, October 2024

AI slop threatens to degrade the usability of Wikipedia, and its editors are fighting back.

As 404 Media reports, a team of Wikipedia editors has assembled to create “WikiProject AI Cleanup,” which describes itself as “a collaboration to combat the increasing problem of unsourced, poorly-written AI-generated content on Wikipedia.”

The group is clear that it doesn’t wish to ban responsible AI use outright. Instead, it seeks to eradicate badly sourced, hallucination-filled, or otherwise unhelpful AI content that erodes the overall quality of the web’s decades-old information repository.

“The purpose of this project is not to restrict or ban the use of AI in articles,” the battle-ready cohort’s Wikipedia forum reads, “but to verify that its output is acceptable and constructive, and to fix or remove it otherwise.”

Slop Spectrum

In some cases, the editors told 404, AI misuse is obvious. One clear sign is users of AI tools leaving well-known chatbot auto-responses behind in Wikipedia entries, such as paragraphs starting with “as an AI language model, I…” or “as of my last knowledge update.” The editors also say they’ve learned to recognize certain prose patterns and “catchphrases,” which has allowed them to spot and neutralize sloppy AI text.
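To make the "obvious" end of that spectrum concrete, here is a minimal sketch of a phrase check, assuming nothing more than a plain substring match. It is not the editors' actual tooling, which the report doesn't describe. The function name and the phrase list are hypothetical, and the list contains only the two leftovers quoted above.

```python
import re

# Hypothetical phrase list: only the two chatbot leftovers quoted in the
# article. A real list would be longer and maintained by the editors.
TELLTALE_PHRASES = (
    "as an ai language model",
    "as of my last knowledge update",
)


def find_telltale_phrases(text: str) -> list[str]:
    """Return the telltale phrases that occur in text, ignoring case."""
    # Collapse runs of whitespace so a line break inside a phrase
    # does not hide it, then lowercase for a case-insensitive match.
    normalized = re.sub(r"\s+", " ", text).lower()
    return [phrase for phrase in TELLTALE_PHRASES if phrase in normalized]


if __name__ == "__main__":
    sample = "As an AI language model, I cannot verify this claim."
    print(find_telltale_phrases(sample))
```

A check like this only catches text pasted in without a second look. It says nothing about whether the surrounding claims are true.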

“A few of us had noticed the prevalence of unnatural writing that showed clear signs of being AI-generated, and we managed to replicate similar ‘styles’ using ChatGPT,” WikiProject AI Cleanup founding member Ilyas Lebleu told 404, adding that “discovering some common AI catchphrases allowed us to quickly spot some of the most egregious examples of generated articles.”

Still, a lot of poor-quality AI content is tough to spot, especially when it comes to confident-sounding errors hidden in complex material.

One example flagged to 404 by editors was an impressively crafted history of a “timbery” Ottoman fortress that never actually existed. While it was simply wrong, the text itself was passable enough that unless you happen to specialize in 13th-century Ottoman architecture, you likely wouldn’t have caught the error.

As we previously reported, Wikipedia editors have in some cases chosen to demote the reliability of certain news sites like CNET — which we caught publishing error-laden AI articles last year — as a direct result of AI misuse.

Because sloppy AI content is incredibly cheap to mass-produce, limiting it is often difficult. Add the fact that Wikipedia is, and has always been, a crowdsourced, volunteer-driven internet project, and fighting the tide of AI sludge gets that much harder.
