One of the arbitrators, TheBainer, posted a request for statistical data on vandalism to the Sarah Palin article during that time. That struck me as a very interesting question, although I wasn't entirely comfortable with the idea of judging administrative actions according to statistical data that wasn't readily available to those administrators at the time they acted. Anybody could read the article history, but that was a very active article in the first days after the announcement of Palin's selection as the vice presidential nominee. It just wasn't practical to sort through that edit history manually with any sort of rigor while the vandalism problem continued to unfold. But this is unlikely to be the last time Wikipedia gets a burst of attention due to breaking news, so it seems to me we could write a tool to parse that information in real time in a way that's useful to administrators.
I've gotten in touch with a coder who has some very smart ideas, and what I'm looking for right now is someone with formal training in statistics. Basically the idea is this: create a tool that parses the recent history of actively edited articles and estimates what percentage of vandalism comes from autoconfirmed users. Automated analysis won't be perfect, so it would report two figures, each based on a different search technique:
- High figure: counts all reversions within a time frame.
- Low figure: counts only bot reversions, rollbacks, and edit summary notations such as "rvv".
From there, the tool would determine which editors were being reverted and report what percentage of them were autoconfirmed. So if 85% of the vandalism to an article is coming from non-autoconfirmed editors, then semiprotection is the obvious solution. The tool would only report on articles that have a certain baseline of recent activity, in order to screen out low-traffic articles where the report would be statistically meaningless.
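To make the two figures concrete, here is a rough sketch in Python of how the report might be computed. Nothing here is an existing tool: the `Revert` record, its field names, the `report` function, and the `min_edits` baseline of 50 are all assumptions made up for illustration, and the sketch assumes the reverts have already been extracted from the article history.

```python
import re
from dataclasses import dataclass


@dataclass
class Revert:
    """One reverted edit (hypothetical record; field names are assumptions)."""
    reverted_user: str
    reverted_user_autoconfirmed: bool
    summary: str = ""
    by_bot: bool = False
    is_rollback: bool = False


def is_clear_vandalism_revert(r):
    """Low-figure test: bot revert, rollback, or an 'rvv' edit summary."""
    return r.by_bot or r.is_rollback or bool(re.search(r"\brvv\b", r.summary.lower()))


def autoconfirmed_share(reverts):
    """Percentage of reverted edits made by autoconfirmed users, or None if empty."""
    if not reverts:
        return None
    hits = sum(1 for r in reverts if r.reverted_user_autoconfirmed)
    return 100.0 * hits / len(reverts)


def report(reverts, total_recent_edits, min_edits=50):
    """Return both figures, or None when the article is below the activity baseline."""
    if total_recent_edits < min_edits:
        return None  # too little activity for the numbers to mean anything
    low = [r for r in reverts if is_clear_vandalism_revert(r)]
    return {
        "high_pct_autoconfirmed": autoconfirmed_share(reverts),  # all reversions
        "low_pct_autoconfirmed": autoconfirmed_share(low),       # clear vandalism only
    }


sample = [
    Revert("A", False, summary="rvv"),
    Revert("B", False, by_bot=True),
    Revert("C", True, summary="disagree with wording"),
    Revert("D", False, is_rollback=True),
]
result = report(sample, total_recent_edits=100)
```

In this made-up sample the high figure counts all four reverts, while the low figure drops the content dispute with editor C. What baseline to use, and how much confidence the resulting percentages deserve, is exactly where a statistician's input would be needed.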
I'd love to bring in someone who has the skills in statistics to add rigor to the endeavor, so please get in touch if you have those skills or know someone who does.
A second idea (thanks to Xavexgoem) is for something we'd call a dramabot. Instead of crawling all recent changes for vandalism, dramabot would concentrate on articles that have gotten flurries of recent edits. Dramabot would scan those articles frequently and revert obvious vandalism until things calm down. If you're a coder who thinks dramabot would be a good idea, let's touch base.
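For what it's worth, the "flurry" detection step might be sketched like this in Python. The function name `hot_articles`, the 30-minute window, and the threshold of 20 edits are placeholders I've invented for illustration, and the sketch assumes the recent changes feed has already been reduced to (title, timestamp) pairs. The vandalism detection and reverting are not shown.

```python
from collections import Counter
from datetime import datetime, timedelta


def hot_articles(edits, now, window=timedelta(minutes=30), threshold=20):
    """Return titles with at least `threshold` edits inside the window.

    edits: iterable of (title, timestamp) pairs (an assumed input format).
    """
    cutoff = now - window
    counts = Counter(title for title, ts in edits if cutoff <= ts <= now)
    return sorted(title for title, n in counts.items() if n >= threshold)


# Made-up data: one busy article, one quiet one, one that was busy hours ago.
now = datetime(2008, 9, 1, 12, 0)
edits = (
    [("Sarah Palin", now - timedelta(minutes=m)) for m in range(25)]
    + [("Quiet article", now - timedelta(minutes=m)) for m in range(3)]
    + [("Old news", now - timedelta(hours=2, minutes=m)) for m in range(30)]
)
watchlist = hot_articles(edits, now)
```

Dramabot would rerun a check like this every few minutes and drop an article from its watchlist once the edit rate falls back under the threshold.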