Friday, September 05, 2008

Palin by comparison

As of this writing, 48 editors have weighed in on the request for arbitration for the Sarah Palin biography wheel war. The underlying dispute is about whether the article should have been fully protected or semiprotected. If you're not versed in the intricacies of Wikipedia debates, that means administrators reversed each other's actions over whether to leave the article open to editing by experienced users or freeze it entirely. There ought to be a better way to resolve that kind of dispute without involving so many people or taking up so much time.

One of the arbitrators, TheBainer, posted a request for statistical data on vandalism to the Sarah Palin article during that time. That struck me as a very interesting question, although I wasn't entirely comfortable with the idea of judging administrative actions according to statistical data that wasn't readily available to those administrators at the time they acted. Anybody could read the article history, but that was a very active article in the first days after the announcement of Palin's selection as the vice presidential nominee. It just wasn't practical to sort through that edit history manually with any rigor while the vandalism problem continued to unfold. But this won't be the last time Wikipedia gets a burst of attention due to breaking news, so it seems to me we could write a tool to parse that information in real time in a way that's useful to administrators.

I've gotten in touch with a coder who has some very smart ideas, and what I'm looking for right now is someone with formal training in statistics. Basically the idea is this: create a tool that parses the recent history of actively edited articles and estimates what percentage of vandalism comes from autoconfirmed users. Automated analysis won't be perfect, so the tool would give reports based upon two search techniques:

  • High figure: counts all reversions within a time frame.
  • Low figure: counts bot reversions, rollbacks, and edit summary notations such as "rvv".

From there, the tool would determine which editors were being reverted and report what percentage were autoconfirmed. So if 85% of the vandalism to an article is coming from non-autoconfirmed editors, then semiprotection is the obvious solution. The tool would only report on articles that meet a certain baseline of recent activity, in order to screen out low-traffic articles where the report would be statistically meaningless.
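To make the idea concrete, here's a rough sketch of what such a tool might look like. The API endpoint and query parameters are the real MediaWiki API, but everything else is an assumption on my part: the revert-detection heuristics (summary keywords for the low figure, identity reverts by content hash for the high figure) and the autoconfirmed approximation (account age of at least 4 days and at least 10 edits, the English Wikipedia defaults) are illustrative guesses, not a finished design.

```python
"""Sketch of the proposed vandalism-analysis tool (assumptions noted inline)."""
from datetime import datetime, timezone

import requests

API = "https://en.wikipedia.org/w/api.php"
# Assumed markers of a deliberate revert for the "low figure".
REVERT_WORDS = ("rvv", "rv ", "revert", "rollback", "undid", "undo")


def fetch_revisions(title, limit=500):
    """Return recent revisions (oldest first) with user, comment, and sha1."""
    resp = requests.get(API, params={
        "action": "query", "format": "json",
        "prop": "revisions", "titles": title,
        "rvprop": "user|comment|sha1|timestamp", "rvlimit": limit,
    }).json()
    page = next(iter(resp["query"]["pages"].values()))
    return list(reversed(page.get("revisions", [])))


def reverted_users(revs):
    """Split reverted editors into the two proposed counts.

    Low figure: the previous editor, when the summary carries an explicit
    revert marker. High figure: the previous editor, whenever a revision's
    sha1 matches any earlier revision in the window (an identity revert).
    """
    low, high, seen = set(), set(), set()
    for prev, rev in zip(revs, revs[1:]):
        summary = rev.get("comment", "").lower()
        target = prev.get("user")
        if target:
            if any(w in summary for w in REVERT_WORDS):
                low.add(target)
            if rev.get("sha1") in seen:
                high.add(target)
        seen.add(prev.get("sha1"))
    return low, high | low


def non_autoconfirmed_share(users):
    """Estimate what fraction of the given editors lack autoconfirmed status.

    IP editors come back flagged "invalid"; registered accounts are checked
    against the assumed enwiki thresholds of 4 days' age and 10 edits.
    (The real API caps ususers at 50 names per request; a sketch ignores paging.)
    """
    if not users:
        return 0.0
    resp = requests.get(API, params={
        "action": "query", "format": "json", "list": "users",
        "ususers": "|".join(users), "usprop": "registration|editcount",
    }).json()
    not_ac = 0
    now = datetime.now(timezone.utc)
    for u in resp["query"]["users"]:
        if "invalid" in u or "missing" in u:  # IP editor or deleted account
            not_ac += 1
            continue
        reg = u.get("registration")
        age = (now - datetime.fromisoformat(reg.replace("Z", "+00:00"))).days if reg else 9999
        if age < 4 or u.get("editcount", 0) < 10:
            not_ac += 1
    return not_ac / len(users)


if __name__ == "__main__":
    revs = fetch_revisions("Sarah Palin")
    low, high = reverted_users(revs)
    print(f"low figure:  {len(low)} reverted editors, "
          f"{non_autoconfirmed_share(low):.0%} non-autoconfirmed")
    print(f"high figure: {len(high)} reverted editors, "
          f"{non_autoconfirmed_share(high):.0%} non-autoconfirmed")
```

If the two figures bracket something like 85% non-autoconfirmed, semiprotection is the answer; if most reverted editors turn out to be autoconfirmed, semiprotection wouldn't help and the full-versus-semi debate looks different. That's exactly the judgment call the statistics expert would help formalize.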

I'd love to bring in someone with the statistical skills to add rigor to the endeavor, so please get in touch if you have that background or know someone who does.

A second idea (thanks to Xavexgoem) is for something we'd call a dramabot. Instead of crawling all recent changes for vandalism, dramabot would concentrate on articles that have gotten flurries of recent edits. Dramabot would scan those articles frequently and revert obvious vandalism until things calm down. If you're a coder who thinks dramabot would be a good idea, let's touch base.
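The first half of dramabot, spotting the flurries, is straightforward to sketch. The recentchanges query below is the real MediaWiki API; the edit-count threshold is an arbitrary placeholder, and the actual reverting step (which would need an authenticated bot account and a vandalism classifier) is deliberately left as a comment.

```python
"""Sketch of dramabot's flurry detector (threshold and revert step assumed)."""
from collections import Counter

import requests

API = "https://en.wikipedia.org/w/api.php"


def hot_articles(window=500, threshold=15):
    """Return mainspace titles with at least `threshold` edits in the
    most recent `window` changes sitewide."""
    resp = requests.get(API, params={
        "action": "query", "format": "json",
        "list": "recentchanges", "rcnamespace": 0,
        "rctype": "edit", "rcprop": "title", "rclimit": window,
    }).json()
    counts = Counter(rc["title"] for rc in resp["query"]["recentchanges"])
    return [title for title, n in counts.items() if n >= threshold]


if __name__ == "__main__":
    for title in hot_articles():
        # A real dramabot would now rescan this article's new revisions for
        # obvious vandalism and revert via an authenticated session.
        print("flurry:", title)
```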

5 comments:

Sage said...

The thing that muddles the picture is that in this case, things would have gone just fine (and would still be going just fine, except there might be a mention of the Enquirer story more often than not) if administrators had left the whole thing alone.

Yes, the absolute vandalism rate is very high when Sarah Palin is unprotected. But the relative vandalism rate is not, and anons revert vandalism as often as they commit it.

If admins think it's their responsibility to babysit the article, of course they'll feel overwhelmed. But if we trusted the broader community (even including anons), the world wouldn't end, and I think the article would actually be in better shape.

This is the only case I've seen where our software and community practices have been simply and utterly overwhelmed by sheer volume of contribution. The other major cases of news-based editing seem to have gone much more smoothly.

One thing that would really help for this article is better talk page functionality. Section-based discussions (and maybe the ability to protect or semi-protect the article's section structure) would make it a lot easier to accommodate the number of people who want to edit the article.

Oh dear. My CAPTCHA says "imfuqd".

Stephen said...

To quote what I said to Cube lurker, who asked a question about the propriety of using potential evidence like this in the case:
"I don't know what an investigation into the distribution of sources of vandalism would turn up. It may be that it is impossible to tell when there are so many edits being made. It may be that such an investigation reveals that it was extremely obvious what the source was, and that may have some bearing on people's actions in this case (in terms of what they ought to have known, or ought to have done). It may be that we just end up with a useful tool for assessing whether semi or full protection is the best approach. Any of these conclusions would be a useful thing to know for the future."

Anthony said...

How long can vandalism of a semi-protected article reasonably last? Are there really that many sleeper accounts out there? Or is the term "vandalism" being thrown about loosely?

Lise Broer said...

In a really high-traffic spurt the issue is less how long it lasts than how many people see it. Twenty minutes of random keystrokes or obscenities matters a lot less at an article that gets a hundred page views a day than at one receiving a million a day.

Joshua said...

As Doc Glasgow pointed out in the arbitration, how many people see it isn't the only thing that matters. If we believe we are trying to prevent harm to article subjects, then once a person becomes sufficiently famous, problems on their Wikipedia page only look bad for us, not for that person. If we are going to take BLP protection seriously, we need to think about the underlying motivations.