Tuesday, May 20, 2008

Nicomachean Wikipedia

Aristotle defined virtue as an equilibrium point between two vices. So courage exists somewhere between cowardice and foolhardiness, and if Aristotle were a Wikipedian, good administrative intervention would be a balancing act between interfering too much and acting too little.

Use the administrative tools too little and trolls run wild, driving away useful editors. Use them too much and you'll block the wrong people. (As everybody knows, I've been guilty of the latter. I'm sorry for it; live and learn.)

So Hesperian's post reminded me of Aristotle's golden mean:
Half the posts to the admin noticeboards are exhortations to be less timid in dealing with these situations, and the other half are exhortations to be more careful. I'm not going to learn anything from this unless you spell it out for me. Hesperian 01:43, 20 May 2008 (UTC)
Shoemaker's Holiday had opened a thread where he estimated that he had wasted 100 hours of his own time interacting with one disruptive editor, and noted that he could have written 6 to 9 good articles in that time if administrators had addressed the problem promptly.

Raymond Arritt agreed and estimated the overall cost to volunteer time:
Yes, this specific case has finally arrived at arbcom after hundreds (thousands?) of volunteer hours were wasted. But the matter could have been resolved with less cost to the community. Let's learn from this mistake. Raymond Arritt (talk) 01:11, 20 May 2008 (UTC)
In these instances the community is all too apt to commit the fundamental attribution error and interpret the problem in terms of the individual personalities involved in a case. This is what we malign as drama, and the solution to drama is not to ignore it but to analyze it for signs of systemic flaws that can be corrected at the policy and process level.

For example, this chain of events:
  • An established but difficult Wikipedian gradually alienates most of the community.
  • Dispute resolution, friendly warnings, and perhaps an arbitration case ask the editor to reform.
  • Despite these efforts and the passionate defense of a few supporters, the problem behavior gradually worsens.
  • The editor becomes essentially unstoppable: someone from the small core of supporters always steps forward to undo a block.
  • The problem festers and more people notice; frustration builds on both sides.
  • Finally the community holds a discussion to enact a ban.
  • One admin blocks.
  • Another unblocks.
  • The business degenerates into a game of chicken with one side citing banning policy and the other citing wheel warring policy.

If you think I'm talking about some case you know, I am. And a lot of others too. This plays out a couple of times every month because the banning policy and wheel warring policy intersect in ways that aren't well defined, and because nobody has found an effective solution for this type of editor problem.

We ought to be studying these recurring problems on a process level and analyzing them with quantifiable data. One great weakness of the present situation is the lack of organized data collection. We list and categorize bans, not the sanctions discussions themselves, and there's no effort to organize the discussions that didn't end with sanctions. The community has been short-sighted in that regard: preoccupied with keeping track of existing remedies without thought to long-term follow-up to see which solutions work better than others. The community sanctions noticeboard was a step in that direction because at least it kept a centralized archive, but then the community decentralized those discussions again and then disbanded the board itself.

This leaves us ill-equipped to study cases like the one Shoemaker's Holiday complained about that resulted in hundreds of hours of wasted time. And except in the narrowest of senses, we don't satisfy either Hesperian's or Raymond Arritt's call for data to learn from the experience. I'd like to conference with a statistician and a coder and parse a few hundred of these cases.

We've been relying on rumor and anecdote and drama. We could step up from that and do research.


Evan said...

Of course, we need more data and we need to study these events, which occur over and over with such regularity and appear so similar to each other. These phenomena are almost completely predictable. At Durova's prompting, I have made a couple of baby steps towards a more quantitative analysis and I am trying to brainstorm with others to see if we can develop some different approaches to try and to test.

I will point out that we are observing a classic receiver operating characteristic (ROC) curve here. If we are too lenient, the trolls have a field day and we drive away good editors. If we are too harsh, we also drive away good editors. We adjust one or two parameters, such as putting articles on probation, and then just move back and forth along the same ROC curve. We will never be able to tune things perfectly, because the system as it stands is fundamentally not capable of achieving the goals we have. That is, we want to be as productive as possible. So we should be studying things like the productivity problem that Shoemaker identified, and then quantitatively testing methods for correcting these productivity problems. To do otherwise is just an exercise in futility.

Lise Broer said...

Martinphi e-mailed and asked me to post the following on his behalf (he got caught in a glitch with the Blogger interface):

"One of the main problems is that some articles only attract zealots. To become an admin, you should have to have worked on 5 or 10 of these. There are a lot of people who want to become admins, and they are often good NPOV editors. Just this rule change would mean that the articles like Homeopathy or Creationism don't have to be fought over by a horde of zealots. They'd have wider community involvement. The whole atmosphere should change if you have say three or four admin wannabeez on there. You could have a noticeboard where users could ask for such involvement. The article could be nominated to be on the 'admin qualification' list.

"But, another main thing is that articles which make it to FA should be locked. Otherwise, they degrade."