The passive voice swamp

Gary King is one of Wikipedia's most prolific featured article writers; bless him. He asked for a review of his latest effort because no matter how good one is, it does help to get extra sets of eyes on the work. Little mistakes creep up on the best of us. After the fiftieth time reading a paragraph a small error starts to seem invisible. So here's hoping our wonderful volunteer Gary will be a good sport about this post: there's a phenomenon that needs a name.

Here's a sample from the article, which really is very good and close to featured quality already:

Metroid II was released by Nintendo in North America on January 20, 1992,[1] in Japan on January 21, 1992, and in Europe on May 21, 1992.[11] The game was not as well received as its Nintendo Entertainment System counterpart,[3] but it was still given generally favorable reviews, receiving an aggregated score of 80% from Game Rankings.[1] Nintendo included the game in its Player's Choice marketing label.[11] Metroid II is often considered the weakest game in the franchise.[3] Praise focused on the game's story and settings, while criticism targeted its graphics and audio. In their Top 200 Games list, Nintendo Power ranked the game as the 102nd best game on a Nintendo console.[12] Metroid II was also included in's list of the best Game Boy games.[13]

Write one sentence in passive voice and it's easy to write another. Occasionally people just slip into the passive mood, hardly noticing the change. I told him, "Gary, you've wandered into the passive voice swamp," and of course he changed the paragraph and got out of that swamp before any allegories bit him. Yet there ought to be a term for the phenomenon. And now (with forbearance from one of Wikipedia's very best contributors) the blogosphere has one.

Drain the passive voice swamp!

This message is a public service from the Wiki Witch of the West.

Digg notices Wikimedia Commons

It's a refreshing change when Digg pays attention to history. Right now this image has 539 diggs under the title 'execution of Lincoln's assassins'. Who wants to restore it?

Why arbitration enforcement usually fails

David Hoffman sent a very polite reply to my email yesterday about his paper. If he wishes to post his response as a comment to this blog it will certainly get published here. As Sage Ross noted in yesterday's comments, some of Hoffman's findings certainly are provocative.

The most intriguing part of that is the negative correlation between multiple varieties of policy violation and actual sanction. Although my observations are anecdotal rather than statistical (and it would be very difficult to assemble statistical data on community sanctions for reasons described yesterday), it looks like that counterintuitive finding would not only hold up but would worsen at the community level. In the end this leads to insights about why arbitration enforcement and discretionary sanctions usually fail. First, we'll identify the dynamic: wiki discussion is not well suited to handling multifaceted problems.

Take an example from about a month ago: an editor gets reported to one of the administrative noticeboards for edit warring. Simple edit warring results in a block when it reaches a certain level of disruption. Our administrators are usually quite good at handling that alone. In this particular instance, though, the editor might also have used an ethnic slur.

The discussion accordingly went askew, with the distinction between non-pejorative 'Pak' and pejorative 'Paki' debated at length, while the actual edit warring got ignored. One experienced administrator even argued that there was no rationale for a block.

User:Yousaf465 on an Anti-India Propoganda

At this point it became necessary to point out the other factor.
One dynamic to watch out for in admin discussions with multiple issues is that one hot button point dominates the discussion, and if that gets resolved as a nonissue the other outstanding issues may get overlooked. This discussion has determined that 'Pak' does not carry the derogatory connotations of 'Paki'. What it has not resolved is whether this person is edit warring. And it may be arguable that block-worthy edit warring has been going on within the last few hours. Please examine all issues at hand before declaring a determination. DurovaCharge! 05:30, 20 February 2009 (UTC)

Notice how the discussion changed direction afterward: a swift and uncontroversial 48 hour block for edit warring.

So the introduction of a second issue hampers Wikipedia's ability to address policy violations that are obviously block-worthy. Notice what occurs in a more recent discussion where three issues are at play:

Commons:Deletion requests/File:Allys a rubbin (1421413596).jpg

The subjects of this photo have a reasonable expectation of privacy. The Burning Man festival is a private event, see . There is no indication that the subjects have given their consent to have this image used in Wikipedia; without consent this kind of picture can do real-life damage. As an alternative to deletion, pixellating the faces would address most of my concerns. --Clayoquot (talk) 16:23, 15 March 2009 (UTC)

Comment The stipulations in the url given are probably not legally valid. /Pieter Kuiper (talk) 20:04, 15 March 2009 (UTC)
I agree that it's questionable whether the festival could legally enforce the rules given in the URL. I'm not primarily concerned with the rights of the festival. What I'm primarily concerned about is the rights of the subjects, and the URL gives evidence that subjects would have a reasonable expectation of privacy. Clayoquot (talk) 03:05, 16 March 2009 (UTC)
Comment the festival is a private event... that doesn't mean that it takes place in a private place! And place is the issue linked to privacy not events. For instance, the Tour de France is a private event but one may not be able to forbid pictures of its public audience next to the roads because those roads are public places! --TwoWings * to talk or not to talk... 03:11, 16 March 2009 (UTC)
There is no distinction at this particular festival between the event and the place. The question we have to address is basically whether it is both legal and ethical to keep this photo on Wikimedia Commons, in its current state in which the subjects are identifiable. Think about it: You go to a festival, thinking you can be a bit more relaxed about things like nudity because it's a private event and photography is restricted, and you take off your clothes for a massage. Then one day you or someone in your family realizes that there is a naked picture of yourself on Wikipedia. Not nice. Let's not have that happen. Clayoquot (talk) 04:50, 18 March 2009 (UTC)
Well I'm sorry but it seems that the law deals with private places so even in the case of a private event held in a public place, I'm not sure it can be considered a problem with the law. I understand your argument but we have to understand the law too! --TwoWings * to talk or not to talk... 17:01, 19 March 2009 (UTC)
  • Comment There are three separate issues here: the event's photography policy, the participants' privacy rights, and Commons custom. Dealing with these one at a time:
  1. The event photography policy is a contract stipulation between the event and the photographer. If the photographer violates that contract, it has no effect on downstream users such as Wikimedia Commons. So, for example, we do host public domain artwork that was photographed in museums that restrict photography. That's the photographer's risk, not ours.
  2. Privacy rights are a gray area here. On the one hand, the event occurs in the open area on public land. On the other hand, access to the event is rather tightly controlled with checkpoints, barricades, law enforcement, etc. So one could argue this either way and I'm no confident which way that would go. On the one hand, this is outdoors in a public location. On the other hand, do these participants have a reasonable expectation that their likeness will not be taken and misused? I'd lean toward the former by hunch more than experience, and would defer to individuals who know specific instances where this has come up before (in some twenty years of festival history it probably has).
  3. Commons custom has sometimes been more considerate than strict readings of privacy rights. We have, on occasion, deleted instances of 'wardrobe malfunction' that occurred in public places. This isn't quite the same situation as plumber's trousers, since the nudity is intentional. Yet the intention here appears to be massage rather than pure exhibitionism. It's a regular massage table. So primarily on the basis of this third consideration I'd lean toward deletion. Durova (talk) 03:49, 20 March 2009 (UTC)
Thanks Durova. I should mention for transparency that I asked Durova on her enwiki talk page if she could comment here. Clayoquot (talk) 03:56, 20 March 2009 (UTC)
  • Delete I have to say that my first comments (see above) were an overall statement about the event (on the base of the difference between "public event" and "public place") but that I may lean toward deletion for that specific picture (but not for other pictures of the same event) because I follow the same remarks as Durova (3rd point above). Actually we also have to consider that this picture seems to have been taken in a tent so it may be considered as private for that reason (even if the tent was in a public place!) --TwoWings * to talk or not to talk... 09:18, 20 March 2009 (UTC)
In both instances, all that I did was identify the multiple factors that were confusing the discussion, and attempt to rate them. What's curious is that no one else had tried that approach. Many of our site discussions that degenerate into 'drama' are actually multipoint issues, and could conclude rationally if someone stepped in at an early stage to identify those points and articulate them as separate issues.

A portion of our site's disruptive editors intuit that weakness and create confusion in order to avoid remedies for their behavior. On rare occasions one of them even confesses that this is a deliberate strategy. This occurred in the Gundagai Editors arbitration of late 2006:

Failure to sign posts

The anon editor has consistently failed to sign posts. This is a deliberate strategy on her part.[68]: In response to an explanation by Golden Wattle: "Navigation on talk pages is normally by linking using signatures by the way. If somebody wanted to follow the conversation, and you had signed - they would come here very easily - they can't when you don't sign - have I mentioned signing before? Maybe you might if you could see the benefits." She responded on 6 July "Maybe I wont too. Do you think I dont know about how to make a maze? Its pretty amazing. If you lose the thred though your lost. Have fun"

Arbitration occurs when other site processes have failed. Deliberate smoke-blowing increases the chances that other site processes will fail, so there is a high probability in arbitration that at least one party is a smoke-blowing disruptor. There may even be multiple parties using this tactic, some of whom have formed strategic alliances. A critical mass of disruptive smoke-blowers, acting in tandem, can thwart nearly any Wikipedian process. If they are skilled enough they may even draw in the allegiance of confused but well-meaning editors who are not disruptive themselves, but who fail to see through the disruptive tactics and lend their own reputations in innocent advocacy for the disruptors.

A central responsibility of Wikipedia's Arbitration Committee is to identify individual disruptive smoke-blowers and remove them from the conversation. If the Committee fails to do that when they enact discretionary sanctions on a case, then those same disruptive smoke-blowers proceed to arbitration enforcement and employ the same tactic there. Usually those disruptive editors succeed in their efforts to stymie arbitration enforcement because noticeboard format is less formal, and therefore easier to misdirect than arbitration formats. In these situations, implementing sensible structure is not 'bureaucracy' but a defense against trolling.

Wikitruth through Wikiorder

Browsing the Signpost today led to a study by two Temple University law scholars. The abstract looked intriguing enough to overcome my natural antipathy toward PDF files, so I read the whole 45-page paper. Posting a few thoughts about it here.

More than is usual for this kind of work, the prose is readable and at times engaging. It prompted a few fond chuckles to see the following:
There are over thirty distinct ways to irritate other Wikipedia users, including being incivil, disruptive, or tendentious; researching the wrong way, attacking others’ gender or race...
Ah yes, but aren't there many more than thirty? Surely an enterprising spirit would generate new ones. Otherwise life might get boring.

On the whole, though, their analysis is structurally flawed. David A. Hoffman and Salil Mehra write about arbitration as if it were the only means of banning editors, but of course there are more. The period January 2005 through September 2007 was a critical one in Wikipedia's development of community-based remedies, none of which are mentioned in the study.

To highlight the developments:
  • May 2005: Wikipedia formalizes its banning policy.
  • July 2005: David Gerard provides a definition of community bans: "Some editors are so odious that not one of the 500+ admins will unblock them." In slightly less colorful phrasing this becomes the de facto standard for community sitebans.
  • September 2006: Wikipedia formalizes a disruptive editing guideline.

In the time since then the community has become increasingly proactive in enacting, reviewing, and lifting sanctions. Unfortunately these are not easily studied because the documentation of these sanctions is extremely diffuse. A page exists to record full sitebans, but not for any other type of community-based sanctions (topic ban, article ban, single revert restrictions, etc.), and the definition of community banning is itself diffuse enough to be disputed: when is an editor banned by the community, as opposed to placed under a block of indefinite duration? Discussion of bans (which may or may not require consensus discussion, depending on who you ask) has roamed across at least three noticeboards. Although automated search tools have been developed in an attempt to compensate, they can search only for specific instances where the editor's username is known, and the tools may fail to turn up the appropriate result.

Additionally, although the community enacts bans and other sanctions of indefinite duration, it has almost no articulated standards for reconsidering an indefinite sanction. Generally the blocking administrator is held responsible and should be consulted, but there is very little provision for what to do if that administrator is unavailable or under what circumstances sanctions should come to an end. The results of that lack are predictably chaotic.

So although it would be fair to say that a majority of sanctions were enacted by the Arbitration Committee or Jimbo Wales at the beginning of 2005, by September 2007 only a minority of editor sanctions were coming from these sources. The nature of disputes heard by the Committee was also changing substantially as the community adapted to handling simple and obvious cases, so by the end of the period under study the character of cases before the Arbitration Committee had shifted toward complex and intransigent disputes for which no easy solution was at hand.

So, setting aside other criticisms (I had originally intended to mention the absence of analysis on wheel wars and other causes of administrative desysoppings, and a few smaller points), Hoffman and Mehra's attempt to apply complex statistical analysis and game theory to Wikipedia arbitration is fatally flawed.

More functionality at Commons

Commons administrators can now move filenames. No more time wasted on deletion and reupload. Life keeps getting better.

Three cheers for Brion Vibber, Erik Möller, and Michael Dale

Say hello to Wikimedia Commons's first TIFF file! It's a restoration of an 1825 hand tinted illustration for the constellations Aries and Musca Borealis. The full version is available here, and as soon as this hurried and happy blog post is complete I'm going to upload the version that really matters: Aries and Musca Borealis1.tif. That's the interim step in restoration just before histogram and color balance.

What this means is that Wikimedia software has become a lot friendlier to serious restoration work. Editors no longer get forced into offsite communications to trade uncompressed files. With an upload limit of 100MB and (I hope) better thumbnailing ability soon, this is a major step forward for image editing.

Many thanks to Gerard Meijssen for his tireless dedication, to Shoemaker's Holiday for his fine restoration work, and to the many editors who have dedicated their volunteer time to this growing effort.

Wikipedia's most viewed featured picture?

In a chat with another Wikipedian last night about the Wright Brothers' first flight an interesting question came up: what is Wikipedia's most frequently viewed featured picture? In terms of ordinary monthly traffic that would probably be a featured picture for one of the site's most frequently viewed pages. Based upon a recollection of somewhat out of date statistics I guessed that would be the featured picture of Barack Obama. Further research led to a surprising discovery and inadvertently sparked a small edit war.

A later check confirmed that Barack Obama received 1,644,252 page views in February 2009. That makes the article Wikipedia's twenty-second most visited page for that month.

Yet it also turned out the Obama biography didn't use the featured picture of Barack Obama. This was odd. A featured picture doesn't need to occupy the lead position, of course, but it usually appears somewhere on the article where it is most relevant. On rare occasions a featured picture gets removed from its primary article through the well-meaning effort of an editor who doesn't have much experience with image use. Normally that's simple to remedy: just reinstate the image with an edit summary explaining that it's featured.

It would be an uncontroversial and straightforward edit at any other article, but what about at this location? Usually I avoid hot potato topics in mainspace. But what the heck? I thought, The election is over. I'll start a thread at the talk page, to be on the safe side. A lot of disagreement followed. When it appeared that an editor hadn't understood the reason for discussing the image I provided two explanations, then made one edit to demonstrate. Perhaps that was too bold under the circumstances, because this followed:
Yikes. Well, in order to keep things bipartisan I'll be uploading a speech by George H.W. Bush and nominating it for featured sound. Also located a public domain photo of John McCain taken shortly after his release from a POW camp.

Getting it Wright

Restoration work gets done at high resolution, mostly addressing very small portions of a photograph. There aren't many moments when the restorationist sees the effect of the labor on the whole image. So it's wonderful to finally sit back, be the first to view the end result, and pour a cup of coffee. This is arguably the most important photograph in aviation engineering history. It's an honor to work on a version that hundreds of thousands of people will see in dozens of languages.

The full sized restoration is 14.39MB in JPEG, available here. The uncompressed file can't quite go up in its full glory because it is over 100MB. There may be another editor who can improve on this work. So until WMF software catches up with what we're doing in this area, people who understand restoration are welcome to contact me for a Skype transfer. If we can't collaborate in a wiki environment yet, we can still act collaboratively.

Reviewing history

"Rewriting history" is a phrase that raises eyebrows. It could mean correcting errors in previous scholarship, but more often it prompts worries about aggressive forms of wishful thinking that obscure the record of the past. What's the equivalent with visual records, reviewing history? Somehow that sounds much milder. Perhaps too mild, because the obligation to act responsibly is equally weighty.
So here's the Wright Flyer on its first flight, December 17, 1903. Very important photograph; it ought to be featured. But this edit that was promoted four years ago never was to my taste. Look at that sky: it's a visual essay on why reliance on auto settings for histogram adjustments is not always a good idea.

Most of Wikipedia's featured pictures do not get much individual traffic except for their day on the main page. This one does. Last month it received 4327 individual page views, with many more views in thumbnail version at article space. The Wright brothers article received over 100,000 page views in February. It's also at Aviation (33,914 views), Aviation history (11,044 views), Transport (49,656 views), Aerospace engineering (36,618 views), and several other English language articles. The image is also featured at Wikimedia Commons and the Japanese language edition of Wikipedia, and it appears at a total of 338 pages on 68 Wikimedia Foundation projects.

The current version is not a very careful restoration. Note the cracking still visible above the left wing. Glass negatives are delicate things. Our current featured picture was edited primarily by cropping away the worst damage. The photograph is equally important either way, but we can do better. The Library of Congress has a much higher resolution version available. I've been working on that during the last day.

So this interim version is what I have now. It isn't finished; there's a bit more to correct before the histogram is ready to go. The toughest decision, though, has been the crop. Obviously this shot was set up for documentary purposes rather than esthetic reasons, but I've decided to take on the difficult task of including more foreground. The light patch of sand has greater continuity with a crop low enough that it approaches the left and right corners, and it seems to add to the sense of awe to raise the horizon a little higher.

This crop brings other challenges. I've managed to get rid of the blue staining at far right, and to remove a brown circular stain near Orville Wright at the edge. The left edge has been harder and there's more work remaining to be done, but I wanted to include as much of the wooden track as possible. Still deciding whether to crop out a few more pixels from the top border; that part of the sky is badly damaged.

But for some reason it's the lower left corner that prompts the most second guesses. The negative chipped and broke and that little bit is unrecoverable. So I've cheated and patched in order to achieve this crop. It's virtually certain there was nothing else there but a few square feet of sand.

A few square feet of sand at Kitty Hawk: a justifiable surmise, but a surmise for entirely esthetic reasons. Does that enter the pejorative sense of 'rewriting history'? 'Reviewing history': thousands of people see it every day, but who would notice? In four years nobody commented on the awful histogram of the current featured version.

Maybe this redirect expresses Wikipedia's priorities.

Durova's law of online community management

Copied from the mailing list. Probably should have been written out in a public setting quite some time ago.

In Internet communities generally, 5% of the participants will violate the rules no matter what they are. 5% will abide by the rules no matter how poorly enforced they are. The other 90% would prefer to abide by the rules if the rules are generally enforced, but will also ignore rules if the rules become meaningless. The key to managing a community is to sway that 90%.

The power of free access

This is open access at its finest. Today this image was added as a featured picture at the Turkish Wikipedia. A year ago, when a smaller version was in danger of getting delisted at English Wikipedia due to size requirements, I contacted the United States Holocaust Memorial Museum to request a larger version. They very kindly supplied one, and it has since run at both Commons and English Wikipedia as Picture of the Day.

If you aren't already familiar with the photograph, it comes from the final days of the Warsaw Ghetto Uprising. A picture tells a thousand words, and my thanks go out to the museum staff for providing a high resolution version. Because of their generous and enlightened decision, several million more people are seeing what happened here. May a holocaust never happen again.

New articles waiting to be made

For Wikipedians who think every important subject already has an article, this list of protected areas and species in Tanzania should be required reading. There are far too many redlinks on important subjects in African geography and ecology.

This was an accidental discovery last month when I noticed a redlink for the Thika River in Kenya. What's the Thika River? Is it important? Ran a Google search to find out, and the very first source noted that it supplies 80% of the drinking water to Kenya's national capital. So I started the article.

Now working on Jozani Chwaka Bay National Park, which isn't a very big article yet. Hoping to get it up to size within the five day window for consideration at "Did you know?" It's the only national park on the island of Zanzibar and it's home to the endangered Zanzibar Red Colobus monkey, which lives nowhere else in the world. It'll be exciting to grow this article: the sources are saying that the park is a successful endeavor in ecotourism where cooperative work with the local communities has led to habitat preservation and a rise in the monkey population, while revenue sharing from park entrance fees has given the villages schools, cleaner water, and health clinics.

Which is cool. And there are redlinks like that all over Africa waiting to be filled.