Saturday, January 31, 2009

Context

Is it just a damaged old stereogram of a construction site? If you know someone who lives in Tel Aviv, please show them this photograph and ask whether they can identify the location. Because if this is part of the White City then what you're looking at is a World Heritage Site being built.

According to the Library of Congress bibliographic notes, this image was published in 1936 but possibly taken during the 1920s. They have a few others also. If it's possible to get confirmation of an exact location on any of these (preferably with a photograph of the site as it appears today), then there may be enough historic value to justify a featured picture and a day for the restored version of one of these images on Wikipedia's main page. Without that information, though, these are just construction sites.

Context makes all the difference.

Friday, January 30, 2009

Banding together

About time we got back to that Benjamin Harrison portrait, isn't it? Here's the version selected by Awadewit and the IP editor. There's a lot to be said about it. First esthetics, then technicals.

Let's be frank: General Harrison here is a weak imitation of Jacques-Louis David's famous portrait of Napoleon crossing the Alps. It makes no sense that the Union flag is in tatters while the soldiers' uniforms are all spotless. Ben himself looks at no one in particular and raises his arm with about as much enthusiasm as he'd use to hail a waiter for a refill of coffee. It's technically a fine piece of work, but full of late Victorian conceits, and the only hint of psychological depth is the horse, who knows it's in a second-rate piece of artwork and is ready to collapse of embarrassment.

Yeah, it's not to my taste. "But it's representative of the period," Awadewit countered. She has a point: it is that. Down to business then.

If you haven't already heard of a histogram, now's a wonderful time for an introduction. It's an almost magical little tool that's great to preview at the very beginning of a restoration and then use in earnest at the end of one.

Basically it takes all the pixels in an image and reports on their brightness, from 0 for absolute black to 255 for pure white. It produces a graph that shows how many pixels sit at each brightness level, which is very useful. Even more useful: it lets you manipulate the information. There's a complicated Wikipedia article about it if you like equations, but actually it's quite simple and intuitive.
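For anyone who thinks better in code, here's a minimal sketch of what a histogram tallies. It's plain Python with no imaging library; the function name `brightness_histogram` and the tiny made-up pixel list are illustrative assumptions, not anything taken from a real restoration tool.

```python
def brightness_histogram(pixels):
    """Count how many pixels fall at each brightness level, 0 (black) to 255 (white)."""
    counts = [0] * 256
    for value in pixels:
        if not 0 <= value <= 255:
            raise ValueError(f"brightness out of range: {value}")
        counts[value] += 1
    return counts


# Hypothetical faded grayscale image, flattened to a list of brightness values.
faded = [40, 40, 41, 90, 120, 120, 120, 200, 210, 210]
hist = brightness_histogram(faded)

# Print a crude bar chart, skipping the levels that hold no data.
for level, count in enumerate(hist):
    if count:
        print(f"{level:3d} {'#' * count}")
```

In this toy example nothing sits below 40 or above 210, which is the signature of a faded image: the ends of the scale are empty.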

When images fade, what happens is they lose data on the extreme ends of the scale. The whitest white isn't pure white anymore and the darkest black becomes a shade of gray. But the overall distribution of data keeps a similar shape. So with a histogram you can move the zero point up to the lowest number where you've got actual data, then move the 255 point down to the highest number where you've got actual data. You can also shift the midpoint around if you feel like doing that.
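Here's a rough sketch of that adjustment, again in plain Python. It assumes a simple linear stretch between the darkest and brightest values present, plus a gamma-style exponent standing in for the midpoint slider; real editing software may do this differently, and the name `stretch_levels` is made up for the example.

```python
def stretch_levels(pixels, midpoint=1.0):
    """Remap brightness so the darkest value present becomes 0 and the
    brightest becomes 255.

    `midpoint` is a gamma-style exponent (an assumption of this sketch):
    1.0 leaves midtones alone, values below 1.0 lighten them, and values
    above 1.0 darken them.
    """
    low, high = min(pixels), max(pixels)
    if low == high:
        return list(pixels)  # flat image: nothing to stretch
    span = high - low
    out = []
    for value in pixels:
        fraction = (value - low) / span   # position between low and high, 0.0 to 1.0
        fraction = fraction ** midpoint   # optional midtone shift
        out.append(round(fraction * 255))
    return out


# Hypothetical faded image: no data below 40 or above 210.
faded = [40, 40, 41, 90, 120, 120, 120, 200, 210, 210]
restored = stretch_levels(faded)
print(min(restored), max(restored))
```

Taking the end points from `min` and `max` is the naive choice: one dust speck or one tear would set them. That's why in practice the end points get picked by eye from the graph.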

It's a very good idea to preview a quick histogram fix at the beginning of a restoration. That's done through the 'levels' option, or 'auto levels' where the software does its best to guess what a histogram adjustment ought to be. This gives a glimpse of any subtle problems that might arise later.

First of all, remember that histograms are dumb. A histogram can't tell the difference between the tear at the upper left edge of this page and coloration that's actually supposed to be there. It doesn't understand scratches, dirt, or stains. As a restorationist you have to take care of those things yourself. And since dud data affects averages, you really need to undo the preview and do the actual work of restoration on the unadjusted file. The ultimate results come out much cleaner that way.

Here, though, the issue is banding. You'll see several vertical lines that have nothing to do with artistic intent. Possibly that could be a result of the document having been rolled into a tube for storage.

Here's a closer look at one of those bands. It shoots upward from between the horse's ears.

Banding is not an easy problem to fix. If you aren't experienced or are easily discouraged, it's better to pass up an image with serious banding problems and look for something simpler. Otherwise the problems with this image are minor; I cleared out the rest in an hour and a half. First fix the dust, then the tear. Banding goes late in the game, because depending on what's being done the best solution is often to use the healing brush at a large pixel selection and break up those lines so they aren't visible anymore.

Key lesson here: always save a working copy immediately before changing the histogram. That's why most of my restoration filenames end in the number 2 or the letter B: the number 1 version is the pre-histogram save. If more changes turn out to be necessary, that's the source to go to for further work.


And that, at present, is where this restoration is. I had passed it to a friend who has good luck with lithography, but it fell to the bottom of his workpile so I took it back the other day. Worked on it quite a bit last night, but after histogram adjustment remaining banding issues appeared on the horse's body and the grassy field. It's going to take a bit more work yet to get General Harrison ready. Here's the not-quite-ready number 2 version. I'll save over that when final restoration is really complete.

A point worth remembering: at 114.6MB the number 1 restoration can't be uploaded to any Wikimedia Foundation website. If someone gets the urge to try their hand at an improvement they have to contact me. And if I get hit by a bus, so does this work.

Tuesday, January 27, 2009

Discoveries and tough decisions

Sometimes the most important things are tucked away in archival corners. This is the aftermath at Wounded Knee. The bibliographic notes say "U.S. soldiers amid scattered debris of camp", but I wondered at the size of those piles. Why had the tipi sides been taken down, but blankets left on the snow? Regardless of what was there, this is an important historic scene and a high resolution file. So I downloaded and started work on it.

Most of the images in this post come from the current partial restoration. Here's one from the original file that demonstrates the usual challenge of cleaning out creases and dirt: a small sample of the sky. This is in pretty good shape for photography over 100 years old; everything collects a few problems over time. Sky is usually a good thing to start on; deciphering sky is relatively easy. Worked down from there, saw a tin cup or two in the snow. Then something else.

A shoe. Two shoes. They looked like they were still being worn. The way to find out is to scroll to the right and slightly downward.

A hand. Then a face. There were at least three bodies in the foreground, all partially covered with blankets. Probably four. More piles farther off, the right size and shape.

It's the sort of scene that makes one stop and think. Is it respectful to work on this? Someone someday will probably take this the wrong way, but this is history. It happened. It's important to document these things. So after hard thought I decided to continue the restoration.

It's quite a responsibility. And it makes the choices harder.

Knowing a fair amount about image restoration doesn't make a person an expert in forensics. Very near the bodies there's an unusual spot pattern that seems to follow the contours of the snow. Is that photographic degradation or is it blood? I'm going to make my best guesses with this image, but frankly they're guesses. If some of it comes out wrong there ought to be an effective way of correcting the mistakes. This is one of the days when I wish the Wikimedia Foundation had more restorationists--someone to turn to with greater expertise. So here's one argument for a separate restoration wiki. Someday we may get a forensics expert on board, and when that day comes it'll be very useful to have an archive of interim saves.

Monday, January 26, 2009

A slide show


Found this rather interesting gadget on another blog. Decided to load a few featured pictures and give it a spin.

Dude

One of the best things about archival searching is discovering something new and unexpected. Last night I was looking for material with an unusual aspect ratio to test out a new template for vertical image scrolling. Ideally it would also be suitable for restoration. Unfortunately most of the material that had the right dimensions wasn't in good enough condition. I had already spent several hours on something that ultimately wasn't very satisfactory. The background had banding issues and the attempt at restoration just didn't yield a satisfactory result. Then came a pleasant discovery.

Thumbnail previews can be deceptive. So when this joined the queue of several downloads, it didn't seem worth the trouble to take notes. Just kept on surfing through Japanese prints. When the file finally opened 240 megabytes later it was breathtaking. Imagine a long slow California exclamation of "Duuude!" Had to save immediately, couldn't remember the real title, so it's Dude.tif on my system. Click the thumbnail for a slightly better view.

Turns out this is Zhong Kui, a vanquisher of demons in Taoist mythology. He's known as Shōki in Japan. This late eighteenth century depiction should be restored and ready for upload soon.

Wednesday, January 21, 2009

Cream of the crop


Rotation and cropping are the first, simplest and most important decisions in many restorations. Unrestored images could be interpreted more than one way, and cropping is a powerful method of selecting one interpretation at the expense of others.

Take the caricature of Charles Darwin: is the illustration itself the only thing that matters? Or are the date and publication in Vanity Fair important enough to retain? And if context is important, how much context do we keep? A crop that includes the border text needs balance. And that can be tough to attain from material this old: with many originals from the nineteenth century or earlier, either the paper has dried and warped with age or the borders were never drawn perfectly to begin with. Twenty-first century tastes are accustomed to digitally perfect parallels and can object to variances as small as a few hundredths of a degree.

Here's one where that problem is easy to spot. The borders on this political cartoon of Lincoln and Johnson are off by tenths of a degree, not hundredths. So the bottom border tilts upward from left to right while the vertical borders are reasonably vertical already. One gets an urge to rotate the thing, but every rotation comes out wrong because the border itself isn't rectangular.
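To put a number on a tilt like that, measure two points along the border and do a little trigonometry. This is only a sketch: the pixel coordinates below are invented for illustration, not measured from the Lincoln and Johnson cartoon, and `tilt_degrees` is a hypothetical helper name.

```python
import math


def tilt_degrees(x1, y1, x2, y2):
    """Angle of the line through two points, in degrees from horizontal.

    Image coordinates grow downward, so a line that rises from left to
    right has a smaller y on the right. The sign is flipped so that
    'tilts upward' comes out positive.
    """
    return math.degrees(math.atan2(-(y2 - y1), x2 - x1))


# Hypothetical measurements: the bottom border runs 3000 pixels across
# and ends 12 pixels higher on the right than it started on the left.
angle = tilt_degrees(100, 2700, 3100, 2688)
print(f"bottom border tilt: {angle:.2f} degrees")
```

With those made-up numbers the tilt works out to roughly a quarter of a degree. Rotating the whole image by that amount would level the bottom border and tip the side borders off vertical by the same angle, which is why no single rotation comes out right.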

There are three potential solutions here:

  • Crop the border out of the picture and lose the caption.
  • Manipulate individual lines of border to create an actual rectangle.
  • Leave the caption in and the border lines unchanged.

Any of those choices is arguably correct, depending on how one regards the image.  The easiest one to refute is the first option.  This caption may be obscure after a century and a half, but to me it looks like an explanation of the original artist's choice to juxtapose the president in coat and tails against a reference to his humble origins.  Of all the people who became United States presidents, Lincoln started out life lower on the socioeconomic ladder than any other.  Hence the references to manual labor, which seem to result in a compliment to Lincoln's hard work and perseverance bringing the country back together at the end of the Civil War.  The artist's caption helps explain that; I decided to leave it in.

So if we keep those darn borders, do we fix them?  Do we rotate individual lines and make them correct?  It can be argued that the unevenness is not artistic intent, that it's distracting, and that it ought to be fixed.  It can also be argued that slight variances from mechanical perfection are characteristic of the period, therefore historical, and ought to be kept.  When they performed a digital restoration on the 1939 film The Wizard of Oz they erased the wires that had lifted the flying monkeys.  I'm a Wiki Witch; I prefer vintage monkeys in their original technical imperfection even if it takes me out of the story just a little bit.

So I selected a compromise rotation and cropped a little extra space outside the border lines to minimize attention to that flaw.  The closer an uneven border comes to the edge of the digital image, the more apparent any deviation is.

Ragesoss complained about the final choice here and, to be candid, it wasn't my first crop either.  Originally I had kept the side borders and I'd left leeway outside them because they were a few hundredths of a degree off from true.  But the area outside the border on the original has uneven tone, especially the blown whites at lower left.  And although the file was big enough to work with it wasn't ideal.  I could have filled in the problem tolerably but the technical limitations of the file didn't make it worth the effort.  Overall, for an image that most viewers will see in thumbnail, a good rule is to crop in as close as feasible.  These kinds of decisions are often tradeoffs, and arguable either way.  

Monday, January 19, 2009

Shakespeare's "Howard the Duck"

If you haven't read Titus Andronicus (and Wikipedia's excellent restorationist Shoemaker's Holiday hadn't), there's not much need to regret that particular gap in an education. Even the best of them can turn out one real dud. The only analogy that came to mind was a stretch. If Shakespeare is the Steven Spielberg of the stage, then Titus Andronicus is Shakespeare's Howard the Duck.  Shortly after getting that explanation Shoemaker read the plot summary.  It wasn't the cannibalism that bothered Shoe quite so much as...eh, well...find out for yourself if you dare.

Shakespeare scholar Harold Bloom has claimed that the play cannot be taken seriously and that the best imaginable production would be one directed by Mel Brooks.

Shoemaker, though, was interested in a detail. And that detail is worth attention as an example of digital image management.

The article has a larger version of the illustration above. Obviously the illustration reflects a high technical standard of workmanship and it looks like the book was a good reproduction. A little staining at the bottom border is a minor concern. This ought to be good material for restoration, but it isn't. The hosting page data is the giveaway: (2,080 × 2,789 pixels, file size: 1.34 MB, MIME type: image/jpeg). That's large enough dimensions for featured picture consideration even after rotation and cropping, but at only 1.34 megabytes the data is much too compressed. I've cropped and blown it up a bit to illustrate the problem.
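A quick back-of-the-envelope check shows how hard that file was squeezed. The sketch below assumes three 8-bit color channels and reads "1.34 MB" as 1.34 × 1024 × 1024 bytes; both are assumptions, since the hosting page only gives the dimensions and file size. The function name `compression_ratio` is made up for the example.

```python
def compression_ratio(width, height, file_bytes, channels=3):
    """Ratio of raw pixel data (one byte per channel) to the stored file size."""
    raw_bytes = width * height * channels
    return raw_bytes / file_bytes


# Dimensions and file size from the hosting page; the byte conversion and
# the three-channel color depth are assumptions of this sketch.
ratio = compression_ratio(2080, 2789, 1.34 * 1024 * 1024)
print(f"about {ratio:.0f}:1")
```

Under those assumptions, around eleven of every twelve bytes of pixel data were thrown away or approximated, and that's where the artifacts come from. If the scan were single-channel grayscale, the ratio would be about a third of that.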

There's just not much to be done with this. The file is too artifacted. Shoemaker checked out the source archive in hopes their original would be better and it wasn't. Which is a shame because this could have been so much better. Lesson for the day: use a lossless format and don't compress files if they're intended for a serious purpose. I wish Wikimedia Foundation software accepted .tif format. Library of Congress does. But the University of Pennsylvania doesn't. And because of JPEG compression and artifacting, their hosting of Titus Andronicus is a dead end.

Tuesday, January 13, 2009

Darwin Day

The other day Ragesoss showed up at my user talk with a reminder that Darwin Day happens next month: the two hundredth anniversary of Charles Darwin's birth. Could we have a featured picture for the occasion?

I'm a sucker for that kind of request.

When it comes to online image archives for that kind of purpose, the Library of Congress website has everybody else beat hands down. Site architecture is chaotic, things can be hit or miss, but when they get it right they really get it right in a way nobody else does. Because what's needed for this kind of endeavor is a hefty TIFF file made from a well-curated original on a really good clean scanner.

A lot of people who don't do restoration come along with an 80K file and expect something to be accomplished with it. Sorry: I can't restore information that isn't there. 2MB is about the minimum, and that's pushing it. 10MB is more like it. I don't call a file large until it's at least 100MB, and one of the images in my current workpile is over half a gigabyte.

The Library of Congress knows how to create and host this sort of material. Also wonderful: they don't try to claim proprietary control over information that's in the public domain (a surprising number of museums and archives do assert such claims, but that's a story for a different day). So after trotting over to the photographs collection and running a search, a serviceable rotograph turns up at a decent 23MB. Here's the page. Not the most famous likeness, but the technical quality is far better than is likely available anywhere else.

Five or so hours later the restoration was complete. Practice makes this sort of work go quickly. The full sized restoration is a nearly 5MB JPEG file (Wikimedia software doesn't allow for TIFF uploads) and available here. It's enjoyable to be able to help out in an event as important as Darwin Day.

Yet it's a shame that for such an iconic figure of British science, the best source for a portrait is a foreign archive. Surely British archives have better quality original images. This is their heritage, their history. I wish more countries brought their collections into the digital age the way the Library of Congress has been doing.

Saturday, January 10, 2009

Heartbreakers

With the new year and more people searching for historic images, it's time to write up a bit of how this works. Basically there are two ways to get started:

Easy:
Tug on the sleeve of somebody who has more material than time to restore it.

Hard:
Go looking for material yourself.

Most of the newcomers seem to prefer the hard way so here are a few words from experience.

Overall, only about 1 in 1000 archival images has the right technical parameters to consider for featured picture candidacy. Many of the others may be encyclopedic or interesting, and worth using at Wikipedia. It's a lot of work getting one image restored so I usually focus on material that has the potential to go all the way. Yesterday I blogged about Toni Frissell's portrait of Tuskegee Airman Capt. Edward M. Thomas. These are a few of the things I wanted to use along the way to locating that portrait.

The first of several attempts was Booker T. Washington. I had recently restored a portrait of George Washington Carver, who had been a contemporary of Booker T. Washington at Tuskegee Institute. That prompted the thought that maybe we could get a featured picture pair. Unfortunately that idea didn't turn out well. Ran into problems like the one above: it's available in a 25MB version, but it's too heavily damaged across the face to really work with.

He must have been a superb public speaker. The Library of Congress has several images of Washington speaking outdoors to crowds, and in every one of them both he and the audience look very engaged. Again--and I call these things heartbreakers--the only one that's available in a reasonable 12MB resolution is one of the weakest of these. The postures of the men near the steps are priceless: the nearest ones leaning forward, mouths open in laughter, while others farther away fold their arms or scratch their chins skeptically. Unfortunately the uneven fade on this image means Washington's face itself is barely visible, the lower right corner is lost, and a partial figure in the left foreground distracts from the composition. There's simply no way to crop the problems out.

These are images I'd love to see in higher resolution, but they aren't available online any larger than this. One lists the location as New Orleans; another, Mississippi; the last one is captioned 'I want our people to have homes.' Wikipedia has one clip of him speaking. You can hear his voice here.

What's frustrating is to get this close to something--so close I can almost hear the cicadas of a sticky Gulf Coast afternoon--and miss the shot. I could restore his home or his wife (although neither would be quite good enough for featured picture either even if they had as much encyclopedic value). Wikipedians who notice my image work usually see only the successes, but this is a glimpse of what's behind that. A whole lot of searching; a whole lot of things that are nearly good enough but just won't fly; a whole lot of material that has the encyclopedic value but not the right technicals.

There's nothing to be done but get used to it and keep on looking. Sometimes you can write and request a better version, but most of the people you'll communicate with don't understand the technicals and the new file--if you get it--may still be unusable. The gems are out there too, and when one finally turns up it's worth all the effort.

And thanks to Moni3, who's working hard to get started, for inspiring today's post.

Friday, January 09, 2009

Cleaning dirt

One of the greatest fears of serious restoration work is that somebody will come along and suppose it was all done by hitting three buttons in Photoshop. Actually it was two days into this one getting the background to a natural tone when the thought occurred 'I'm cleaning dirt.' Why care that much?

The Tuskegee Airmen of World War II were the first African-American pilots in United States military history. This particular image caught my attention because of its composition and the expression in the subject's eyes. I had been searching for portraits of Booker T. Washington, then Marcus Garvey, then Doris Miller. Archival searches yield a lot of heartaches: the photographs that looked promising weren't available in high resolution and the high resolution images had serious technical flaws. Then this turned up. And it was a pleasant surprise to discover it was taken by Toni Frissell, the official photographer of the Women's Army Corps. It isn't often one finds a technically superior image on a significant subject that counters two kinds of systemic bias. Overall, only about 1 in 1000 archival images has the technical parameters to consider for featured candidacy with restoration.

Yet there were serious problems ahead. When the full 25MB TIFF file downloaded it became clear that this would be no easy cleanup. Thousands of tiny white flecks populated the image. This was unusual for a photo less than seven decades old and the problem was probably a shortage of fixative in the original emulsion.

Fortunately the face suffered relatively light damage compared to other areas, but this close-up gives some idea of what resulted. This is culture. This is history. Yes, the restoration took two days: hours on the grain in the cheap plywood wall at left and hours more getting photographic degradation out of the dirt. On an image like this I don't consider the work done until I view the thing at 300% magnification and can't see the flaws, but the labor was worth it. It's worth preserving his face; it's worth preserving his eyes.

It would be interesting to locate enough information about Edward M. Thomas, the man in this photograph, to justify a Wikipedia biography entry. He came from Chicago, Illinois, reached the rank of Captain, and was awarded the Distinguished Flying Cross. This is the text of the citation:

For extraordinary achievement in aerial flight P-51 type aircraft in the Mediterranean theatre of operation. Lt. Thomas’ outstanding courage, aggressiveness and leadership enabled the formation to inflict damage upon a heavily defended airdrome in Athens, Greece. On October 6, 1944, Lt. Thomas flew as a flight leader in a formation of 14 aircraft assigned to strafe the heavily defended Totoi Airdrome and the surrounding terrain. Upon approach of the formation towards the airfield, Lt. Thomas’ aircraft was hit by flak. Disregarding all thought of personal safety, Lt. Thomas courageously pressed his attack to a deck level and destroyed two enemy aircraft and damaged another. Due to his exact judgement, skill and aggressiveness, a total of 11 enemy aircraft was destroyed or damaged without loss to any aircraft of his section. Flying over treacherous, mountainous terrain, under adverse weather condition and against severe enemy opposition, Lt. Thomas flew 81 hazardous combat missions for a total of 205 combat hours. His outstanding courage, judgement, unquestionable devotion to duty and professional skill have reflected great credit upon himself and the Armed Forces of the United States of America.

Captain Edward M. Thomas did not survive the war.

Thursday, January 08, 2009

Quoth the image: "Nevermore?"

Interesting how the digital age has altered encyclopedias. During the 1990s--when Encarta was in its heyday--publishers oriented toward expanding media content rather than scope: it's expensive to use color illustrations in a paper and ink publication, but that cost drops with digitization and new options become possible: sound and video. Where Wikipedia innovated (aside from its open edit structure) was in terms of scope: the economics of dead trees publishing had kept encyclopedias to a certain size. Encarta's premium edition is barely over 60,000 articles. Wikipedia kept adding more articles. Things have come full circle, though, and Wikipedians turned to devaluing media.

A recent example was the featured article candidacy for the biography of United States President Benjamin Harrison. Awadewit reviews featured article candidate images for copyright compliance (bless her for that) and suggested I might find a portrait to restore for featured picture candidacy. So I toddled over to the Library of Congress website where the first charming thing that caught my eye was a caricature from Puck. Lovely, but a bit small even at full size for consideration. Unfortunately none of the formal photographs had the right technical parameters for featured picture consideration, but three lithographs did. Awadewit suggested putting the selection to the article's editors. So I posted to the article talk page. Any one of the three would be an enormous undertaking--they're all over 200MB. And with featured article candidacy underway, editors are at their most eager for any improvement to a page.

How much response did that offer bring? Not one murmur for six days, until an IP weighed in. Thank you, IP editor. To the rest I don't know what to say.

Let's face facts: when people browse Wikipedia they don't skim to the third section and look for brilliant copyediting. They notice the lead image and the caption. If the media elements are interesting they're more likely to stay and read the rest.

I've also located an audio file of Lyndon Johnson's speech when he signed the Civil Rights Act of 1964. In light of this month's events it seems fitting to nominate it for featured sound. The candidacy should go live soon.

Fortunately this year's WikiCup has brought new interest in featured pictures and featured sounds. So might as well announce this openly: winning ain't everything. It ain't even that important. But getting more restorationists is. I've been passing out suitable material to about half a dozen participants and would gladly help more, including follow-up with technical assistance. Shoemaker's Holiday, who restores using different software, is glad to assist new restorationists also.

If we had 100 restorationists each doing 2 restorations a week, Wikipedia could gain over 10,000 featured pictures in one year (100 × 2 × 52 weeks = 10,400).

Which would be really cool for the project. And would be a lot of help to a lot of smaller language editions of Wikipedia. When you consider that over 180 language editions don't even have 10,000 articles yet, good media content can be quite a boost. It's far easier to translate a caption than to translate a full article.

And in the longer term we're looking to gain access to more national libraries and archives. We want to restore the world's history, not just Benjamin Harrison.

Friday, January 02, 2009