Filmed on Monday November 14, 02005

Clay Shirky

Making Digital Durable: What Time Does to Categories

Clay Shirky is the most riveting of speakers at tech conferences, with his deep insight into social software and the culture and economics of networks. His talk for the next Seminar About Long-term Thinking takes on one of the most intractable problems of the information age: how to preserve digital information and tools in usable condition beyond ten years. The continuity of civilization is at stake in this matter.

Categories go nova

It is fortunate that the leading thinker in “social software” is one of the best speakers in the high-tech world, a hot ticket at any conference that can get him. Clay Shirky gave one of his dazzling presentations Monday, Nov. 14, examining a new dimension in one of the most vexed problems in the digital world: how the hell do we keep anything digital usable beyond ten years?

When a whole civilization goes digital, as ours is, loss of continuity becomes a crucial issue, fit subject for a Seminar About Long-term Thinking. Thus…

Clay Shirky is an adjunct professor at New York University and, among other provocations, runs a mailing list on “Networks, Economics and Culture” at http://tinyurl.com/a6mt6 . Sample Shirkyism: “The only group that can categorize everything is everybody.” That defies 3,000 years of intellectual practice (Library of Congress, etc.), and it obviously can’t work. Yet it blithely does work in a Googlized world, and over time it’s the only thing that can work, though time introduces other problems.

“THIS is what the Internet has been straining to become,” said Clay Shirky Monday night, both joking and meaning it. He was referring to a category (“tag”) that emerged from users on the photo-sharing site Flickr. The category is “cats in sinks.”

Growing use of the unlikely-seeming tag exposed something that a lot of cats do and a lot of people feel compelled to photograph…

Shirky pointed out that “cats in sinks” has none of the limitations of former category systems such as the Dewey Decimal System or the Library of Congress scheme or Yahoo’s hierarchical category structure. There is no need for a category “cats” with subcategory “in sinks,” nor a category “sinks” with subcategory “cats in”. The specificity of the category precisely fits its content, its traffic, and its currency. (Unlike the Dewey Decimal System, which has 10 categories under “Religion,” 8 of them about Christianity and one for “Other Religions.” And unlike the Library of Congress system, which retains outdated categories like “Former Soviet Union” and gives equal value to the categories “Asia” and “the Balkans.”)

“The only group that can categorize everything is everybody,” said Shirky. And that’s what web services like Google, Flickr, and Del.icio.us make possible. Flickr links up users, photos, and tags. Del.icio.us links up users, websites, and tags. All you need for a comprehensive category system to emerge is links and tags! There is all manner of overlap, but that’s not a problem (thanks to global search). Instead it’s a virtue. A high degree of overlap (or redundancy, or “degeneracy” as Shirky called it) makes the system far more robust against disruption and against the erosion of time.
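As a rough illustration of how categories can emerge from nothing more than links and tags, here is a minimal Python sketch. It is not how Flickr or Del.icio.us actually store their data; the (user, item, tag) records, the sample names, and the helper functions are all hypothetical, made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (user, item, tag) records of the kind a
# tagging site might keep. None of these names come from a real service.
TAGGINGS = [
    ("alice", "photo1", "cats in sinks"),
    ("alice", "photo1", "cat"),
    ("bob", "photo2", "cats in sinks"),
    ("bob", "photo2", "sink"),
    ("carol", "photo3", "cats in sinks"),
    ("carol", "photo1", "kitten"),
]


def build_tag_index(taggings):
    """Map each tag to the set of items that carry it."""
    index = defaultdict(set)
    for _user, item, tag in taggings:
        index[tag].add(item)
    return index


def tag_popularity(taggings):
    """Count how many distinct users applied each tag."""
    users_by_tag = defaultdict(set)
    for user, _item, tag in taggings:
        users_by_tag[tag].add(user)
    return Counter({tag: len(users) for tag, users in users_by_tag.items()})


def tags_reaching(item, taggings):
    """Return every tag that leads to an item (overlapping paths to it)."""
    return {tag for _user, tagged_item, tag in taggings if tagged_item == item}


index = build_tag_index(TAGGINGS)
popularity = tag_popularity(TAGGINGS)

print(sorted(index["cats in sinks"]))             # items filed under the tag
print(popularity.most_common(1))                  # the most widely used tag
print(sorted(tags_reaching("photo1", TAGGINGS)))  # overlapping routes to one photo
```

No one declares “cats in sinks” a category in advance; it simply shows up in the index once enough people use it. And because the same photo can be reached through several tags, losing any one tag does not lose the photo, which is the overlap Shirky describes as a virtue.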

An example of the virtue of overlap is the Rosetta Stone. Having the same text in three scripts was the breakthrough for decoding Egyptian hieroglyphics, whereas the meanings of the Inca knotted-string language and Easter Island’s Rongorongo remain lost because there is no overlapping text.

The title of Shirky’s talk was “Making Digital Durable: What Time Does to Categories.” While he had good news, and deep news, on the category front, he was less encouraging about digital preservation in general. “We don’t know yet how bad the problem is,” he said. He pointed out that there is an alarming number of levels between preserving bits (which is easy) and preserving essence (which is at best expensive and at worst impossible). To make the Bits express the Essence over time, you have to preserve (or accurately translate forward) the Medium; and the Format; and the Interpreter; and various Dependencies; and the Operating System; and the Architecture; and the Power system (is 110-volt AC power forever?). Any of that missing or corrupted or misblended, and all is lost.

In 1995 Shirky published a book called Voices From the Net. Though it was written and printed with digital files, neither the author nor the publisher, Ziff-Davis, has a working digital version of the book’s text. Yet Shirky’s posts in sundry Usenet flame wars from that same period are preserved intact to embarrass him, apparently indefinitely. Why did one form of his writing survive and the other not? The book was written and printed with an expensive, brittle system with few users, whereas Usenet is a cheap, flexible system with a vast number of users. Cheap and big always wins.

But for how long? “Preservation is an outcome,” said Shirky. You don’t know if it is working until afterward. All you can do is reduce the risk of loss. Making digital durable, he said, is a “wicked” problem, meaning it can’t actually be solved. It will be an endless process of negotiation.


