The true distribution of Wikipedia content
Wednesday, August 13th, 2008Given the work Derek & I did on the distribution of topics on Wikipedia, I thought this was pretty funny:
Given the work Derek & I did on the distribution of topics on Wikipedia, I thought this was pretty funny:
I was looking over an article in the (Baltimore) Examiner that reads, in part:
“Some things get really bad — histories, politics, gets controversial that doesn’t get settled easily,” said Bernard Huberman, author of a study, which determined that increased edits make Wikipedia articles “superior.”Not everyone is buying the study, and some even did their own research to test Wikipedia as a trustworthy source of accurate information.
Alex Halavais, assistant professor in the interactive communication program at Quinnipiac University in Hamden, Conn., inserted 13 errors into various Wikipedia articles, including a false addition to the periodic table and the definition of “longitude.”
The Encyclopedia Britannica has announced that they will be cracking open the door on their writing process, just a little bit. They note that collaborative work “is something we’ve always done in creating Encyclopaedia Britannica.” Although this echoes a bit doublespeak (“We have always been at war with Oceania.”), it is essentially true. They argue, however, that this process has never been democratic, and that they do not wish it to be. They argue that the way they do things is different for three reasons: ownership, expertise, and objectivity. Unfortunately, I don’t think they provide a very useful explanation of these benefits.
They begin with “ownership,” which, they are at pains to explain, has nothing to do with property. I believe they are trying for something more like “stewardship.” In any case, their claim is that because they are willing to “stand behind” the product they release, it will be stronger for it. In other words, we can trust their encyclopedia, because we trust them.
There is something to this. I trust my doctor because of where he went to school, where he practices, and in this case, where he teaches. Yes, I also checked his performance, as best I could, but I accept that institutions invest reputation in the people that they choose to affiliate with. This doesn’t mean that reputation is always strong: just because Microsoft “stands behind” or “takes ownership” of WindowsMe doesn’t make it a better product than Ubuntu, which (arguably!) is a product of people standing around together instead. So, basically they are making an appeal to traditional authority: we’ve done well in the past, and we will continue to do well. Wikipedia asks you to trust in the process, not in the producers.
Second, Britannica claims they are different because they value experts. What are experts? That is a really good question.
The plan for the new site goes to great lengths to increase the relationships we have with thousands of our current contributors as well as with new experts recommended or identified by the user community. We are calling this larger group our new “community of scholars.”
Second, it is worth noting that Wikipedia draws on expertise as well, basically piggybacking on structures of peer-review and expertise vetting in the real world, by requiring citation. It is, in practice, a site for summarizing expertise that has already been expressed in the world. This is at least one definition of what an encyclopedia should do. Is there really a need for experts if the work is essentially summarizing existing publications?
Finally, they attack what I think is probably one of the week points of Wikipedia, its NPOV standard, and propose an even worse one: “objectivity.” I wonder if they considered chatting with their “experts” before making this claim. Experts take on an informed and experienced view, but there is rarely a claim that this is somehow an “unbiased” view. To quote from the Britannica article on biology:
This emerging social and political role of the biologist and all other scientists requires a weighing of values that cannot be done with the accuracy or the objectivity of a laboratory balance. As a member of society, it is necessary for a biologist now to redefine his social obligations and his functions, particularly in the realm of making judgments about such ethical problems as man’s control of his environment or his manipulation of genes to direct further evolutionary development.
Unfortunately, and like Wikipedia in some ways, it is not clear which experts are writing the articles, since the “ownership” is by EB, and not the expert, these days. Particularly, for some articles—say, for example, on “chiropractic” or on “abortion”—if they are providing an objective commentator, I really want to know who that commentator is. Transparency, here, is a friend of evaluation.
In sum, if I understand correctly, Britannica is adopting the plan that Nupedia had, lo, so many years ago. Maybe they will be able to make that work, and as Weinberger suggests, it can’t hurt to have a variety of different approaches in play.
I don’t think I can stay in Europe all of September and October without reinforcing the impression some of my colleagues have about my work ethic, which seems to be tied up with how many hours each faculty member spends in his or her office. However, if I could get away, I’d be winding my way to Graz for this workshop:
This workshop aims to develop and bring together a community of researchers interested in discussing the manifold challenges and potentials of knowledge acquisition from the social web.With the advent of the “Social Web”, a new breed of web applications has enriched the social dimension of the web. On the social web, actors can be understood as social agents – technological or human entities – that collaborate, pursue goals, are autonomous, and are capable of exhibiting flexible problem solving and social behavior. By participating in the social web, both technological and human agents leave complex traces of social interactions and their motivations behind, which can be studied, analyzed and utilized for a range of different purposes. The broad availability and open accessibility of these traces in social web corpora, such as in del.icio.us, Wikipedia, weblogs and others, provides researchers with opportunities for, for example, novel knowledge acquisition techniques and strategies, as well as large scale, empirically coupled “in the field” studies of social processes and structures.
This workshop aims to develop and bring together a diverse community of researchers interested in the social web by seeking submissions that are focusing on understanding and evaluating the role of agents, goals, structures, concepts, context, knowledge and social interactions in a broad range of social web applications. Examples for such applications include, but are not limited to social authoring (e.g. wikis, weblogs), social sharing (e.g. del.icio.us, flickr), social networking (Facebook, LinkedIn) and social searching (e.g. wikia, eurekster, mahalo) applications.
Brilliant talk by Clay Shirky:
None of these statements are true:
Truth Talking
There is a video up on YouTube by IJsbrand van Veelen called “The Truth According to Wikipedia,” which quotes Andrew Keen as saying that in the Web 2.0 internet the “truth gets personalized.” Which, to put it simply, is nonsense. If anything, changes in the knowledge society have uncovered a dangerous assumption that we know what the word “truth” means. I am not a philosopher, and don’t claim even to be well read in epistemology, but I think folks are talking at cross-purposes, and tempests are brewing in teapots.
Has “truth” changed? I don’t think so. I think that there is some sort of nostalgic view that at some point we knew some core truths, and that we are losing grip on that certainty. The problem is that certainty itself is the enemy. Wikipedia, I think, plays an important role in distributing knowledge, but it may turn out that its greatest gift to the information society is a spur toward skepticism. Wikipedia encourages a skeptical reading, not just of Wikipedia, but of Britannica, of scientific journals, of religious texts.
Authority of Experts
Do experts have special access to “truth?” No. I suspect that many people who argue otherwise haven’t met many experts. Experts tend to be suspicious of accepted truths.
I suppose I would be considered an expert in certain areas. I have read extensively books and articles written by other experts, and other experts have read a little of my work and, in some way, certified my own expertise. Peer review is a structure intended not to arrive at truth, but at some sort of reliable, useful knowledge. It recognizes the infinite fallibility of the individual, and the necessity of social structures the help to discover and distribute useful knowledge. It suggests that we should be skeptical of individual authority, and therefore open to the possibility that everyone can be the source of knowledge. It is this process of social acceptance, development, and application that makes it worthwhile.
Is there truth? Only in a virtual sense. In pure math, a statement can be shown to be true. In real life, we have evidence supporting opinions. Experts have the experience and training needed to reliably (repeatably) generate evidence that works for a large number of people. That makes them useful. Perhaps past performance in such discovery is a decent predictor of future success. However, there is no guarantee of this, and it is dangerous to assume otherwise. Like investments, diversification is key.
Useful Knowledge
There is a realm in which truth reigns supreme: pure mathematics. When you divide any circle by its diameter you arrive at the same constant, no matter what. If all men are dogs, and I am a man, I am a dog—no matter what.
And, those truths from the virtual world tend to be useful to us in the real world as well. Though a perfect circle may not exist in real life, geometry approximates life to such a degree that we can apply it to the useful arts, and count it as valuable. The number of angels able to dance on the head of a pin—not very applicable to pin design or communicating with angelic beings, and left behind as a conversation not really very helpful to have.
Consensus
I’m not a dog, literally or figuratively. Why? Because by consensus, not all men are dogs. There may be a vocal minority that disagrees, but because the widely held definition undermines one of the conditions of that syllogism, we don’t have to apply its conclusion.
Gravity, you might say, is true whether or not there is a general consensus about its existence. I disagree. Things tend to fall toward our feet when we are standing, and this is as reasonable a predictive assumption as the sun coming up tomorrow. But the idea of gravity—that massive objects across the universe are attracted to one another by some mystic force—is an idea introduced by many individuals, at different times, and formulated by one dude, whose ideas were considered a bit nutty by many of his contemporaries, but are now considered “true.” (Were they “true” at the time? That’s kind of a silly question, in my opinion.)
NPOV
One of the pillars of Wikipedia is the tenet that articles should take a “Neutral Point Of View.” The word “neutral” is meant as a crutch for “objective.” Journalists are supposed to be “objective,” and not allow their personal opinions to interfere with telling a story. The suggestion is that there is a thin black line between “fact” and “opinion,” which is problematic. “Neutrality” suggests that it’s opinions all the way down, and that the best a reporter can do is sample perspectives from the major camps of observers. If there is a lawsuit, you should hear from both parties, for example.
Of course, there neutrality doesn’t extend to some things. Nowhere in the Wikipedia article on the Holocaust is there a significant expression of the opinion that it is a fabrication, in part or as a whole. (This opinion is segregated off into its own article.) Such an opinion is considered outside of what Dan Hallin (who is an expert) has called in another context the sphere of legitimate controversy. There are certain things that the media doesn’t talk about either because they are too unsettled to be touchable, or too obvious to the regular reader to be assailable. Between these extremes, it is possible to be “neutral.”
Basically, NPOV could be rephrased as “don’t express opinions that you know are not shared by most people.” I think we should establish a settlement on Mars, but I recognize that is an opinion, not a fact. (The “should” is a big tipoff there.) I believe Saddam Hussein had no significant deployable weapons of mass destruction when the US invaded Iraq. I think, based on the evidence I have seen, that this is a “true” proposition, but I also know that there are a large number of people who disagree; or, more to the point, there are a large number of people who frequent Wikipedia who believe this may not be true. And so an NPOV requires that I recognize this, and do not state it as a fact. Whether or not something is NPOV is a matter of discussion, and often ends up replicating the question of whether something is True, without using the “t word.” In other words, NPOV is not an attribute of an article, but an attribute of the process of creating an article.
MPOV
Earlier this year, the question of whether NPOV was appropriate for search engine results came up on a discussion of Wikia search. Jimmy Wales wanted to bring the NPOV standard from Wikipedia to search, and I argued (and still believe), that those using a search engine are seeking not a diversity of ideas, but a particular piece of knowledge most useful and applicable to them. I think finding on a search engine is always some form of refinding. If I am searching for information on Hank Williams, I don’t really want to see pages on why country and western music sucks. The search engine, if it is good, might also know that I don’t want to see pages in French, and that I like pretty pictures. It’s dangerous if it makes these decisions without the user knowing (as all search engines do!), but useful. We turn to search engines often not as a source of diverse information, but as a filter against such diversity. We want “My Point Of View” search.
Does this mean search engines make us too trusting of authorities? I think that tendency is baked into a lot of search engines. But that can be countered by both good search engine design, and educating users. The mistake is to suggest that the search engine is perfectible, that it could lead to a collection of truth. What it can do is help sort and organize and filter knowledge, and a device that allows for sorting according to my own defined desires represents the perfect search engine. It needs to be able to reflect MPOV.
Anti-Enlightenment
Some of the critics of Web 2.0—Keen especially—seem to think that new uses of the internet are anti-intellectual, and against the Enlightenment project. This is misguided. The use of reason over authority suggests that we shouldn’t trust ideas simply because they are presented by the Church, the State, or our other Fathers. It argues that certain kinds of discussion can lead to useful consensus.
If anything, Wikipedia is another iteration of peer review. It isn’t perfect, but it suggests that we must militate against acceptance of current forms of peer review as necessarily the best way of developing useful consensus. They work pretty well: well enough for you to trust that your anti-cholesterol medicine may actually lower your cholesterol, may do so without causing a painful, premature death, and that lower cholesterol is somehow something worth having. But that doesn’t mean we should accept it as gospel. Peer review needs to undergo continual peer review.
The End of Truth
So, the issue isn’t that the nature of truth is changing, or that we are losing hold of our ideals. The issue is that we are becoming more skeptical, and that we are remembering more often that “truth” is little more than a comforting fiction.
From Willow (with a small edit), the serenity prayer adapted to wikis:
Please grant me the serenity to accept the pages I cannot edit,
The courage to edit the pages I can,
And the wisdom to know the diff.