I’ve had a chance to play around with Search Wikia over the weekend. The New York Times provides a broad overview today.

One of the arguments of my book is that a lack of transparency is one–though only one–of the socially dysfunctional forces of the current crop of search engines. Google sometimes reminds me of the Wizard of Oz: Pay no attention to the man behind the curtain! Or, to be more accurate, the algorithm behind the curtain. Despite its seemingly friendly facade, my brother is convinced it is the inchoate Skynet. Google makes the reasonable, if weak, argument that if they make their ranking algorithm public, it will be more easily gamed by search spammers, a claim repeated by Gary Price in the Times article. Does keeping that algorithm secret lead to more credible search results, or does it damage the search engine’s credibility by hiding the method of ranking from those who must rely upon it?

I guess we’ll be able to find out soon, as Wikia leaves its ranking algorithm open to the public. It also does a lot of other things that, while perhaps not groundbreaking on their own, just sort of make sense. It brings in the edited guide of Mahalo (something Google has just started doing with Knol). It brings in some of the social functions that, again, Google is just getting hip to. One of the main threats may be that Google simply makes off with Wikia’s innovations. It’s easier to be sanguine about that possibility when you are Wikipedia, since you have a first-mover advantage and name recognition. But Google has the ability to sit back, see what works for Wikia, and adopt it. Of course, you could have said the same thing about Internet Explorer and Firefox, but because Microsoft was not agile enough to catch up with Firefox, it lost out. Microsoft also had to battle a backlog of bad will that Google has yet to accumulate.

So enough of that; what about Wikia as it stands now? Well, I can’t claim anything but the most superficial of searches, but I’m reasonably impressed. There are still a lot of rough edges, with some of the features still to be roughed out and some of the working features (cache, etc.) only working when they feel like it. Given that it is working from a shallow, demo index, it’s hard to evaluate the search results. I looked for “Python MP3 id,” repeating a “real life” search I made earlier in the week. I was looking for some examples of code–or a module–that would allow for easy manipulation of MP3 ID tags (artist, title, etc.) in the Python programming language. Google’s first hit gets me what I want, but Wikia is still lost in a morass of Czech and Indonesian (“id” is the ccTLD for Indonesia) sites. Generally, it seems to have little preference for English, which could be seen as a good thing if it didn’t make me feel so monoglot. Clearly some word sense disambiguation is going to be necessary pretty quickly here, and that means some understanding of the language being used for the search. It’s also a bit strange that Wikipedia articles are so prominent in Google but not to be found in the Wikia search. I’m not sure if this was an editorial decision, or just another anomaly of their earliest ranking algorithm.

These are just a couple of a thousand little things that will need to be tweaked in the search process. Wikia is going to have to go through the minutiae of tuning the results engine, as other search engines have done over time–it’s mainly a question of setting up a structure that will allow that to happen effectively and quickly in an open and distributed way.

I think it’s good that it’s out there. There will be plenty of people criticizing it after the release, but given that Google releases into perpetual beta, it’s probably good to have something that is sort-of running in the public eye rather than leaving it vaporous. It really does seem to be an issue of testing and refinement to me; the bones are good. Go try it, and “friend me” while you’re at it.

I’m hoping Wikia succeeds, since it addresses what I see as some of the root failings of how some of the major search engines are now doing business. If nothing else, it serves as a palpable reminder of those failures, and of the corrosive nature of unnecessary secrecy.

Update (1/7): Well, the reviews are in, and folks are not particularly complimentary. I have to say, it’s not a surprise. For an “alpha” it was handled a lot like a “launch,” including the newspaper stories, etc. My favorite out of this, however, is testing by an anonymous Slashdot user:

Highlighted article when I search for “sex”:

Mini Article About “sex”

Sex is a term which is very often searched in the internet. Thus, a mini-article about pages with free pictures / videos without spam would be important.

First result for “George Bush”

George Bush Is A Crackwhore!
… handjobs for cash. George Bush is addicted to smack … some blow.. yah know… like George Bush … [] – Cached – 1.26

This is genius. I think I know what search site I’ll use next time I need some entertainment.

