When Jimmy Wales (one of the founders of Wikipedia) announced plans to create a search engine to compete with Google, some people took it seriously, while others dismissed it as a pie-in-the-sky fantasy. Here’s what Jimbo himself has to say:
"Search is part of the fundamental infrastructure of the Internet. And we are making it open source. Wikia Search will start to change search from being proprietary, top-down, and closed."
Well, an early alpha release of that "open source" search engine is now online, albeit with a very small data set, and you can see the first hints at how the user interface will differ.
Matt Cutts blogged about it with a screenshot of the Nutch relevance scoring display. Michael Arrington called it a big disappointment. I don’t know what he expected, really. It’s Nutch for the search engine and Grub for the crawler – but we knew that was coming before they released it.
What interests me about Wikia isn’t the current state of the index or SERPs. You’d expect those to be next to useless right now. What interests me is whether an open source, community effort can build a search engine to rival the best efforts of large commercial engines like Google. To put it another way, what are the limits of open source?
In The Cathedral and the Bazaar, one of the canonical texts of the Open Source movement, Eric Raymond tells the story of how Linux managed to succeed, and of an open source project that he himself led.
The parallels are interesting enough – like Linus Torvalds and Eric Raymond, the Wikia team has started with existing applications (Nutch and Grub). As with Linux, early releases don’t look like much (I followed Linux from the very beginning, thanks to a co-worker who was on it from the start).
Whether this initial Wikia release rises to the level that Eric Raymond described as a precondition for success, and whether it will really get enough people excited, I am not qualified to say… but it doesn’t look good. Linux got people excited back in 1991, but the bar was lower, because nobody had done anything like it before.
Sixteen years later, the community may be more jaded, and less likely to contribute to an effort that isn’t necessarily truly open source. The community that they need to engage is also a lot broader.
For me, three big questions arise:
- How will Wikia engage the minds of information retrieval scientists, and not just coders? Writing software that runs and is dependable is one thing – Open Source has made the case that it can do this extremely well. But to build a great search engine, you need great algorithms, which means you need a lot of people who understand (for example) what BM25F does, the pros and cons of using it, etc. – and you need to somehow get these people to work together. Oh yeah, and they’re starting at least 5 years behind. How can Wikia keep the ‘best and brightest’ engaged in an open source project, when the major search engines are hiring talent as fast as they can?
- How will Wikia’s user input improve search results, rather than helping the usual pissers game the system? It’s hard enough to get a large user community to edit Wikipedia without melting down – allowing user feedback on every SERP will not happen without a lot of challenges related to scale. How will Wikia respond to spam in general? The problem will only get bigger if they actually gain traction with users. Eyeballs attract spammers like a rotting carcass attracts flies – and they’ll be probing for weaknesses every step of the way.
- How will Wikia survive success, in the unlikely event that they can solve the other problems? The scale of the physical operations for Google is simply staggering. The sound of every hard drive Google owns, moving at once, would probably be loud enough to knock down the walls of Jericho. Google has huge resources because of ad revenue, and hundreds of extremely talented people working full time (minus 20%?) on solving its growth and scale problems. If Wikia takes off, what will they have?
So is this really Cathedral vs. Bazaar, round 2?
Or is it just Google’s Cathedral vs. Jimbo Wales’ pet project? That will depend on what happens at Wikia, because building a search engine is not the same thing as building software. It’s orders of magnitude more difficult. I wish them luck. I sorta hope that they’re up to it. It would be nice to see the underdog at least make a good show of it.