Directories have a structure people enjoy because it is created by humans; Yahoo! and the Open Directory are the most successful examples. The biggest limitation is the speed at which humans can categorize sites. Even with an editor base of 26,000, the Open Directory can't keep up with the Internet's growth rate of 7 million new pages a day.
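The back-of-the-envelope arithmetic makes the problem concrete. A minimal sketch using the two figures quoted above (everything else is just rounding):

```python
# How many new pages each Open Directory editor would have to review
# daily just to keep pace with web growth, using the figures above.
new_pages_per_day = 7_000_000
editors = 26_000

pages_per_editor_per_day = new_pages_per_day / editors
print(f"{pages_per_editor_per_day:.0f} pages per editor per day")
# -> roughly 269 pages per editor, every single day, just to break even
```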
Search engines, on the other hand, have speed on their side. Even so, they lag in the number of pages indexed. The largest full-text archive is Google's, at 560 million pages, well short of the web's estimated 2.1 billion.
Whoopee, you say. Does it really matter how much is indexed if it is already the best 10%? How do we know when we have the best 10%, and when to quit?
I've found the most successful search engine to be Google, which is a popularity contest of sorts: pages that many other pages link to rank higher. Netscape's What's Related feature is another way of surfacing popular sites. Based on technology from Alexa Internet, it follows people as they surf (provided you have Alexa's small 45 KB client installed). The choices each Alexa user makes are aggregated across users and served back as related sites.
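To make the "popularity contest" idea concrete, here is a minimal sketch in the PageRank style: each page repeatedly passes a share of its score along its outgoing links, so pages with many incoming links accumulate the most "votes." This is an illustration of the general technique, not Google's actual implementation; the toy link graph, the function name, and the damping factor are all assumptions made up for the example.

```python
# Minimal popularity-ranking sketch in the PageRank style.
# The link graph and damping factor below are illustrative only.

def rank_pages(links, damping=0.85, iterations=20):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}  # start with a uniform score

    for _ in range(iterations):
        # every page gets a small baseline; the rest arrives via links
        new_scores = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                continue  # dangling pages pass on nothing in this sketch
            share = damping * scores[page] / len(outlinks)
            for target in outlinks:
                new_scores[target] += share  # each link casts a vote
        scores = new_scores
    return scores

# Toy web: three pages all point at "popular.com".
toy_web = {
    "popular.com": ["a.com"],
    "a.com": ["popular.com"],
    "b.com": ["popular.com"],
    "c.com": ["popular.com", "a.com"],
}
for page, score in sorted(rank_pages(toy_web).items(), key=lambda x: -x[1]):
    print(f"{page}: {score:.3f}")
```

On this toy graph, popular.com collects the most votes and ranks first. Alexa's related-sites data is the same idea in spirit, except the "votes" are aggregated user visits rather than links.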
Even then, the results that are found are not necessarily accurate. For the most part, intuition serves as the best guide for discerning quality information, and sometimes intuition is wrong (with due reason). Is there a way a search engine can check accuracy and/or validity? Do the Google/Alexa popularity methods help achieve this?
And finally, it is all compounded by the fact that the average URL has a lifetime of just 44 days. But that's another story.
- Are we going about finding information the best way?
- Is it imperative that we index all of the web?
- Have directories outlived their usefulness? (That is, assuming we must index everything.)
- Do Google/Alexa methods aid in finding accurate information?