Monday, November 22, 2010

reading notes 11/22

Hawking's Web Search Engines: I find it interesting that search engines must "reject as much low-value automated content as possible" (86). I always thought that they looked at everything. It is amazing that a search engine can even identify something as "low-value". The explanation of crawling is helpful. It is hard for me to comprehend how a computer can do such challenging work. I love the terms "politeness" and "cloaking" (the latter reminds me of Harry Potter!). The seemingly limitless vocabulary of search terms is amazing. I assume almost anything could be a search term and that words like "a", "the", etc. are not indexed. I am amazed at how much a search engine can do in less than a second, especially when you think about how big the web is and how it is ever-increasing in size.
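To make the indexing idea a bit more concrete, here is a minimal sketch of an inverted index that skips common words. It is only a toy under stated assumptions: the stop-word list, the two sample documents, and the function names (`tokenize`, `build_index`, `search`) are all made up for illustration, and whether a real engine drops words like "a" and "the" is a guess on my part, not something taken from the article.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; a real engine may use a different one or none at all.
STOP_WORDS = {"a", "an", "the", "of", "and", "is"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each non-stop-word term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            if term not in STOP_WORDS:
                index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every non-stop-word query term."""
    terms = [t for t in tokenize(query) if t not in STOP_WORDS]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Made-up sample documents, just to exercise the functions.
docs = {
    1: "The deep web is hidden from crawlers",
    2: "A crawler fetches the web politely",
}
index = build_index(docs)
print(sorted(search(index, "the web")))  # prints [1, 2]
```

The point of the sketch is that answering a query becomes a lookup in a table built ahead of time rather than a scan of every page, which is at least part of why results can come back so fast.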

Shreeves OAI: This seems like a good way to help illuminate things in the "invisible" web, and I feel like it has a pretty good following. I kinda found the article confusing, though.

Bergman: Just a side note: I've never heard of the Northern Light search engine he speaks of as a top one. This is a pretty interesting article about the Deep Web, which I never thought much about (is that because I constantly use Google and don't think about what it can't reach?). It really seems to be a promotion for BrightPlanet...
