A Map of the Internet

I was looking through my access logs today, and I'm always struck by how many people are out there crawling my site, this basically unknown place on the Internet. What on earth are they looking for? Why isn't Google good enough? They have to be looking for something the other search engines don't provide. If you want map data, at least for North America, there are three vendors for it, from what I understand. Now, a map of North America is probably a huge amount of data. I wonder if you could buy a search engine's cache? If you want something that no search engine provides, you have to crawl the Internet yourself. I wonder if there's a business there: all you do is keep a cache and sell it to other people. It would save on spidering, because not everybody and their mother would need to run their own spider. But the economics are all wrong. The spiders don't pay for the bandwidth they use, and it's easy to write a spider. You don't even have to write one; there are tons available for free. Why buy an archive when you can make your own for free?

— Gordon Weakliem at permanent link