The Invisible Web

Published on 2001-09-21 by John Collins.

Introduction

The amount of content on the Internet has grown exponentially within the past few years. Goldmines of information exist on the Net, if only you could find them! The world's leading search engine, Google.com, currently provides links to over 1.6 billion web pages. Even this staggering amount of data is small compared to the amount of content that is not indexed by the leading search engines, often referred to as the 'Invisible Web'.

Much of this information is in the form of publicly available databases, such as phone directories, newspaper archives or medical dictionaries. Due to the specialist nature of some of these databases, the content contained within them tends to be exactly what you are looking for and can be of a very high standard.

Search Engine Blues

The major search engines are largely to blame for how hard it is to reach the Invisible Web. Most of them employ automated programs called 'Spider Bots' or 'Crawlers' that literally 'crawl' their way through the Internet from link to link, indexing web pages as they find them. Unfortunately, if no links point to a particular web site, the site falls through the cracks into the Invisible Web, beyond the reach of ordinary search users, who miss out.
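To make the link-following idea concrete, here is a minimal sketch of a crawler in Python. It does not touch the network: the link graph, the `.example` site names and the `crawl` function are all made up for illustration, and real search engine crawlers are far more elaborate. The point is only that a page with no inbound links is never reached.

```python
from collections import deque

# Hypothetical link graph (an assumption for illustration):
# each page maps to the list of pages it links to.
LINKS = {
    "portal.example": ["news.example", "shop.example"],
    "news.example": ["archive.example"],
    "shop.example": ["portal.example"],
    "archive.example": [],
    "hidden-db.example": ["news.example"],  # links out, but nothing links in
}


def crawl(seeds, links):
    """Follow links breadth-first from the seed pages; return every page reached."""
    indexed = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in indexed:  # skip pages we have already indexed
                indexed.add(target)
                queue.append(target)
    return indexed


indexed = crawl(["portal.example"], LINKS)
invisible = set(LINKS) - indexed  # pages that exist but were never reached

print("Indexed:  ", sorted(indexed))
print("Invisible:", sorted(invisible))
```

Starting from portal.example, the sketch indexes the portal, news, shop and archive pages, while hidden-db.example stays invisible because no indexed page links to it.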

Furthermore, even if web site administrators submit their site to a major search engine, there is no guarantee that it will be indexed. In fact, it often takes several attempts to register a site with a search engine, and processing a submission takes two to four weeks on average. In the face of such problems, many sites are never indexed by the main search engines and fall through the cracks.

Search the Invisible Web

InvisibleWeb.com * is a good place to start your search for more specialized information. It calls itself "the search engine of all search engines", and so long as you know roughly what you are looking for, it will provide you with a list of relevant sites where you can continue your search in more detail. For example, if you are looking for newspaper reports on famous incidents, you will only be given links to recognized newspaper archives, and not the unrelated links that general search engines are so fond of throwing up at you.

The reality is that it is quite easy for anyone to set up a site and keep it hidden from public awareness. So long as the search engines cannot find it through links, and nobody ever submits it to the engines, it will effectively remain out of public view.

Given estimates that the Invisible Web is growing at a much faster rate than the rest of the Internet, it is possible to imagine a situation in the near future where the major search engines will simply not be able to cope with the pace of Internet expansion. Let's hope that they never become complacent about their technology. I, for one, will stick with Google as my default home page.


Updated 2020: note that the above post was originally published in 2001, but is left here for archival purposes. * The linked website InvisibleWeb.com is sadly no longer online.