Google big daddy of search engines

Surfers are surprised to find Google can locate local sites faster than NZ-based engines, reports CHRIS DANIELS.
They aren't called engines for nothing.
Without search engines - the driving force behind the internet - it would be nigh on impossible to find anything among the morass of information on the web.
The Search Engine Watch site, which carries reviews and news about search engines, lists facts about just how important they are to the web surfer and to companies with websites.
For instance, search engines are the main way consumers find new sites online, and were used by just under 75 per cent of those surveyed.
Another survey found that one in every 28 page views on the web was a search results page and, for the corporates, 42 per cent of those who bought goods from online sites got there with the help of a search engine.
Finally, Americans experienced "search rage" if they did not find what they wanted within, on average, 12 minutes.
Despite being only a few years old, Google has won and held on to the title of best search engine, especially among home internet users.
More than 130 million searches are done through its system daily, and web surfers are sometimes surprised that it can find New Zealand sites better than search engines based in this country.
The trick to a good search engine appears to lie more in the theory than in the sheer computing grunt behind it. What drives Google is a theory about web popularity: if your site is good, other sites will link to it.
Google software engineer Matt Cutts, speaking from the company's headquarters in the heart of California's Silicon Valley, described to your net how the search engine works.
"Before Google, basically every major search engine more or less went by what was on the actual page. If you wanted to search for red widgets, then some people would put 'red widgets' 1400 times on a page."
That method of searching began to reach its limit, he says. After all, there is only so much to be learned from what is actually on the page.
So the time was ripe for a new idea, and that is just what Google had.
Known as "PageRank", the Google software counts the hyperlinks going to and from a particular website and treats each one as a vote for that site's quality and relevance.
"You could imagine doing something like very simple link counting - if I had 10 links to my page and you had six links to your homepage, at first blush you might think that my page is better because I had 10 links pointing to it," says Mr Cutts.
But numbers are not everything to Google. Where the links come from matters just as much. A high-quality site will itself have many links pointing to it, so its vote carries more weight and lifts the ranking of the pages it links to.
"Just because Joe Schmo points to me, that doesn't mean that that's a really good vote, but if Yahoo or a high-quality site points to that, then that usually means more."
This gives Google a way to quantify the quality of the links pointing at a website.
"It's a little bit self-reverential, but the mathematics actually work out pretty well. If a lot of high-quality people link to the New Zealand Herald, then it's a high-quality site. In turn, the pages it links to are high quality as well."
The Google software runs on thousands of cheap PCs that have been networked together and people do not "touch up" the results.
Advertisements are run on the Google site, but it is made clear that the links are sponsored. It is, says Mr Cutts, impossible to buy your way up the Google search engine ratings.
"There are not really any shortcuts, other than producing a good website. Make a good website and make sure people know about it."
The effectiveness of the PageRank system only improves as the web expands, say the people at Google: with more links being created every day, PageRank has ever more material to work with.
Other major search engines are now doing link analysis in a similar way, although Google does have patents on the software it uses to do all the maths and grade sites.
While the page ranking system is good for finding the home page of a big corporation anywhere in the world, it is less likely to find sites that are off the beaten track.
This takes us to what is known as the "deep web" or "invisible web", which some experts estimate is 500 times bigger than the shallow, surface web of sites that can be found by standard search engines.
One company, Bright Planet, says it can find this information buried deep within databases that need specific questions put to them.
The main search engines have software to "crawl" or "spider" documents by following one hypertext link to another.
When the crawler indexes a page and encounters a hypertext link to another document, it records the link and schedules that new page for later crawling.
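In miniature, that record-and-schedule loop looks something like the following Python sketch. It is a toy, not any engine's real crawler: the seed URL is hypothetical, and a production system would add politeness delays, robots.txt checks and distributed storage:

    # Miniature crawler: fetch a page, record its links, and queue
    # each newly seen page for a later visit.

    from collections import deque
    from html.parser import HTMLParser
    from urllib.parse import urljoin
    from urllib.request import urlopen

    class LinkCollector(HTMLParser):
        """Collects the href targets of <a> tags on a page."""
        def __init__(self):
            super().__init__()
            self.links = []
        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def crawl(seed, limit=10):
        queue, seen = deque([seed]), {seed}
        while queue and len(seen) <= limit:
            url = queue.popleft()
            try:
                html = urlopen(url).read().decode("utf-8", "replace")
            except OSError:
                continue   # unreachable page: skip it
            parser = LinkCollector()
            parser.feed(html)
            for href in parser.links:
                absolute = urljoin(url, href)   # resolve relative links
                if absolute.startswith("http") and absolute not in seen:
                    seen.add(absolute)          # record the new page...
                    queue.append(absolute)      # ...and schedule a visit
        return seen

    # Example with a hypothetical seed URL:
    # print(crawl("http://www.example.com/"))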
If an engine like Google is to discover this information, the web pages must be static and linked to other pages, says Bright Planet.
But on the deep web the information comes from a specialist database, which answers a request only when it is specifically asked. The answer will not, however, remain "posted" on a website for ever.
Bright Planet's search functions are offered to those who want to pay, and it markets itself to companies as providing a way to search down to the bottom of the web to find all sorts of corporate intelligence.
Bright Planet says the deep internet is growing at a much faster rate than the surface web, and around 95 per cent of the information there is free.
It seems, then, that engines such as Google and Alta Vista, powerful as they are, are casting their nets only over the surface, unable to plumb the depths of the information ocean.
But unless you are the Secret Service or an enthusiast for finding obscure information from company databases, old favourites like Google will remain the surfer's best friend for some time to come.