When you search for something with a regular search engine, it almost instantly sorts through the millions of pages it knows about and presents you with the ones that match your topic. The matches are even ranked, so that the most relevant ones come first. Of course, search engines don’t always get it right. Sometimes irrelevant pages make it through, and sometimes it takes a little more digging to find what you are looking for. But, on the whole, search engines work remarkably well.
Unfortunately, search engines can’t ask you a few questions to focus your search. Nor can they rely on judgment and past experience to rank web pages, as humans can. So how do crawler-based search engines decide on relevancy when they have hundreds of millions of web pages to sort through? They follow a set of rules, known as an algorithm. Exactly how a particular search engine’s algorithm works is a closely kept trade secret. Yet all major search engines follow the general rules below.
The chief rule in a ranking algorithm concerns the location and frequency of keywords on a web page. You can call it the location/frequency approach for now. If a librarian needs to find books to match a request for “music”, it makes sense to look first at books with “music” in the title. Search engines work the same way. Pages with the search terms appearing in the HTML title tag are often assumed to be more relevant. Search engines will also check whether the search keywords appear near the top of a web page, such as in the headline or in the first few lines of text. They presume that any page relevant to the topic will mention those words right from the start.
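To make the location idea concrete, here is a minimal sketch in Python. It is not how any real search engine scores pages, since those algorithms are trade secrets. The function names, the weights (2.0 for a title match, 1.0 for a match near the top) and the 50-word cutoff are all assumptions chosen purely for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def location_score(title, body, keyword, lead_words=50):
    """Score a page by WHERE the keyword appears.

    All weights and the lead_words cutoff are hypothetical values
    picked for this example, not taken from any real search engine.
    """
    keyword = keyword.lower()
    score = 0.0
    if keyword in tokenize(title):
        score += 2.0  # assumed weight: keyword appears in the title
    if keyword in tokenize(body)[:lead_words]:
        score += 1.0  # assumed weight: keyword appears near the top
    return score
```

With these assumed weights, a page titled "A Short History of Music" whose text opens with the word "Music" would score 3.0 for the keyword "music", while a page that mentions the word only far down the text, and not in the title, would score 0.0.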
Another major factor in how search engines determine relevancy is frequency. A search engine will evaluate how often the keywords appear in relation to the other words on a web page. Pages with a higher keyword frequency are often considered more relevant than others.
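The frequency idea can be sketched in a similar way. The short Python example below simply divides the number of times a keyword occurs by the total number of words on the page. The function name and the simple word-splitting rule are assumptions for illustration; real search engines use their own undisclosed measures.

```python
import re


def keyword_frequency(text, keyword):
    """Return the share of words in `text` that equal `keyword`.

    A simplified, hypothetical measure: occurrences of the keyword
    divided by the total word count of the page.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0  # an empty page has no keyword frequency
    return words.count(keyword.lower()) / len(words)
```

For example, in the four-word text "music music jazz rock" the keyword "music" makes up two of the four words, so the function returns 0.5.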
Story by petricmassa
