Researching search engines

To promote a site successfully in search engines, you need to understand how they work. Search engine owners guard their algorithms carefully. So where can you get information on how a search engine functions?


Search engines did not arise spontaneously; they build on a large body of work in the field of information search (the discipline of "Information Retrieval"). Most of the basic algorithms are therefore published in scientific papers, and search engines use them in their software with small variations. Search engine employees quite often share details in interviews or on specialized forums. Site promotion experts who talk on forums also give a lot of useful advice.


But reading forums and scientific articles is not the only method. Search engines can also be studied and experimented on. The simplest way is to study the code of the pages that reach the top 10 search results.


What do they have in common? How did this off-topic page make its way to the top? Why was this particular page of the site returned?


The answers you find will clarify the picture and partly reveal the details of the algorithm in use. Newcomers sometimes try to find a magic percentage of keywords in the text or the "correct" title length by averaging data from the pages in the top positions. But the resulting numbers resemble not the philosopher's stone but the weapon of the proletariat, a cobblestone.


The point is that all the ranking factors (and there are dozens of them) are used in combination, so studying one of them without taking the others into account yields no useful information. Multivariate statistical analysis can make the task easier, but that is a subject for a separate, long story.

Sometimes an experiment can clarify the picture. Create ten pages with different keyword densities, place them on freshly created domains (to exclude the influence of extraneous factors), and the search results will show which of the pages ranks higher for the chosen query. It would seem the magic key has been found, but that is not so. Who said the optimal keyword density is the same for different queries and for pages of different lengths? And it is impossible to run experiments that account for all the factors in a reasonable time.
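
To make the experiment with different keyword densities concrete, here is a minimal Python sketch. It assumes the simplest possible definition of density (the share of words on the page that equal a single-word keyword); the function names, the filler text, and the keyword "widget" are all hypothetical and serve only as an illustration.

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Share of words in `text` equal to `keyword` (case-insensitive).

    Assumption: a single-word keyword and a naive word split.
    """
    words = re.findall(r"\w+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


def make_page(keyword: str, repeats: int, filler_words: int = 100) -> str:
    """Build a hypothetical test page: fixed filler plus `repeats` keywords."""
    return " ".join(["filler"] * filler_words + [keyword] * repeats)


# Ten hypothetical test pages that differ only in the number of keyword repeats.
pages = {n: make_page("widget", n) for n in range(1, 11)}
densities = {n: keyword_density(page, "widget") for n, page in pages.items()}
```

Such a script only prepares and measures the test pages. As the text notes, the density that wins for one query and page length need not win for another.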

So we have to return to research.


Here are a couple of recommendations on uncovering algorithms.

First, study how one specific algorithm works instead of searching for the whole "relevance formula" at once. Second, look for examples of queries and matching pages where the algorithm under study shows itself in its purest form.

For example, suppose you are interested in how much a site's weight under the PageRank algorithm influences search results. How do you exclude other factors? Find pages with very similar text (it must not be an exact duplicate, otherwise Google will exclude one of them from the results). Choose a keyword from the text that is formatted identically in both versions and appears in the same page elements (title, body text, meta tags). The word (or phrase) should be rare enough that you do not have to hunt for the pages among millions of others, yet common enough that these two pages are not the only ones returned, and so on. Run the query and compare the positions in the search results. The closer they are, the smaller the influence of PageRank on that query. Repeat the search with ten other pairs of pages to rule out random factors. Comparing the results, you can usually draw conclusions about how important a given factor is and in which cases it is applied.
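
The comparison of paired positions described above can be tallied with a short Python sketch. It assumes that in each pair you already know which page sits on the site with the higher PageRank; the function name and the positions in the example are hypothetical placeholders, not measured data.

```python
from statistics import mean


def summarize_pairs(pairs):
    """Summarize rank gaps for pairs of near-identical pages.

    Each pair is (position_high, position_low): the result positions of
    the page on the higher-PageRank site and of its lower-PageRank twin
    for the same query. Position 1 is the top result.
    """
    if not pairs:
        raise ValueError("at least one pair is required")
    # Positive gap: the higher-PageRank page is closer to the top.
    gaps = [low - high for high, low in pairs]
    return {
        "mean_gap": mean(gaps),
        "share_high_first": sum(g > 0 for g in gaps) / len(gaps),
    }


# Hypothetical positions for three page pairs (placeholders only).
example_pairs = [(3, 9), (5, 6), (12, 10)]
summary = summarize_pairs(example_pairs)
```

A mean gap near zero over many pairs would suggest that the factor matters little for such queries. With only about ten pairs this is a rough indication, not a statistical proof.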

Most importantly, do not forget to think.


Search engines apply particular factors not for the beauty of the formula but so that search results are better. Search effectiveness is usually assessed by two main criteria: recall (completeness) and precision (accuracy). The higher the percentage of relevant documents (those matching the query) among all documents found, the higher the precision. The higher the percentage of relevant documents found among all relevant documents stored in the search engine's index, the better the recall. A concrete implementation of an algorithm is also judged by the resource intensity of the search, both in terms of the volume of stored data and in terms of machine time. A discovered ranking factor, or a detail of it, is plausible only if it can improve these measures without causing a sharp increase in resource requirements.
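
The two criteria above, completeness (recall) and accuracy (precision), can be computed with a minimal Python sketch. It uses the standard information-retrieval definitions, in which recall is measured against the relevant documents in the index; the function name and the document identifiers are hypothetical.

```python
def precision_recall(retrieved: set, relevant: set) -> tuple:
    """Return (precision, recall) for one query.

    retrieved: documents the search engine returned.
    relevant:  documents in the index that actually match the query.
    """
    hits = len(retrieved & relevant)  # relevant documents that were found
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents returned, 3 relevant ones in the index.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d1", "d3", "d5"}
precision, recall = precision_recall(retrieved, relevant)
```

In this example two of the four returned documents are relevant, and two of the three relevant documents were found.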


So, the methods for studying search engine algorithms come down to the following:

Reading scientific articles on search algorithms and specialized forums;

Studying pages from the top of the search results;

Studying a specific algorithm in its purest form;

Applying statistical analysis;

Checking whether the dependencies found improve completeness (recall) or accuracy (precision), or reduce resource intensity.