Alexa and PageRank

Alexa Traffic Rank, PageRank and the search for hosting

In their attempt to gather any and all shreds of data that can help them decide whether a host is good or bad, people sometimes end up unknowingly deceiving themselves: they put more trust than is advisable in various sites and tools and, perhaps even more dangerously, attach false meanings to the data some of these provide. This page deals with Alexa Traffic Rank and PageRank and tries to explain in simple words what they are and how they can be used effectively.

Alexa Traffic Rank

The Alexa Traffic Rank is meant to be a measure of a website's popularity. The whole principle behind it is that each time a person who is using the gizmo called "Alexa Toolbar" visits a page on a site, Alexa knows about it and takes it into account. The more such page views, the better the Traffic Rank of the site (#1 is the best rank). Also, the more unique individuals visit the site, the better the Traffic Rank. So far so good, and it makes sense.

In practice, however, its accuracy is skewed (or biased), and even Alexa itself partially acknowledges this. Not only is the Traffic Rank of sites like Alexa.com, Amazon and others favorably skewed by "natural" factors, as explained on the page I linked to earlier, but certain sites will also be disadvantaged because of the nature of their traffic.

For example, sites whose visitors are, in their vast majority, not especially technically skilled (at least when it comes to the Internet) will be disadvantaged. Most often these visitors simply use the browser that comes with the operating system, and they are even wary of downloading software and installing it on their computers. Most of them have no idea what Alexa is, and there are slim chances of them ever installing the Alexa Toolbar.

On the other hand, sites that attract webmasters, web developers, and other such professionals, who are far more likely to know about Alexa and to install its toolbar on their computers, have higher chances of getting a good Alexa rank (that is, one closer to #1).
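To put rough numbers on this bias, here is a minimal sketch in Python. The visitor counts and toolbar adoption rates are made up purely for illustration, and the function name is hypothetical; Alexa's actual method is more involved than this. It only shows how two sites with identical real traffic can look very different to a toolbar-based panel.

```python
def observed_toolbar_visitors(true_visitors: int, toolbar_share: float) -> float:
    """Expected number of visitors a toolbar-based panel would actually see.

    toolbar_share is the fraction of the site's visitors who have the
    toolbar installed (an assumed figure, not real Alexa data).
    """
    return true_visitors * toolbar_share


# Two hypothetical sites with exactly the same real traffic.
hobby_site = observed_toolbar_visitors(10_000, 0.01)      # assumed: 1% run the toolbar
webmaster_site = observed_toolbar_visitors(10_000, 0.10)  # assumed: 10% run the toolbar

# The panel sees far more of the webmaster-oriented site's visitors,
# so that site would appear more popular than the hobby site.
print(hobby_site, webmaster_site)
```

With these assumed rates, the site aimed at webmasters looks about ten times more popular, even though both sites receive the same number of real visitors.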

To make matters even more complicated when it comes to the accuracy of the Alexa Traffic Rank, it is known that webmasters can abuse the system and improve their sites' Alexa Traffic Rank by generating fake traffic.

PageRank

PageRank is Google's patented way of assigning a certain "trust value" to a public web page, as part of their search engine ranking algorithm. To see the PageRank of a page, one needs to install the Google Toolbar (available, at the time of this writing, for Internet Explorer and Firefox). The PageRank, as it is presented to us, is a value between 0 and 10.

The logic behind PageRank is that valuable, trustworthy, authoritative web pages will generally be popular, and that this popularity can be measured by seeing how many other web pages link to this one page. Each link is considered to be a positive vote for that page. The PageRank (popularity) of the page that issues a vote (a link) is also important. A page with a higher PageRank has more voting power, which is distributed among the pages it links to. Obviously all this makes some sense, and virtually all search engines today use a more or less similar technique as part of their algorithms.
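The voting idea described above can be sketched in a few lines of Python. This is only the simplified textbook formulation, not what Google actually runs, and it produces a raw score rather than the 0 to 10 value shown by the toolbar. The page names, the tiny link graph, the damping factor of 0.85 (a commonly cited value) and the number of iterations are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated vote passing.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to roughly 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page gets a small baseline score regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page's voting power is split among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: three pages link to "host", and "host" links to "a".
graph = {
    "a": ["host"],
    "b": ["host"],
    "c": ["host", "a"],
    "host": ["a"],
}
scores = pagerank(graph)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph "host" ends up with the highest score because it collects the most votes, and "a" comes second largely because it receives the full vote of the high-scoring "host" page. Note that nothing in the computation knows whether a link was meant as praise or as a complaint.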

One of the most obvious inherent flaws of this is that not all links are positive votes. At least in theory, a page could be widely criticized, which in turn would make it popular in terms of inbound links, but that popularity would be a negative one.

Say a hosting company is simply despicable and people, its ex-customers, often complain about it. In this process, they will sometimes link to the company's website. Each such link would be interpreted by Google as a positive vote, and this would lead to the search engine assigning it a higher PageRank.

Then there's also the influence of what I would call "impersonal" links. These are links that should not bear any kind of value. Some such links are those from certain web directories. While some directories agree to list only sites that, after human evaluation, are judged to meet a certain minimum subjective "value" threshold, many directories do not review the sites at all. But their link, even though it bears no actual value, will be taken into account as a positive vote by the search engine.

Even directories that do take the time to evaluate sites sometimes can't, and don't, do much in terms of evaluating. A hosting company will generally be listed in a directory as long as the site works and appears to be properly maintained. That, however, says absolutely nothing about the quality of the services rendered by the company. No real vote is being cast, but the search engine will consider it to be a positive one.

If you thought things weren't complicated enough already, don't worry, they get more complicated. Website owners in general, and hosting companies in particular, can buy links from other websites, be it in the form of banner ads or simply in the form of text links, and in this way they can actively influence the way search engines perceive them. In other words, they can actively influence the PageRank of a page.

Leaving all those critiques aside for a moment, let's assume a best case scenario: that a wonderful hosting company gets popular as a consequence of its high quality service and that this is what makes its site have a high PageRank. Years later, due to various causes, the company can suddenly become a shadow of its former self; the quality is now gone. Such swift changes in quality are rather common in the hosting industry.

Despite the decrease in service quality, the PageRank of the site is likely to remain largely unchanged for a long time, even years - for as long as the links to it remain in place. You'd think that the links to the site would disappear as soon as word gets out that the company is not what it used to be, but things don't happen that way on the Internet. The inertia is quite significant. There are sites and pages that haven't been updated for years. Yet another reason to not consider PageRank a measure of service quality.

Using Alexa Traffic Rank

After taking a close look at what Alexa Traffic Rank really stands for, I can't find a very effective way of using it when it comes to selecting a hosting company, especially considering that most hosting companies won't have huge traffic anyway. While some might consider a host with significant traffic to be successful and more stable, and reckon that this also makes it more trustworthy, I cannot concur.

At most, I will take notice of a very poor Alexa ranking (one far from #1), because it may suggest a new, and possibly inexperienced, hosting provider. Further investigation would be needed, though, to determine whether that is the case.

Using PageRank

PageRank can be used in a few ways when choosing a host, but only as a telltale sign. I find that I don't put a special weight on high PageRank values, but I do pay a bit of attention when I see a very low PageRank.

Why is that? For one, Google bans sites that use questionable tactics to improve their search engine rankings. In such situations the reported PageRank is either 0 (PR0) or the bar simply doesn't show anything; it remains gray. 

However, if the PageRank is 0 (zero), it may very well be that the site is simply very new and Google has not yet finished estimating (and updating) the actual PR of that page. A Whois search on the domain can help decide whether this is the case. If the Whois record shows an old domain, there's still the possibility that the site was not actually developed and advertised in its earlier years of existence. Archive.org could be used in this case, in an attempt to see the site's evolution over time, or the lack of any record of it.

It might already sound complicated, but it gets worse: archive.org is not flawless, and site owners can even choose not to have their sites indexed by it. So even if it has no record of the site we're investigating, there remains a small chance that the site was actually developed and open for business years ago.

If the site really is old and appears to be quite established on the net (a search for the domain name will give another rough estimate of its popularity), and yet the PR is 0, there are chances that a Google ban of some kind is in place. But then there's always the possibility of a Google Toolbar quirk as well.
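For those who prefer to see the reasoning laid out step by step, the PR0 checks described above can be written as a small Python sketch. Everything here is hypothetical: the function name, the one-year threshold for "very new", and the inputs, which you would gather by hand from a Whois lookup, archive.org and a search for the domain name. It is a rough reading of the heuristic, not a reliable detector of anything.

```python
from datetime import date


def interpret_pr0(domain_registered: date, has_archive_history: bool,
                  appears_established: bool, today: date) -> str:
    """Suggest how to read a PageRank of 0, based on manually gathered facts."""
    age_years = (today - domain_registered).days / 365.25

    # Assumed threshold: under a year old counts as "very new".
    if age_years < 1:
        return "probably too new to have a PageRank yet"

    # Old domain, but no sign that a real site existed until recently.
    # Archive.org is not flawless, so this is only a hint.
    if not has_archive_history and not appears_established:
        return "old domain, but possibly only recently developed"

    # Old, visibly established, and still PR0: worth a closer look.
    if appears_established:
        return "possible Google ban or toolbar quirk; investigate further"

    return "inconclusive; investigate further"


# Example with made-up dates: an old, well-known site that shows PR0.
verdict = interpret_pr0(date(2001, 5, 1), True, True, today=date(2008, 1, 1))
print(verdict)
```

Every branch ends in a hint rather than a verdict, which is the whole point: a PR0 only tells you where to look next.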

Tired and confused already?

That would be no wonder. I'm getting confused as I'm writing it myself, so let's try to put the whole thing in a very simple form. Alexa Traffic Rank and PageRank cannot be used as direct measures of a host's service quality or trustworthiness. They're simply not designed to be used for that. If we're really keen to use them for something, we can use them as warning triggers, and investigate further if the situation dictates so.