Observations on Web Site Stats and Ranks
I’ve been looking around the web for a discussion of various ways of keeping and portraying web site statistics and also for page ranking information. I’ve also had an opportunity to observe a few differences on my own sites and sites I manage. This involves around thirty domains for several different companies including both of my own companies, Neufeld Computer Services, and Energion Publications.
I’m not going to present charts of statistics. You can find some around the web. I am going to present my observations and a small amount of the data behind them.
Ranking Systems
I have used three ranking systems to follow my own sites and/or blogs: Technorati authority, Google Page Rank, and Alexa. Of these, Technorati seems to be primarily a way of comparing popularity of blogs, not by the number of readers, but by the number of links to the blog content. I have found it to be relatively consistent and to give a decent idea of which direction your blog is going.
Google page rank appears at the moment to be largely a prestige thing, but to the extent that I can observe, it seems to be fairly accurate. This doesn’t mean there aren’t significant glitches, with lousy content moving to the top of the heap and good content lost at the bottom. But if I look at my own book catalog, for example, books that I know have been reviewed online many times will have better page rank than books that have not received such attention. My current understanding is that page rank doesn’t play much role in the placement of search results, but that could change. I suspect that some of the same logic used for page rank does play such a role.
Alexa is another matter, as its rank is based on visits to the site. According to the Alexa web site, results aren’t reliable for sites that are not in the top 100,000. I can’t test this directly, as none of my sites are in the top 100,000. I do have a number that are in the 200,000 range, however, along with many that are less popular.
A few weeks ago I decided to try an experiment with Alexa rankings on my least popular sites. These were sites for individual books that are only linked from the book’s catalog page, and thus get only a subset of those who might be interested in that book. Examples include lukestudy.com, liberalcharismatic.com, speakforgod.info, and grieftolight.com. All four of these sites were in the several million range when I started the experiment.
What I did was visit them with a browser equipped with the Alexa toolbar once daily for a few weeks. This was just a quick visit, long enough to allow the page to load. According to Alexa, again, they count only one visit from the same IP each day, and they do have a factor for traffic without the toolbar, but visits by browsers equipped with the toolbar are factored in.
If you check those sites you can see the results. At one point all were under 1,000,000, a change that varied from -2,000,000 to -7,000,000 approximately. A couple are now over 1,000,000, because the experiment is over. What I found was that once I got into the 500,000 to 1,000,000 ranks, my own visits had little impact. There are other ways to confuse the ranking system, but I suspect that the Alexa rule of only counting one visit per IP per day reduces the possibility for really gaming the system.
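To make the one-visit-per-IP-per-day rule concrete, here is a minimal sketch of how such a rule limits repeat visits. It is only an illustration of the rule as described above, not Alexa's actual method, which I don't know. The function name, the IP addresses, and the dates are all made up for the example.

```python
from datetime import date

def count_daily_unique_visits(visits):
    """Count visits, keeping at most one per (IP address, day) pair."""
    return len({(ip, day) for ip, day in visits})

# Made-up log entries for illustration: (visitor IP, date of visit).
visits = [
    ("203.0.113.5", date(2008, 1, 1)),
    ("203.0.113.5", date(2008, 1, 1)),   # same IP, same day: ignored
    ("203.0.113.5", date(2008, 1, 2)),   # same IP, new day: counted
    ("198.51.100.7", date(2008, 1, 1)),  # different IP: counted
]

print(count_daily_unique_visits(visits))  # 3
```

Under a rule like this, reloading a page all day from one connection adds nothing beyond the first visit, which fits with my single daily visits being all that mattered.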
While I do not recommend or approve of gaming the system, I do think it is valuable to know just how reliable a ranking system is. If one can alter it just through one’s own single visits, then one can’t be sure just what the competition is doing. It appears, however, that Alexa is quite correct–these lower traffic sites simply don’t get enough of a sample to make their statistics usable. My take would be that below 1,000,000 Alexa stats have a shred of validity, below 500,000 they are often reliable, while only below 100,000 are they truly accurate, which accords reasonably well with what Alexa says about them.
I was unable to push any site below 500,000, and the couple that I tried tend to remain in the same range even without my attention. It appears that the ranking becomes progressively more reliable, as one would expect, as one approaches 100,000. The only surprise there was that a single daily visit from a browser equipped with the Alexa toolbar had that much impact. (Note that I use a variety of browsers with various toolbars to observe web sites.)
Now for stats. I have kept an eye on several sources of stats, including AWStats, Webalizer, SiteMeter, wp-stats (for my WordPress blogs/CMS sites), and AdSense impressions. I have observed for some time that many of these indicators do not agree on the popularity of a web site. It is not simply a difference in the totals. Sometimes one indicator will go up while another will go down. I have had days on which a site shows more traffic via its stats program, while AdSense reports far fewer impressions. SiteMeter and wp-stats don’t always agree, though they are very close.
I found quite a number of claims on the web that Webalizer reports higher numbers than AWStats. I have only been using the two side by side for a week, but my results differ a bit here. For any blog or CMS based site, Webalizer shows higher page views and, to a lesser extent, higher visits. In some cases the page views in Webalizer are more than double those in AWStats. But for simple page based sites the numbers are closer, and I even found some in which Webalizer reported fewer page views than AWStats. I don’t know why this is and I intend to keep observing. The difference is about 25% overall for the set of sites I used: during the week in question, AWStats reported 80,000 page views and Webalizer reported 100,000 for the same set.
Because those methods that I would a priori regard as more accurate (wp-stats and SiteMeter) tend strongly to lower numbers, generally substantially lower than either AWStats or Webalizer, I am inclined to believe AWStats more than Webalizer.
Note, of course, that SiteMeter requires that its graphic on your page be loaded, so quick passes by your page might not count. I’m not certain, but I’m guessing wp-stats might be subject to a similar problem. Also, wp-stats is not going to count reads in an RSS reader, but then neither will anything else. They might, however, count the RSS readers’ loads of the feed from your site.
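For anyone checking the arithmetic, the 25% figure takes the AWStats total as the base. Here is a small sketch of that calculation using the two weekly totals above; the function name is just something I made up for the example.

```python
def percent_higher(base, other):
    """Return how much higher `other` is than `base`, as a percentage of `base`."""
    return (other - base) * 100 / base

awstats_page_views = 80_000     # weekly total reported by AWStats
webalizer_page_views = 100_000  # weekly total reported by Webalizer

# Webalizer relative to AWStats
print(percent_higher(awstats_page_views, webalizer_page_views))  # 25.0
# AWStats relative to Webalizer
print(percent_higher(webalizer_page_views, awstats_page_views))  # -20.0
```

The same gap reads as 25% higher or 20% lower depending on which program you treat as the baseline, which is worth keeping in mind when comparing figures quoted around the web.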
None of this is terribly scientific, but I do feel that I’m getting a better handle on just what each of these methods is good for–and what it is not.