When dealing with blogs, we track a lot of stats. To get the most accurate numbers, we use several different tools, whether it’s StatCounter, Google Analytics, Compete, Alexa, Quantcast, or one of the many other stat tracking services out there. The pro (and the con) of using so many different stat trackers is that all of the numbers differ. Sometimes drastically. Here we’re going to break down how each of the stat trackers gets its numbers and discuss possible reasons why they might differ.
In general, there are two ways for a stat service to measure visits to your site. One is to extrapolate from web usage patterns that the service collects from a sample of users and analyzes. The other is to measure hits to your site directly.
Let’s start with the first approach. There are two major stat provider/analysis companies that use the collect & analyze approach: Compete and Alexa.
Compete’s clickstream data are collected from a 2,000,000 member panel of US Internet users (about a 1% sample), using diverse sources.
Alexa computes traffic rankings by analyzing the Web usage of millions of Alexa Toolbar users and data obtained from other, diverse traffic data sources.
The way these two sites work is that a certain number of users install a toolbar (or some other type of software) on their computers. The sites then use the toolbars to analyze the web traffic patterns of those users. Using that data, they extrapolate numbers to apply to the entire population of web users. Here’s another way to look at it. Nielsen ratings for TV shows are obtained by recording the TV watching patterns of a specific number of households. Nielsen then scales those numbers up to estimate the total number of people who watched a show, based on that sample group.
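To make the extrapolation concrete, here is a minimal sketch in Python. It is not Compete’s or Alexa’s actual method, which is not public in this level of detail. The function name and the figure of 150 panel visitors are made up for illustration, and the population of 200,000,000 is simply what a 2,000,000 member panel at a 1% sample implies.

```python
def extrapolate_visitors(panel_visitors: int, panel_size: int, population: int) -> float:
    """Scale the visitors seen in a panel up to the whole population."""
    if panel_size <= 0:
        raise ValueError("panel_size must be positive")
    # Each panel member stands in for (population / panel_size) real users.
    return panel_visitors * population / panel_size


PANEL_SIZE = 2_000_000      # panel size quoted by Compete
POPULATION = 200_000_000    # assumption: implied by "about a 1% sample"

# Hypothetical: 150 panel members visited the blog this month.
estimate = extrapolate_visitors(150, PANEL_SIZE, POPULATION)

# One extra panel member visiting moves the estimate by 100 visitors.
estimate_plus_one = extrapolate_visitors(151, PANEL_SIZE, POPULATION)
```

With these numbers each panel member stands in for 100 real users, so for a small blog a handful of panel visits more or fewer swings the estimate noticeably.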
My general feeling is that while these sites might provide interesting demographic information or even comparison information, they will be far less accurate than traffic recorded directly from your website.
Next, let’s look at statistics providers that measure hits to your site directly. This is usually achieved by having you install a script in your web template. Two stat trackers that I use that fall into this category are StatCounter and Google Analytics. But even between those two, the stats delivered vary. Here’s a chart that shows a sample of Google Analytics numbers vs. StatCounter numbers.
The next question is – if both StatCounter and Google Analytics measure traffic to the site directly – why are the numbers different? Here are some reasons that might explain the discrepancy:
- If a visitor returns to your site within 30 minutes of their last activity on the site, Google Analytics will count it as a new page view, not a new visit. StatCounter allows you to manually set the idle time between visits. By default it is set to 30 minutes but can be changed to anything from 30 minutes to 1 week. So if the idle time between visits is set to 2 hours on StatCounter, and a visitor returns after 1 hour, StatCounter would count it as a page view in the same visit while Google Analytics would count it as a new visit. The difference can be addressed by setting the idle time in StatCounter to 30 minutes as well.
- BOTS! Google Analytics will not track visits from bots. StatCounter probably does count visits from bots.
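The idle-time effect is easy to see in a small sketch. This is not either service’s real code, just the general rule that a new visit starts when the gap since the previous hit is longer than the idle timeout. The function name and the timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def count_visits(hit_times, idle_timeout):
    """Count visits for one visitor.

    A new visit starts whenever the gap since the previous hit
    is longer than idle_timeout.
    """
    visits = 0
    previous = None
    for t in sorted(hit_times):
        if previous is None or t - previous > idle_timeout:
            visits += 1
        previous = t
    return visits


# Hypothetical visitor: two page views, then back one hour later.
hits = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 9, 10),
    datetime(2024, 1, 1, 10, 10),
]

visits_30_min = count_visits(hits, timedelta(minutes=30))  # 2 visits
visits_2_hours = count_visits(hits, timedelta(hours=2))    # 1 visit
```

The same three page views become two visits under a 30 minute timeout and one visit under a 2 hour timeout, which is exactly the kind of gap you see between two trackers with different settings.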
What about AWStats, you ask? While it’s appealing to have a stat tracker that measures direct server hits, my AWStats gives numbers that are much, MUCH higher than either Google Analytics or StatCounter. This is probably because AWStats will track server calls from bots and search engine crawlers, which will have a big impact on your numbers. (For example, AWStats reported almost 3 times more visits and 8 times more page views than Google Analytics for the month charted above.) I don’t believe that AWStats is as viable a tool for collecting web marketing data. (And it’s not anywhere near as user-friendly as Google Analytics or StatCounter.)
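To get a rough feel for how much of a server-side count comes from bots, you can split hits by user agent. The sketch below is not how AWStats or any other tool actually classifies traffic. The marker list and the function names are hypothetical, the sample user agent strings are simplified, and real bot detection is a lot messier than a substring check.

```python
# Hypothetical, far-from-complete list of substrings that suggest a crawler.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_probable_bot(user_agent: str) -> bool:
    """Guess whether a user agent string belongs to a bot."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def split_hits(user_agents):
    """Return (human_hits, bot_hits) for a list of user agent strings."""
    bot_hits = sum(1 for ua in user_agents if is_probable_bot(ua))
    return len(user_agents) - bot_hits, bot_hits


# Simplified sample of user agents as they might appear in a server log.
sample = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/120.0",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) Safari/605.1.15",
]

human_hits, bot_hits = split_hits(sample)  # 2 human, 2 bot
```

In this tiny sample half of the hits are crawlers, so a tool that counts every server call would report double the traffic of one that ignores them.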
So which is more accurate? The answer is probably both. I suspect that StatCounter overcounts and Google Analytics undercounts, and that averaging the two might get you very close to the actual number. But unless you want to go through your actual server logs (which I honestly don’t want to do), there’s no way to know for certain. Both stat providers have pros and cons, which we can go into in another post. But at the very least this should help explain why your numbers are different.