Alexa.com or the "measurable web" delusion.

Wednesday, 29 June, Year 8 d.Tr. | Author: Mircea Popescu

Consider these two screenshots, taken a day apart :

[screenshot : alexa-1]

[screenshot : alexa-2]

So, according to Alexa, yesterday Trilema's "traffic rank" was 424`043, meaning that of the whole wide Internet 424`042 websites had more "traffic"[1] in the preceding three months. The next day, only 422`974 websites had more traffic in the preceding three months.

To model this wonder : taking the simple power law f(x) = a x^-1.1, and fitting it so f(1) = 1`000`000`000 (ie, one billion) would then require a = 1`000`000`000. Therefore, the traffic[2] of the 424`043rd item would be 645, whereas the traffic of the 422`974th item would be 647. In other words, if the #1 item gets 1bn traffics[3] and the #5 item gets 170 mn traffics, then the 100th item gets 6.3 mn traffics and the difference between Trilema being 422`974 and 424`042 is 2 traffics.[4]
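For anyone wishing to check the arithmetic, here is a minimal sketch of the model above. The function and constant names are made up for illustration ; the only inputs are the exponent and the one billion figure from the text.

```python
# Hypothetical names; the two constants are the ones assumed in the text.
A = 1_000_000_000   # fitted so that f(1) = one billion traffics
EXPONENT = -1.1     # the assumed power-law exponent


def traffic_at_rank(rank: int, a: float = A, exponent: float = EXPONENT) -> float:
    """Modelled traffics of the item at the given rank: f(x) = a * x**exponent."""
    return a * rank ** exponent


# The ranks discussed in the text.
for rank in (1, 5, 100, 422_974, 424_043):
    print(f"rank {rank:>7}: {traffic_at_rank(rank):,.0f} traffics")
```

The last two lines of output are the ones that matter : over a thousand ranks apart, and about two traffics of difference.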

Except of course we didn't account for the part where traffics are counted over 3 months and the rank differentials were taken on consecutive days. Therefore, it'd be more proper to say that according to Alexa Trilema had 645 traffics in the interval March 29th - June 28th, and 647 traffics in the interval March 30th - June 29th, or in other words had 2 more traffics on June 29th as compared to March 29th. This view would be fine and reasonable, except it entirely invalidates the graph. Judging from there, the March 29th rank was below the 1mn mark, which, by our trusty formula, should mean up to 251 traffics, not anything close to 645.
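The rolling window argument is simple enough to sketch. The daily counts below are invented purely for illustration (they are not Alexa's numbers, nor Trilema's) ; the point is only that two consecutive windows differ by the day added minus the day dropped. The last lines evaluate the same assumed formula at rank one million.

```python
# Made-up daily counts, purely illustrative (not Alexa or Trilema data).
daily = [(day * 37) % 11 for day in range(93)]

WINDOW = 92  # March 29th to June 28th, inclusive

first_window = sum(daily[0:WINDOW])        # day 0 .. day 91
second_window = sum(daily[1:WINDOW + 1])   # day 1 .. day 92

# Everything in between cancels out: only the dropped and the added day matter.
window_delta = second_window - first_window
print(window_delta, daily[WINDOW] - daily[0])

# The model's value at rank 1mn, ie the ceiling for anything ranked worse.
ceiling_at_1mn = 1_000_000_000 * 1_000_000 ** -1.1
print(round(ceiling_at_1mn))
```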

But hey, maybe the graph is wrong, what. Would this be the first time a graph published on a website is wrong ?

Fine, let's move on. Yesterday Trilema had a ranking of 192`559 in the US ; today it's lower, at 198`706. So it has more traffics, but fewer of them from the US. The same is true of the other 4 locations listed, with the exception of the UK (124`953 to 117`358) : from 16`937 to 18`216 in Taiwan ; from 142`658 to 173`395 in Japan and from 30`748 to 30`756 in Romania. It is therefore implied that even as the total traffics to Trilema increased over the interval, its demographic composition changed : fewer people from the US, TW, JP and RO, and more from the UK and other unspecified locations. Since the locations come with percentages, which refer to a largely unmoving absolute (645 to 647 in our model), we can do even better fittings for our power law! Consider :

(a_US * 192559^-?) / (a_US * 198706^-?) = 24.3 / 24.

Is that right ? Well, it's got to be : if traffic went from 0.243 X to 0.24 X, rank went from 192`559 to 198`706. Let's resolve this then,

192559^x / 198706^x = 24.3 / 24
(192559 / 198706)^x = 24.3 / 24
x = log(24.3 / 24) / log(192559 / 198706)
x = −0.395322708

Check out this wonder : our initial assumption was perhaps mistaken, -1.1 is too sharp a power law for the Internet. To calculate the other four :

x_TW = −0.571106719
x_JP = −1.133451426
x_UK = −2.105927668
x_RO = 193.856200248

Turns out... they're really all over the place. Even giving the maximal benefit of rounding to the RO values, for instance, wouldn't fix the curve : 5.8 is still lower than 6.1 no matter how you fiddle it, which makes the exponent positive and the function not a power law. Turns out... the -1.1 I literally pulled out of my ass is actually a lot closer than the very data Alexa provides[5] !
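The five exponents can be recomputed with a short sketch. The names are hypothetical. The ranks are the ones quoted above ; the shares for TW, JP, UK and RO are taken from the percentages listed further down, paired to countries on the assumption that they follow the same order as the ranks (an inference, but one that reproduces the quoted exponents).

```python
import math

# (rank day 1, rank day 2, share % day 1, share % day 2).
# Pairing shares with countries is an assumption, see the note above.
observations = {
    "US": (192_559, 198_706, 24.3, 24.0),
    "TW": (16_937, 18_216, 22.1, 21.2),
    "JP": (142_658, 173_395, 12.6, 10.1),
    "UK": (124_953, 117_358, 8.5, 9.7),
    "RO": (30_748, 30_756, 5.8, 6.1),
}


def implied_exponent(rank_1: int, rank_2: int, share_1: float, share_2: float) -> float:
    """Solve (rank_1 / rank_2)**x = share_1 / share_2 for x."""
    return math.log(share_1 / share_2) / math.log(rank_1 / rank_2)


exponents = {country: implied_exponent(*obs) for country, obs in observations.items()}
for country, x in exponents.items():
    print(f"x_{country} = {x:.6f}")

# Average of the four that at least have the right sign.
negative = [x for x in exponents.values() if x < 0]
mean_negative = sum(negative) / len(negative)
print(f"mean of the negative exponents = {mean_negative:.6f}")
```

Any logarithm base works here, since the base cancels in the ratio.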

But let's get back : the percents provided for the specified locations sum up to 24.0 + 22.1 + 12.6 + 8.5 + 5.8 = 73% of all Trilema traffics according to Alexa in the March 29th - June 28th interval, and to 24.3 + 21.2 + 10.1 + 9.7 + 6.1 = 71.4% of all Trilema traffics in the March 30th - June 29th interval. This means a 1.6% (ie, 645 * 0.016 =~ 10 traffics) contraction of the five sources listed, consisting of a ~8 traffics increase from the UK and thus an 18 traffics decrease from the US, TW, JP and RO, that is in turn offset by a 20 traffics increase from "everywhere else". Considering the threshold for being listed is ~40 traffics, the change isn't enough to bump the putative sixth party into visibility.
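The percentage arithmetic above, as a minimal sketch. The names are hypothetical, the shares are copied in the order the two sums list them, and the fourth entry is assumed to be the UK (going by the ~8 traffics increase). Everything is converted on the 645 traffics base of the model.

```python
# Shares (percent) in the order of the two sums in the text.
shares_day1 = [24.0, 22.1, 12.6, 8.5, 5.8]
shares_day2 = [24.3, 21.2, 10.1, 9.7, 6.1]

BASE = 645  # modelled traffics for the first interval
UK = 3      # assumed position of the UK in both lists


def to_traffics(points: float, base: int = BASE) -> float:
    """Convert percentage points of the total into modelled traffics."""
    return base * points / 100


total_change = sum(shares_day2) - sum(shares_day1)  # all five sources
uk_change = shares_day2[UK] - shares_day1[UK]       # the UK alone
rest_change = total_change - uk_change              # the other four

print(f"sums: {sum(shares_day1):.1f}% then {sum(shares_day2):.1f}%")
print(f"five sources: {to_traffics(total_change):+.1f} traffics")
print(f"UK:           {to_traffics(uk_change):+.1f} traffics")
print(f"other four:   {to_traffics(rest_change):+.1f} traffics")
```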

All told, the picture Alexa paints can be summed as follows :

  • That the concentrative behaviour of people in the US and the UK is so variant as to account for a difference between ~-0.4 and -2.1 in power law exponents. Thus, according to Alexa, if five USians are asked to split a pie, they will get, from most to least, 28.4%, 21.6%, 18.4%, 16.4% and 15%. Meanwhile, if five UKians are asked to split a pie, the slices will be 70.5%, 16.4%, 7.0%, 3.8% and 2.3%. Good thing they're fighting inequality in the US, wouldn't you say! Who knew the UK was that inequalitous! Good thing they quit the EU huh!
  • That graphs have absolutely no relation to numbers. It's not like anyone went to school, learned functional analysis or can draw a graph by hand through the age old longhand of finding inflection points, fitting sections etc.
  • That a 6% increase in traffics can and does happen in a single day, sloshing around randomly. Specifically in the case of Trilema, a gain of 40 traffics from "countries other than the top 5" happened on June 29th as compared to March 29th. That the UK increased its 3 month value by 1.2% overnight, which is to say that June 29th sees 100% more Brits than average for the interval is one thing - but that Taiwan managed to drop almost a full percent is a little thicker to swallow - unless I was suddenly banned there, or they merged with tachyons and are now doing negative interactions with the physical reality around.
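The pie slices in the first item follow from normalising a power law over five ranks. A minimal sketch, with a hypothetical function name and the two exponents computed earlier :

```python
def pie_shares(exponent: float, n: int = 5) -> list[float]:
    """Split one pie among n ranked parties in proportion to rank**exponent."""
    weights = [rank ** exponent for rank in range(1, n + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


for label, exponent in (("US", -0.395322708), ("UK", -2.105927668)):
    slices = ", ".join(f"{share:.1%}" for share in pie_shares(exponent))
    print(f"{label}: {slices}")
```

The flatter the exponent, the more even the slices ; at -2.1 the first party walks off with over two thirds of the pie.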

Leaving aside that even if they actually used sound math, and their graphs worked on the basis of the data, which data were consistent, Alexa wouldn't actually provide any sort of knowledge value[6], for this is all moot : Alexa doesn't even use math, nor is its data consistent in the vaguest of senses.

Wat do, wat do!

———
  1. A meaningless notion, in theory as well as any practical implementation (I have more examples than words). See also this footnote for a historical overview (which is proper, all this "traffic" bullshit is naught but a footnote in the history of the Internet) as well as this for a complete discussion of the economic incentives to perpetuate the fraud/lie/delusion, whatever you prefer calling it. Just as long as you don't pretend it's "business" or "economic growth".
  2. No need for quotes in this context, as we're discussing something quite exactly like what "traffic" actually is.
  3. It should be plainly obvious that Alexa has absolutely no fucking clue as to what's going on, and the measurements they proffer are necessarily denominated in phantasmagoric units.
  4. How do you distinguish 424`042 - 422`974 = 1`068 items on the basis of 2 discrete units ? And don't tell me the billion's short : in order for any of this to make sense you'd need trillions of traffics, which isn't fucking happening.
  5. (−0.395322708−0.571106719−1.133451426−2.105927668)÷4 = −1.05145213, so my -1.1 is closer than every single data point. And no, I didn't fiddle it, just wrote down the first thing that came to mind.
  6. Because self-reporting is not a source of information ; and because the sort of idiot that'd wear an Alexa toolbar isn't representative for anyone or anything outside the mineral regnum ; and for many other varied reasons.
Category: Meta psihoza

2 Responses

  1. What's next, article re astrology not actually predicting fates?

    Alexa et al was always a transparent sham - how on earth could anyone imagine that they actually had the pertinent data?

  2. Mircea Popescu
    Wednesday, 29 June 2016

    Internet blog for navigators, which it is!
