Set GA sample rate through config

reddit uses Google Analytics[0] as a tool to track events on the reddit.com
website, which allows for gathering page load and user event data while
keeping users anonymized. However, given the high volume[1] of traffic
that reddit receives, the data collection limit[2], even with a premium
account, is often exceeded by a wide margin.

Wikipedia states[3] that "... sampling is concerned with the selection of
a subset of individuals from within a statistical population to estimate
characteristics of the whole population." Using this principle, we can
send only a portion of user events to the Google Analytics collection
endpoints rather than the entire data set. This still gives a reasonable
approximation of global user behavior while staying within the data
usage limits defined by Google Analytics.
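
As a rough illustration of the sampling principle above, the sketch below
assigns each visitor to a stable bucket and scales a sampled count back up
to a population estimate. The names `bucketFor`, `isSampled` and
`estimateTotal` are hypothetical, and the hash is a stand-in; this is not
how Google Analytics itself selects visitors.

```javascript
// Hypothetical helper: maps a visitor ID to a stable bucket in [0, 100).
// This is NOT Google Analytics' own hashing; it only illustrates the idea.
function bucketFor(visitorId) {
  var hash = 0;
  for (var i = 0; i < visitorId.length; i++) {
    // Simple rolling hash, kept as an unsigned 32-bit value with >>> 0.
    hash = (hash * 31 + visitorId.charCodeAt(i)) >>> 0;
  }
  return hash % 100;
}

// A visitor is in the sample when its bucket falls below the rate (1-100).
function isSampled(visitorId, sampleRate) {
  return bucketFor(visitorId) < sampleRate;
}

// Scale a count observed in the sample back up to a population estimate.
function estimateTotal(sampledCount, sampleRate) {
  return Math.round(sampledCount * 100 / sampleRate);
}
```

Under these assumptions, 500 page views recorded at a 50% sample rate
would be reported as an estimated `estimateTotal(500, 50)`, i.e. 1000.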

In order to achieve this, the Google Analytics javascript library
provides a method to set a sampling rate[4], a percentage from 1-100.
By calling:

```
_gaq.push(['_setSampleRate', '80']);
```

One can set the sample rate to 80% of users. In reddit's case, I suggest
a default sampling rate of 50%. Here, I have added the `_setSampleRate`
command to the `_gaq` queue created within `utils.html`. It gets its
value from the config, which makes the value easy to change and avoids
a 'magic value' set in multiple places in the code.
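
To show how a configured value could feed the `_gaq` queue, here is a
minimal sketch. `sampleRateCommand` is a hypothetical helper and is not
part of ga.js or of this commit; the literal `'50'` stands in for the
value the template would render from the config.

```javascript
// Hypothetical helper, not part of ga.js or reddit's code base.
// Returns the '_setSampleRate' command for a configured rate, or null
// when the value is missing or outside the documented 1-100 range.
function sampleRateCommand(configValue) {
  var rate = parseInt(configValue, 10);
  if (isNaN(rate) || rate < 1 || rate > 100) {
    return null;
  }
  // The rate is passed as a string, as in the '_setSampleRate' call above.
  return ['_setSampleRate', String(rate)];
}

var _gaq = [];  // on a real page this would be `var _gaq = _gaq || [];`
var command = sampleRateCommand('50');  // '50' stands in for the config value
if (command) {
  _gaq.push(command);
}
_gaq.push(['_trackPageview']);
```

Skipping the command when the value is empty or invalid leaves sampling
at the library default instead of pushing a malformed rate.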

[0] - https://www.reddit.com/help/privacypolicy#p_22
[1] - https://www.reddit.com/r/AskReddit/about/traffic
[2] - https://support.google.com/analytics/answer/1070983?hl=en
[3] - http://en.wikipedia.org/wiki/Sampling_(statistics)
[4] - https://developers.google.com/analytics/devguides/collection/gajs/methods/gaJSApiBasicConfiguration#_gat.GA_Tracker_._setSampleRate
Jack Lawson
2014-10-20 14:28:18 -07:00
parent f69f9d5e08
commit 3aff785a95
2 changed files with 6 additions and 0 deletions

@@ -119,6 +119,9 @@ clicktracker_url = /click
uitracker_url = /pixel/of_discovery.png
# google analytics token
googleanalytics =
# google analytics events sampling rate. Valid values are 1-100.
# See https://developers.google.com/analytics/devguides/collection/gajs/methods/gaJSApiBasicConfiguration#_gat.GA_Tracker_._setSampleRate
googleanalytics_sample_rate = 50
# secret used for signing information on the above tracking pixels
tracking_secret = abcdefghijklmnopqrstuvwxyz0123456789

@@ -406,6 +406,9 @@ ${unsafe(txt)}
['_setCustomVar', 2, 'srpath', '${tracking.get_srpath()}', 3],
['_setCustomVar', 3, 'usertype', user_type, 2],
['_setCustomVar', 4, 'uitype', '${uitype}', 3],
%if g.googleanalytics_sample_rate:
['_setSampleRate', '${g.googleanalytics_sample_rate}'],
%endif
['_trackPageview']
);