David King f63050fb14 Make the mr_top reducer run sort(1) with LC_ALL=C
By default, Linux's sort(1) uses locale-based sorting. Normally this is what
humans want, but for mapreduce it breaks the guarantee that the same reducer
always sees each instance of the same key. Here's an example:

    user/comment/top/week/1102922   26  1453516098.92   t1_cz8jgq9
    user/comment/top/week/1102922   3   1453527927.97   t1_cz8ovzj
    user/comment/top/week/11029224  1   1453662674.45   t1_cza98tb
    user/comment/top/week/1102922   4   1453515976.97   t1_cz8jee8
    user/comment/top/week/1102922   4   1453519790.67   t1_cz8lavb
    user/comment/top/week/11029224  2   1453827188.31   t1_czcotf1
    user/comment/top/week/1102922   7   1453521946.74   t1_cz8mb50
    user/comment/top/week/1102922   7   1453524230.93   t1_cz8ncj2
    user/comment/top/week/1102922   7   1453527760.32   t1_cz8otkx
    user/comment/top/week/1102922   7   1453528700.96   t1_cz8p6u3
    user/comment/top/week/11029228  1   1453285875.44   t1_cz525gu
    user/comment/top/week/11029228  1   1453292202.65   t1_cz53ulm
    user/comment/top/week/11029228  1   1453292232.55   t1_cz53uxe

According to sort(1) using the default locale, this is already sorted.
Unfortunately, that means that to a reducer this list represents 6 different
listings (each of which will overwrite the previous runs of the same listing).
But that's not what we want. It's actually three listings, like:

    user/comment/top/week/1102922   26  1453516098.92   t1_cz8jgq9
    user/comment/top/week/1102922   3   1453527927.97   t1_cz8ovzj
    user/comment/top/week/1102922   4   1453515976.97   t1_cz8jee8
    user/comment/top/week/1102922   4   1453519790.67   t1_cz8lavb
    user/comment/top/week/1102922   7   1453521946.74   t1_cz8mb50
    user/comment/top/week/1102922   7   1453524230.93   t1_cz8ncj2
    user/comment/top/week/1102922   7   1453527760.32   t1_cz8otkx
    user/comment/top/week/1102922   7   1453528700.96   t1_cz8p6u3
    user/comment/top/week/11029224  1   1453662674.45   t1_cza98tb
    user/comment/top/week/11029224  2   1453827188.31   t1_czcotf1
    user/comment/top/week/11029228  1   1453285875.44   t1_cz525gu
    user/comment/top/week/11029228  1   1453292202.65   t1_cz53ulm
    user/comment/top/week/11029228  1   1453292232.55   t1_cz53uxe
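
As a rough illustration of the difference, the sketch below counts the consecutive runs of equal keys in both orderings, assuming a streaming reducer that starts a new listing whenever the key changes. The `count_runs` helper is hypothetical and not part of mr_top; the example is written for Python 3 and relies on Python's built-in `sorted`, which compares these ASCII strings in the same order as a raw byte comparison.

```python
from itertools import groupby

# Keys from the example above, in the order the locale-aware sort left them.
locale_order = [
    "user/comment/top/week/1102922",
    "user/comment/top/week/1102922",
    "user/comment/top/week/11029224",
    "user/comment/top/week/1102922",
    "user/comment/top/week/1102922",
    "user/comment/top/week/11029224",
    "user/comment/top/week/1102922",
    "user/comment/top/week/1102922",
    "user/comment/top/week/1102922",
    "user/comment/top/week/1102922",
    "user/comment/top/week/11029228",
    "user/comment/top/week/11029228",
    "user/comment/top/week/11029228",
]


def count_runs(keys):
    """Count consecutive runs of equal keys, as a streaming reducer sees them."""
    return sum(1 for _key, _group in groupby(keys))


# Interleaved keys: every change of key starts a new (overwriting) listing.
locale_runs = count_runs(locale_order)

# Byte-ordered keys: each key forms exactly one contiguous run.
byte_runs = count_runs(sorted(locale_order))

print(locale_runs, byte_runs)
```

With the interleaved input the reducer sees six runs; once the keys are in byte order it sees one run per distinct key.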

To do this, we need to set the environment variable LC_ALL=C when running sort(1)
to indicate that the sorting should operate only on the raw bytes.
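
A minimal sketch of that idea, assuming sort(1) is invoked from Python 3 through `subprocess` and that fields are tab-separated. The `sort_bytes` helper is hypothetical and is not the actual change made to mr_top; it only shows one way to pass LC_ALL=C to the child process.

```python
import os
import subprocess


def sort_bytes(lines):
    """Sort lines with sort(1) in the C locale, i.e. by raw byte values."""
    # Copy the current environment and override only LC_ALL for the child.
    env = dict(os.environ, LC_ALL="C")
    data = "".join(line + "\n" for line in lines).encode("utf-8")
    proc = subprocess.run(
        ["sort"],
        input=data,
        stdout=subprocess.PIPE,
        env=env,
        check=True,
    )
    return proc.stdout.decode("utf-8").splitlines()


rows = [
    "user/comment/top/week/11029224\t1\t1453662674.45\tt1_cza98tb",
    "user/comment/top/week/1102922\t4\t1453515976.97\tt1_cz8jee8",
    "user/comment/top/week/1102922\t26\t1453516098.92\tt1_cz8jgq9",
]
sorted_rows = sort_bytes(rows)
for row in sorted_rows:
    print(row)
```

Because the tab separator has a lower byte value than any digit, every row for the shorter key sorts ahead of the rows for keys that merely share its prefix.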

It looks like this has been broken since the Trusty Tahr upgrade.
2016-02-18 15:35:02 -08:00

reddit

Greetings!

This is the primary codebase that powers reddit.com.

For notices about major changes and general discussion of reddit development, subscribe to the /r/redditdev and /r/changelog subreddits.

You can also chat with us via IRC in #reddit-dev on freenode.


Quickstart

To set up your own instance of reddit to develop with, we have a handy install script for Ubuntu that will automatically install and configure most of the stack.

Alternatively, refer to our Install Guide for instructions on setting up reddit from scratch. Many frequently asked questions regarding local reddit installs are covered in our FAQ.

APIs

To learn more about reddit's API, check out our automated API documentation and the API wiki page. Please use a unique User-Agent string and take care to abide by our API rules.
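
As a small illustration of the User-Agent advice, the sketch below builds (but does not send) a request with a descriptive, unique User-Agent using Python 3's standard library. The application name, version, username, and the example URL are all placeholders chosen for illustration; check the API rules for the exact format they ask for.

```python
from urllib.request import Request

# Hypothetical client identity; replace every part with your own details.
USER_AGENT = "linux:com.example.myapp:v0.1 (by /u/example_user)"


def build_request(url):
    """Return a Request carrying a unique User-Agent instead of the library default."""
    return Request(url, headers={"User-Agent": USER_AGENT})


# Illustrative URL only; nothing is sent until the request is opened.
req = build_request("https://www.reddit.com/r/redditdev/about.json")
print(req.get_header("User-agent"))
```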

Happy hacking!

Issues and Contribution Guidelines

Thanks for wanting to help make reddit better! First things first, though: GitHub Issues is only for confirmed, active bugs. Please submit ideas to /r/ideasfortheadmins.

Please read more on contributions in CONTRIBUTING.md.
