Link.tracking_link adds utm query parameters onto links inside reddit
to track which button was clicked, what component type contains the button
(e.g. post listing, inbox, post listing embedded on a comments page),
what page type the user is visiting, and the page's subreddit.
This feature is enabled via a feature flag and disabled in admin mode.
This commit affects programmatically-generated links to comments and messages.
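As a rough illustration of the idea (not the actual Link.tracking_link code), appending UTM parameters to an existing URL might look like the sketch below. The helper name and the mapping of button, component, page type, and subreddit onto specific utm_* keys are assumptions made for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_tracking_params(url, button, component, page_type, subreddit=None):
    """Hypothetical helper: append utm_* parameters to a reddit link.

    The utm key assignments below are assumed for illustration only.
    """
    params = {
        "utm_source": "reddit",
        "utm_medium": component,            # e.g. "listing", "inbox"
        "utm_name": subreddit or "frontpage",
        "utm_content": button,              # which button was clicked
        "utm_term": page_type,              # page type the user is visiting
    }
    parts = urlsplit(url)
    # Keep any query parameters the link already has (e.g. ?context=3).
    query = parse_qsl(parts.query, keep_blank_values=True)
    existing = {key for key, _ in query}
    # Do not overwrite tracking parameters that are already present.
    query.extend((k, v) for k, v in params.items() if k not in existing)
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Existing parameters are preserved so that links such as comment permalinks with a context argument keep working.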
For domain listings, results from the wrong domain could be included. Switching to a phrase search fixes that. See https://redd.it/5ew4ve for more info about, and caveats of, this method.
For AllMinus, see https://redd.it/5eqiyy for the general context. This does not need to be explicitly handled for ModMinus, as ModMinus excludes the filtered subreddits from its sr_ids property. Additionally, ModMinus doesn't need special handling since it inherits the already-handled MultiReddit.
The most common, and I really mean it, the most common post in /r/help and related subreddits is when users question why they can't make a subreddit even though they have an old account. The error that users get only mentions account age and does not mention any kind of karma requirement. It's in the /r/help FAQ, but we all know no one reads that.
Just lightly mentioning karma could reduce the traffic on /r/help and make the error clearer.
These were originally meant to prevent users from buying ads when
managed campaign page takeovers were in effect. With auction this
no longer matters since impressions are no longer guaranteed.
More recently it seems roadblocks were used to prevent people from
buying ads in places that were inappropriate. We now have other
ways of dealing with that (`subreddit.hide_sponsored_headlines` and
`subreddit.allow_ads`) so these pages are completely obsolete.
Previously the messages in the non-fastlane queue were dropped so some
older comments would not be added to the CommentTree. There may be some
lock contention with multiple queue consumers processing updates for the
fastlaned links, but it shouldn't last very long.
This was used to gradually ramp up reads of the precomputed comment
orders. We've been running for a while with this set to always read,
so the setting can be removed.
This will let us get a sense of how much work is actually done. I'm looking
at splitting the CommentOrderer update out into a separate queue and need
to understand how many writes actually happen.
This warning is no longer true--any missing scores are automatically
calculated and updated.
We actually have the opposite issue--the CommentTree must be updated
before writing scores because the QA score reads it.
Score updates are processed through commentstree_q. When a new comment
is created, an automatic initial vote (by the comment's author) is also created.
This results in two messages in commentstree_q: one from the vote and one
from queries.new_comments. Don't create the message from the vote because it
is redundant. This will let us reduce the volume of messages in commentstree_q,
which is currently very high.
Instead of checking _featurestate_cache for a key's existence and then
retrieving it, just get it and then check for a miss. Doing the two-step
process can result in a KeyError if _featurestate_cache is cleared between
the existence check and the retrieval.
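A minimal sketch of the single-lookup pattern follows. The function name, the sentinel, and the compute callback are hypothetical; only the "get, then check for a miss" idea comes from this change.

```python
# Sentinel so that a cached value of None is not mistaken for a miss.
_MISSING = object()


def get_cached_state(cache, key, compute):
    """Hypothetical helper showing the get-then-check-for-miss pattern.

    A racy version would do `if key in cache: return cache[key]`, which
    can raise KeyError if the cache is cleared between the two steps.
    """
    value = cache.get(key, _MISSING)   # single lookup, never raises
    if value is _MISSING:
        value = compute(key)
        cache[key] = value
    return value
```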
The "email", "authorize", "hc", and "traffic" databases aren't used very
often, so we may be able to reduce the number of connections to pg-05 by
waiting to establish a connection until it's actually needed.
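The deferred-connection idea can be sketched as below. The LazyConnection class and its connect callback are hypothetical and stand in for whatever the real database layer does; the lock is an assumption for multi-threaded callers.

```python
import threading


class LazyConnection:
    """Hypothetical wrapper that connects on first use, not at startup."""

    def __init__(self, connect):
        self._connect = connect        # zero-argument factory (assumed)
        self._conn = None
        self._lock = threading.Lock()

    def get(self):
        # Double-checked locking: only the first caller opens a connection.
        if self._conn is None:
            with self._lock:
                if self._conn is None:
                    self._conn = self._connect()
        return self._conn
```

A process that never touches one of the rarely used databases never opens a connection to it.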
The fastlane processor was meant to handle votes on both Links and Comments,
but it can't do that easily anymore now that the vote processing has been
split. It's not a big deal because Link vote processing is much faster
now that the query updating has been separated and sharded. The Comment vote
consumer/queue was getting some benefit from the fastlane, so the fastlane
can be resurrected if we run into problems.
Change the error from a 500 to a line in the error log with more info about what failed and why.
This seems to be the result of someone's crappy votebot.
The queue can be sharded by domain to minimize lock contention
and the consumer will batch updates to the same links (e.g. several
votes for the same link) and to the same domain (e.g. votes for different
links to a single domain).
The queue can be sharded by subreddit id to minimize lock contention
and the consumer will batch updates to the same links (e.g. several
votes for the same link) and to the same subreddit (e.g. votes for different
links submitted to a single subreddit).
The queue can be sharded by author id to minimize lock contention
and the consumer will batch updates to the same links (e.g. several
votes for the same link) and to the same author (e.g. votes for different
links submitted by a single author).
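The sharding and batching described in the three queue changes above (by domain, by subreddit id, by author id) can be sketched generically as follows. The hash choice, the function names, and the (shard_key, link_id) message shape are assumptions for illustration, not the actual queue code.

```python
import zlib
from collections import defaultdict


def shard_for(key, num_shards):
    """Map a shard key (domain, subreddit id, author id) to a shard.

    All messages for one key land on one shard, so only one consumer
    contends for that key's lock. crc32 is an assumed hash choice.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_shards


def batch_messages(messages):
    """Group (shard_key, link_id) messages pulled from a single shard.

    Repeated votes on one link collapse to a single update, and links
    sharing a key are grouped so that key is updated once per batch.
    """
    batches = defaultdict(set)
    for shard_key, link_id in messages:
        batches[shard_key].add(link_id)
    return {key: sorted(links) for key, links in batches.items()}
```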
The raven client inspects the traceback and attempts to figure out which
parts belong to the app and which belong to external libraries. It uses
a whitelist of paths to identify application code. Previously we had been
using the list of repository names, but that was incorrect because the
plugin "liveupdate" is actually called "reddit_liveupdate" in the traceback.
The whitelist was also incomplete because it didn't account for scripts run with
`paster run`, which can have a path like /opt/something/script.py.
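To show why the repository name fails as a whitelist entry, here is a simplified prefix check. It is an assumed stand-in for how frames get classified, not the raven client's actual logic, and the function name is hypothetical.

```python
def is_app_frame(frame_name, include_prefixes):
    """Hypothetical check: does a traceback frame belong to the app?

    `frame_name` is a module name or file path as it appears in the
    traceback; `include_prefixes` is the whitelist.
    """
    return frame_name.startswith(tuple(include_prefixes))


# The repository is named "liveupdate", but frames show "reddit_liveupdate".
by_repo_name = ["r2", "liveupdate"]
by_package_name = ["r2", "reddit_liveupdate", "/opt/something/"]
```

With by_repo_name, a frame such as "reddit_liveupdate.controllers" is classified as library code. With by_package_name it is classified as application code, and so is a script path like /opt/something/script.py.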
This creates a sys.excepthook handler that reports any exceptions
to Sentry. This results in errors being reported twice when in script
mode because the exception is re-raised and caught by that handler.
We can conduct experiments that impact how pages are rendered across
users, bucketing pages according to the fullname, so that search engines
will crawl and index the same experimental content that users see. We
support subreddit listings pages, comments pages, and comment permalink
pages. We use the link fullname for both comments pages and comment
permalink pages, so that they are bucketed together.
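A sketch of deterministic page bucketing under these rules is below. The hashing scheme, the bucket count, the function names, and the use of the subreddit fullname for listing pages are assumptions; only the use of the link fullname for comments and permalink pages is stated above.

```python
import hashlib


def bucketing_fullname(page_type, subreddit_fullname=None, link_fullname=None):
    """Pick the fullname a page is bucketed by (hypothetical helper)."""
    if page_type == "listing":
        return subreddit_fullname      # assumed for listing pages
    if page_type in ("comments", "comment_permalink"):
        # Both page types use the link fullname so they bucket together.
        return link_fullname
    raise ValueError("unsupported page type: %r" % (page_type,))


def bucket_for(experiment, fullname, num_buckets=1000):
    """Hash experiment name plus fullname into a stable bucket number.

    The result depends only on the page, never on the visitor, so
    crawlers and users see the same variant. sha1 is an assumed choice.
    """
    digest = hashlib.sha1(("%s:%s" % (experiment, fullname)).encode("utf-8"))
    return int(digest.hexdigest(), 16) % num_buckets
```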
We don't want to spend crawl budget or rank on what are essentially
duplicate pages. In case we have inbound links to these pages, we don't
want the robots.txt to prevent crawlers from accessing them.