mirror of
https://github.com/reddit-archive/reddit.git
synced 2026-04-27 03:00:12 -04:00
If you go to a userpage and sort by top (in either the overview or comments tabs) and restrict the time range to anything other than "all time", no comments are shown.

The data in these listings is built by functions in `lib/db/queries.py` (specifically from `get_comments()` down). This ends up trying to pull the query results from permacache (in `CachedResults.fetch_multi()`), defaulting to an empty list if no cache entry is found.

The cache entry is supposed to be populated periodically by a cronjob that calls `scripts/compute_time_listings`. This script (and its Python helpers in `lib/mr_top.py` and `lib/mr_tools/`) generates a dump of data from PostgreSQL, then reads through the dump and builds up entries to insert into the cache. As with many scripts of this sort, it expects some of its input to be bad, and so performs some basic sanity checks.

The problem is that the sanity checks have been throwing out all comments. With no comments getting through, there's nothing new to put into the cache! The root of this was a refactoring in reddit/reddit@3511b08 that combined several different scripts that were doing similar things. Unfortunately, we ended up requiring the `url` field on comments, which doesn't exist because, well, comments aren't links.

Now we have two sets of fields that we expect to get, one for comments and one for links, and all is good. We also now print a one-line summary of processed/skipped entries, which will help make a problem like this more obvious in the future.
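As a rough illustration of the fix described above (not the actual code in `lib/mr_top.py` or `lib/mr_tools/`), a per-type sanity check with a processed/skipped summary might look like the sketch below. The field names, the `REQUIRED_FIELDS` table, and the `is_valid()` and `filter_things()` helpers are all hypothetical, and the sketch is written for Python 3 rather than whatever the real scripts target.

```python
# Hypothetical required-field sets; the real script's field names may differ.
# The key point: only links are required to carry a `url`.
REQUIRED_FIELDS = {
    "comment": {"author_id", "ups", "downs", "date"},
    "link": {"author_id", "ups", "downs", "date", "url"},
}


def is_valid(thing_type, fields):
    """Return True if `fields` has a non-empty value for every field
    required for `thing_type`. Unknown types are treated as invalid."""
    required = REQUIRED_FIELDS.get(thing_type)
    if required is None:
        return False
    # A value of 0 (e.g. zero upvotes) counts as present; only None and
    # the empty string are treated as missing.
    return all(fields.get(name) not in (None, "") for name in required)


def filter_things(things):
    """Split an iterable of (thing_type, fields) pairs into kept entries
    and return them along with a one-line processed/skipped summary."""
    kept = []
    skipped = 0
    for thing_type, fields in things:
        if is_valid(thing_type, fields):
            kept.append((thing_type, fields))
        else:
            skipped += 1
    summary = "processed: %d, skipped: %d" % (len(kept), skipped)
    return kept, summary
```

With a single shared field set that included `url`, every comment would fail `is_valid()` and the summary would show all comments as skipped, which is exactly the kind of signal the new one-line summary is meant to surface.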