queries: Move /r/all/comments to new query cache (try 3).

The first time we tried to move /r/all/comments to the new query cache,
the row quickly grew to be massive because tombstones were piling up in
the row cache. The row started taking seconds to retrieve after only 12
hours.

We reverted.

We took gc_grace_seconds down to 30 minutes, which is relatively safe in
the query cache (prunes can be re-executed without issue, and lost
deletes of non-pruned things will be covered by keep_fns).
Additionally, we switched the relevant column families to the leveled
compaction strategy.
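
As a rough sketch (not part of this diff), that kind of schema change
can be made with pycassa's SystemManager; the server, keyspace, and
column family names below are placeholders rather than the real ones:

    from pycassa.system_manager import SystemManager

    sys_mgr = SystemManager('localhost:9160')
    # placeholder keyspace/CF names, illustrative values only
    sys_mgr.alter_column_family(
        'reddit', 'SubredditQueryCache',
        gc_grace_seconds=30 * 60,  # 30 minutes, down from the 10-day default
        compaction_strategy='LeveledCompactionStrategy',
    )
    sys_mgr.close()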

Then we tried this again. This time, things ran fine for three days
before we started seeing out-of-memory issues on the nodes responsible
for this key.  The row size was rather large again.

We reverted.

Now, we're trying again with three more changes, working on the
hypothesis that runaway growth of a hot query can happen because prunes
start failing after a small bad spike.

* tdb_cassandra.max_column_count has been drastically reduced in favor
  of xget for the models that actually need to fetch hugely wide rows.
  This reduces memory pressure from materialized thrift buffers in
  general, and especially when the row grows large for whatever reason
  (see the first sketch after this list).

* The pruning behaviour has been tweaked to only prune a portion of the
  extraneous columns when there are a large number of them (see the
  second sketch after this list).  This should reduce the likelihood
  that prunes will fail after a row has grown too large.

* This query is now in its own column family, which is designed to have
  its row cache disabled.
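
To illustrate the first change above, here is a hypothetical contrast
between a bounded get (capped by max_column_count) and pycassa's
streaming xget; cf, row_key, and handle() are stand-ins, not code from
this commit:

    # bounded fetch: materializes up to column_count thrift columns at once
    columns = cf.get(row_key, column_count=100)

    # streaming fetch: iterates the row in buffer_size chunks, so a very
    # wide row never has to be held in memory all at once
    for name, value in cf.xget(row_key, buffer_size=1000):
        handle(name, value)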
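
And a minimal sketch of the partial-pruning idea from the second change;
the cap and helper name are made up for illustration:

    MAX_PRUNE = 1000  # hypothetical per-pass cap

    def prune_extraneous(mutator, query, extraneous_columns):
        # delete only a bounded slice per pass so a single prune never
        # has to tombstone a huge number of columns after a spike
        doomed = extraneous_columns[:MAX_PRUNE]
        if doomed:
            mutator.delete(query, doomed)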

Why bother shoehorning this query into this data model, you say? It's a
canary for extreme scaling of other queries.  If we can't fix this
problem for this query, we should re-evaluate the whole data model.
Neil Williams
2012-10-10 16:53:59 -07:00
parent 2d2a8d887b
commit 13497f6f08
2 changed files with 24 additions and 8 deletions


@@ -33,7 +33,8 @@ from r2.models.promo import PROMOTE_STATUS, get_promote_srid
 from r2.models.query_cache import (cached_query, merged_cached_query,
                                    CachedQuery, CachedQueryMutator,
                                    MergedCachedQuery)
-from r2.models.query_cache import UserQueryCache, SubredditQueryCache
+from r2.models.query_cache import (UserQueryCache, SubredditQueryCache,
+                                   HotQueryCache)
 from r2.models.query_cache import ThingTupleComparator
 from r2.models.last_modified import LastModified
 from r2.lib.utils import SimpleSillyStub
@@ -469,10 +470,10 @@ def user_query(kind, user_id, sort, time):
     q._filter(db_times[time])
     return make_results(q)
 
+@cached_query(HotQueryCache)
 def get_all_comments():
     """the master /comments page"""
-    q = Comment._query(sort = desc('_date'))
-    return make_results(q)
+    return Comment._query(sort=desc('_date'))
 
 def get_sr_comments(sr):
     return _get_sr_comments(sr._id)
@@ -839,7 +840,7 @@ def new_comment(comment, inbox_rels):
         if comment._deleted:
             job_key = "delete_items"
             job.append(get_sr_comments(sr))
-            job.append(get_all_comments())
+            m.delete(get_all_comments(), [comment])
         else:
             job_key = "insert_items"
             if comment._spam:
@@ -1163,6 +1164,7 @@ def _common_del_ban(things):
 def unban(things, insert=True):
+    query_cache_inserts = []
     query_cache_deletes = []
 
     by_srid, srs = _by_srid(things)
@@ -1200,8 +1202,8 @@ def unban(things, insert=True):
             query_cache_deletes.append([get_spam_links(sr), links])
         if insert and comments:
-            add_queries([get_all_comments(), get_sr_comments(sr)],
-                        insert_items=comments)
+            query_cache_inserts.append((get_all_comments(), comments))
+            add_queries([get_sr_comments(sr)], insert_items=comments)
             query_cache_deletes.append([get_spam_comments(sr), comments])
         if links:
@@ -1212,6 +1214,9 @@ def unban(things, insert=True):
             query_cache_deletes.append([get_spam_filtered_comments(sr), comments])
 
     with CachedQueryMutator() as m:
+        for q, inserts in query_cache_inserts:
+            m.insert(q, inserts)
+
         for q, deletes in query_cache_deletes:
             m.delete(q, deletes)
@@ -1327,8 +1332,8 @@ def run_new_comments(limit=1000):
         fnames = [msg.body for msg in msgs]
         comments = Comment._by_fullname(fnames, data=True, return_dict=False)
 
-        add_queries([get_all_comments()],
-                    insert_items=comments)
+        with CachedQueryMutator() as m:
+            m.insert(get_all_comments(), comments)
 
         bysrid = _by_srid(comments, False)
         for srid, sr_comments in bysrid.iteritems():


@@ -554,3 +554,14 @@ class UserQueryCache(_BaseQueryCache):
 class SubredditQueryCache(_BaseQueryCache):
     """A query cache column family for subreddit-keyed queries."""
     _use_db = True
+
+
+class HotQueryCache(_BaseQueryCache):
+    """A query cache for very hot single-key queries.
+
+    Some queries such as all_comments appear to cause rowcache related issues
+    due to the high volume of writes happening to the single row. This column
+    family is intended to house such queries. The row cache should be disabled
+    here to prevent these issues.
+    """
+    _use_db = True