reddit/scripts/indextank_backfill.py
KeyserSosa 9a4271f641
Upgrade Instructions
====
   * Uninstall the python Cassandra package that we previously depended on (it has a namespace conflict with the new package we depend on).  To find it:

     $ python -c "import cassandra; print cassandra.__file__"

    and rm -r the .egg directory that follows "site-packages/" in the printed path.

   * This version relies on cython, so if "make" fails, you may have to install cython via your distro's package manager.

     $ cd reddit/r2
     $ python setup.py develop # possibly with "sudo" depending on your install
     $ make

   * Cassandra is now required for the caching layer.  An example storage-conf.xml can be found in reddit/srv/cassandra.  Make sure that the additional <Keyspace> items are included in your conf file.

   * Remove the query_queue_reader services if they are running, and add the new gen_time_listings.sh cron jobs instead.  Suggested cron:

    0    */2 *   *   *   $SCRIPTS/gen_time_listings.sh year '("month","year")'
    */5  *   *   *   *   $SCRIPTS/gen_time_listings.sh week '("day","week")'
    *    *   *   *   *   $SCRIPTS/gen_time_listings.sh hour '("hour",)'

   where $SCRIPTS is the path to the directory containing gen_time_listings.sh.
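
   For the first upgrade step, the location of the old Cassandra package can also be found from a script. The sketch below is only an illustration: package_location is a hypothetical helper, not part of reddit, and it assumes a Python 3 interpreter (importlib.util is not available in the Python 2 this release targets).

```python
import importlib.util


def package_location(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        # Nothing importable under this name is installed.
        return None
    # For a regular package this is the path to its __init__.py.
    return spec.origin


if __name__ == "__main__":
    print(package_location("cassandra"))
```

   If the printed path sits inside an .egg directory under "site-packages/", that .egg directory is the one to remove.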

Features and Bugfixes
====
 * Mobile reddit:
   * templates are in r2/templates of the form *.compact
   * css is r2/public/static/css/compact.css
     * beginning of a sass-based (http://sass-lang.com/) compact.scss
   * reachable via .compact extension or from the "i" subdomain.
 * Cassandra is now *required*, and votes are currently written out to both cassandra and postgres (part of an eventual migration).
 * attempt to make the db connection code a little smarter.
   * A dropped DB connection will mark the connection dead and randomly attempt to reconnect.
   * A dropped db connection on start will permanently mark the connection as dead.
 * Calculate the time-filtered top/controversy listings using mapreduce instead of prec_links (new cron job in reddit/scripts)
 * allow the default user/pass for a database to be specified as '*', which falls back on db_user and db_pass in the INI file
 * Search feedback buttons
 * make deleted comments not show up in your inbox.
 * move last_visited into cassandra
 * Swallow rare, race-conditiony POST_save/hide/subscribe problems
 * Apparently we haven't been breaking properly for the past few weeks.
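
 The '*' fallback for database credentials described above could be sketched as follows. This is not reddit's actual code: resolve_db_credentials is a hypothetical helper, and it assumes the per-database user and password arrive as plain strings and the INI values as a dict with db_user and db_pass keys.

```python
def resolve_db_credentials(user, password, ini):
    """Swap '*' placeholders for the INI-wide db_user / db_pass defaults."""
    if user == "*":
        user = ini["db_user"]
    if password == "*":
        password = ini["db_pass"]
    return user, password


if __name__ == "__main__":
    # Example values only; real settings come from the site's INI file.
    ini = {"db_user": "reddit", "db_pass": "password"}
    print(resolve_db_credentials("*", "*", ini))
    print(resolve_db_credentials("analytics", "*", ini))
```

 Each field is resolved independently, so a database can override the user while still inheriting the shared password.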
2010-06-16 17:01:50 -07:00

# The contents of this file are subject to the Common Public Attribution
# License Version 1.0. (the "License"); you may not use this file except in
# compliance with the License. You may obtain a copy of the License at
# http://code.reddit.com/LICENSE. The License is based on the Mozilla Public
# License Version 1.1, but Sections 14 and 15 have been added to cover use of
# software over a computer network and provide for limited attribution for the
# Original Developer. In addition, Exhibit A has been modified to be consistent
# with Exhibit B.
#
# Software distributed under the License is distributed on an "AS IS" basis,
# WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for
# the specific language governing rights and limitations under the License.
#
# The Original Code is Reddit.
#
# The Original Developer is the Initial Developer. The Initial Developer of the
# Original Code is CondeNet, Inc.
#
# All portions of the code written by CondeNet are Copyright (c) 2006-2010
# CondeNet, Inc. All Rights Reserved.
################################################################################
from r2.models import Account, Link, Comment, Trial, Vote, SaveHide
from r2.lib import amqp
from time import sleep
from r2.lib.db.operators import asc, desc
from pylons import g


def run(verbose=True, sleep_time = 60, num_items = 1):
    # The cursor is the _id of the last Link handled; it must be seeded in
    # the cache before this script is started.
    key = "indextank_cursor"
    cursor = g.cache.get(key)
    if cursor is None:
        raise ValueError("%s is not set!" % key)
    cursor = int(cursor)

    while True:
        if verbose:
            print "Looking for %d items with _id < %d" % (num_items, cursor)
        # Walk Links newest-first, resuming just after the cursor's Link.
        q = Link._query(sort = desc('_id'),
                        limit = num_items)
        q._after(Link._byID(cursor))
        last_date = None
        for item in q:
            cursor = item._id
            last_date = item._date
            # Queue the item for the indextank consumer.
            amqp.add_item('indextank_changes', item._fullname,
                          message_id = item._fullname,
                          delivery_mode = amqp.DELIVERY_TRANSIENT)
        # Persist progress so a restart resumes from the same place.
        g.cache.set(key, cursor)

        if verbose:
            if last_date:
                last_date = last_date.strftime("%Y-%m-%d")
            # Note: this reports the requested batch size, not the number
            # of items the query actually returned.
            print ("Just enqueued %d items. New cursor=%s (%s). Sleeping %d seconds."
                   % (num_items, cursor, last_date, sleep_time))
        sleep(sleep_time)
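
The cursor logic in run() can be tried without a reddit install. The sketch below is a standalone approximation under stated assumptions: a dict stands in for g.cache, a list of integer ids for the Link table, and a plain list for the amqp queue. backfill_step is a hypothetical name, and unlike run() it returns the number of items actually enqueued.

```python
def backfill_step(cache, ids, queue, key="indextank_cursor", num_items=1):
    """Enqueue up to num_items ids strictly below the stored cursor."""
    cursor = cache.get(key)
    if cursor is None:
        raise ValueError("%s is not set!" % key)
    cursor = int(cursor)

    # Newest-first among ids older than the cursor, mirroring the
    # desc('_id') sort plus _after() in the original query.
    batch = sorted((i for i in ids if i < cursor), reverse=True)[:num_items]
    for item_id in batch:
        queue.append(item_id)  # stand-in for amqp.add_item(...)
        cursor = item_id

    # Persist progress so the next step resumes where this one stopped.
    cache[key] = cursor
    return len(batch)


if __name__ == "__main__":
    cache = {"indextank_cursor": 5}
    queue = []
    while backfill_step(cache, [1, 2, 3, 4, 5, 6], queue, num_items=2):
        print("cursor=%s queue=%s" % (cache["indextank_cursor"], queue))
```

Returning the batch size gives the caller a way to stop once the backfill reaches the oldest item, whereas the original loop keeps polling until it is killed.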