reddit/r2/supervise_watcher.py
KeyserSosa 9a4271f641 Upgrade Instructions
====
   * Uninstall the python Cassandra package that we previously depended on (it has a namespace conflict with the new package we depend on).  To find it:

     $ python -c "import cassandra; print cassandra.__file__"

    then rm -r the .egg directory that follows "site-packages/" in the printed path.

   * This version relies on cython, so if "make" fails, you may have to install cython via your distro's package manager.

     $ cd reddit/r2
     $ python setup.py develop # possibly with "sudo" depending on your install
     $ make

   * Cassandra is now required for the caching layer.  An example storage-conf.xml can be found in reddit/srv/cassandra.  Make sure that the additional <Keyspace> items are included in your conf file.

   * Remove the query_queue_reader services if they are running, and add the new gen_time_listings.sh cron jobs instead.  Suggested cron:

    0    */2 *   *   *   $SCRIPTS/gen_time_listings.sh year '("month","year")'
    */5  *   *   *   *   $SCRIPTS/gen_time_listings.sh week '("day","week")'
    *    *   *   *   *   $SCRIPTS/gen_time_listings.sh hour '("hour",)'

   where $SCRIPTS is the location of the reddit/scripts directory.
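
For the first upgrade step, the old package can also be located without importing it. This is only a minimal sketch, assuming a Python 3 interpreter is at hand (the one-liner above uses Python 2 print syntax); find_package_dir is a hypothetical helper name and not part of reddit.

```python
import importlib.util
import os


def find_package_dir(name):
    """Return the directory a top-level package would load from, or None.

    find_spec() only locates a top-level package; it does not import it.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(os.path.abspath(spec.origin))


# Prints None when no "cassandra" package is installed; otherwise prints
# the directory to inspect for a stale .egg under site-packages/.
print(find_package_dir("cassandra"))
```

If the printed path sits inside an .egg directory under site-packages/, that .egg directory is the one the instructions ask you to remove.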

Features and Bugfixes
====
 * Mobile reddit:
   * templates are in r2/templates of the form *.compact
   * css is r2/public/static/css/compact.css
     * beginning of a sass-based (http://sass-lang.com/) compact.scss
   * reachable via .compact extension or from the "i" subdomain.
 * Cassandra is now *required*, and votes are currently written out to both cassandra and postgres (part of an eventual migration).
 * attempt to make the db connection code a little smarter.
   * A dropped DB connection will mark the connection dead and randomly attempt to reconnect.
   * A dropped db connection on start will permanently mark the connection as dead.
 * Calculate the time-filtered top/controversy listings using mapreduce instead of prec_links (new cron job in reddit/scripts)
 * allow default user/pass for database to be specified with '*' to fall back on db_user and db_pass in the INI file
 * Search feedback buttons
 * make deleted comments not show up in your inbox.
 * move last_visited into cassandra
 * Swallow rare, race-conditiony POST_save/hide/subscribe problems
 * Apparently we haven't been breaking properly for the past few weeks.
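
The two database items in the list above (the '*' fallback and the dead-connection handling) can be sketched roughly as follows. This is not the code from this commit: resolve_credentials, ManagedConnection, retry_chance and rng are hypothetical names, and the retry probability is an assumed value. Only db_user and db_pass come from the commit message.

```python
import random


def resolve_credentials(user, password, db_user, db_pass):
    """Replace a '*' placeholder with the INI-wide default for that field."""
    return (db_user if user == "*" else user,
            db_pass if password == "*" else password)


class ManagedConnection(object):
    """Hypothetical wrapper illustrating the dead/retry behaviour."""

    def __init__(self, connect, retry_chance=0.1, rng=random.random):
        self._connect = connect
        self._retry_chance = retry_chance  # assumed value, not from the source
        self._rng = rng
        self.conn = None
        self.permanently_dead = False
        try:
            self.conn = connect()
        except Exception:
            # A failure at startup marks the connection dead for good.
            self.permanently_dead = True

    def mark_dead(self):
        """Call when an established connection drops."""
        self.conn = None

    def get(self):
        """Return a live connection, or None while it is dead."""
        if self.permanently_dead:
            return None
        if self.conn is None and self._rng() < self._retry_chance:
            # Only a random fraction of calls attempts to reconnect.
            try:
                self.conn = self._connect()
            except Exception:
                self.conn = None
        return self.conn
```

Retrying on a random fraction of requests keeps a downed database from being hit by a reconnect attempt on every call, which is presumably the motivation for the "randomly attempt to reconnect" wording.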
2010-06-16 17:01:50 -07:00

113 lines
4.9 KiB
Python

# The contents of this file are subject to the Common Public Attribution
# License Version 1.0. (the "License"); you may not use this file except in
# compliance with the License. You may obtain a copy of the License at
# http://code.reddit.com/LICENSE. The License is based on the Mozilla Public
# License Version 1.1, but Sections 14 and 15 have been added to cover use of
# software over a computer network and provide for limited attribution for the
# Original Developer. In addition, Exhibit A has been modified to be consistent
# with Exhibit B.
#
# Software distributed under the License is distributed on an "AS IS" basis,
# WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for
# the specific language governing rights and limitations under the License.
#
# The Original Code is Reddit.
#
# The Original Developer is the Initial Developer. The Initial Developer of the
# Original Code is CondeNet, Inc.
#
# All portions of the code written by CondeNet are Copyright (c) 2006-2010
# CondeNet, Inc. All Rights Reserved.
################################################################################
import os, re, sys, time, smtplib

from r2.lib.services import AppServiceMonitor
from pylons import g

# defaults for Alert(), read from the pylons globals
nerds_email = g.nerds_email
smtp_server = g.smtp_server

def Alert(restart_list = ['MEM','CPU'],
          alert_recipients = [nerds_email],
          alert_sender = nerds_email,
          cpu_limit = 99, mem_limit = 5,
          smtpserver = smtp_server, test = False):
    # only services named "newreddit<N>" can be restarted automatically
    p = re.compile(r"newreddit(\d+)")
    cache_key = 'already_alerted_'
    for host in AppServiceMonitor(g.monitored_servers):
        for service in host:
            # cpu values
            cpu = [service.cpu(x) for x in (0, 5, 60, 300)]
            output = "\nCPU: " + ' '.join("%6.2f%%" % x for x in cpu)
            output += "\nMEMORY: %6.2f%%" % service.mem()
            service_name = "%s %s" % (host.host, service.name)

            # is this service pegged?
            mem_pegged = ('MEM' in restart_list and service.mem() > mem_limit)
            need_restart = (('CPU' in restart_list and
                             all(x >= cpu_limit for x in cpu)) or mem_pegged)

            if need_restart:
                mesg = ("To: " + ', '.join(alert_recipients) +
                        "\nSubject: " + service_name + " needs attention\n\n"
                        + service_name
                        + (" is out of mem: " if mem_pegged else " is pegged:")
                        + output)
                m = p.match(service.name)
                # If we can restart this process, we do it here
                if m:
                    proc_number = str(m.groups()[0])
                    cmd = "/usr/local/bin/push -h " + \
                        host.host + " -r " + proc_number
                    if test:
                        print ("would have restarted the app with command '%s'"
                               % cmd)
                    else:
                        result = os.popen3(cmd)[2].read()
                        # We override the other message to show we restarted it
                        mesg = ("To: nerds@gmail.com\n" +
                                "Subject: " + "Process " +
                                proc_number + " on " + host.host +
                                " was automatically restarted " +
                                "due to the following:\n\n" +
                                output + "\n\n" +
                                "Here was the output:\n" + result)
                    # Uncomment this to disable restart messages
                    #mesg = ""

                last_alerted = g.rendercache.get(cache_key + service_name) or 0
                #last_alerted = 0
                # compare by value; "is not" tests identity and is unreliable here
                if mesg != "":
                    if test:
                        print("would have sent email\n '%s'" % mesg)
                    # send at most one alert per service every 300 seconds
                    elif (time.time() - last_alerted > 300):
                        g.rendercache.set(cache_key + service_name, time.time())
                        session = smtplib.SMTP(smtpserver)
                        smtpresult = session.sendmail(alert_sender,
                                                      alert_recipients,
                                                      mesg)
                        session.quit()

def Run(srvname, *a, **kw):
    # queue_length_max configures the monitor itself; everything else is
    # passed through to monitor()
    args = {}
    if 'queue_length_max' in kw:
        args['queue_length_max'] = kw.pop('queue_length_max')
    AppServiceMonitor(**args).monitor(srvname, *a, **kw)

def Test(num, load = 1., pid = 0):
    # NOTE: Services is not imported in this module, so this helper raises
    # NameError unless Services is made available first.
    services = Services()
    for i in xrange(num):
        name = 'testproc' + str(i)
        p = i or pid
        services.add(name, p, "10")
        services.track(p, 100. * (i+1) / (num),
                       20. * (i+1) / num, 1.)
    services.load = load
    services.set_cache()

if __name__ == '__main__':
    Run(sys.argv[1:] if sys.argv[1:] else ['newreddit'])
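
The restart decision and the five-minute alert throttle inside Alert() can be read in isolation as two small functions. The following is a sketch only, written to run without pylons or r2: needs_restart and should_alert are hypothetical names, and a plain dict stands in for g.rendercache. The thresholds and the 300 second window are the ones used in the file above.

```python
import time

ALERT_INTERVAL = 300  # seconds between alerts for the same service


def needs_restart(cpu_samples, mem, restart_list=("MEM", "CPU"),
                  cpu_limit=99, mem_limit=5):
    """Return (need_restart, mem_pegged) using the same rules as Alert()."""
    mem_pegged = "MEM" in restart_list and mem > mem_limit
    # CPU counts as pegged only if every sample is at or above the limit.
    cpu_pegged = ("CPU" in restart_list and
                  all(x >= cpu_limit for x in cpu_samples))
    return (cpu_pegged or mem_pegged), mem_pegged


def should_alert(cache, key, now=None, interval=ALERT_INTERVAL):
    """Return True and record the time if the last alert is old enough.

    cache is a dict standing in for g.rendercache (an assumption made so
    the sketch is self-contained).
    """
    if now is None:
        now = time.time()
    last_alerted = cache.get(key) or 0
    if now - last_alerted > interval:
        cache[key] = now
        return True
    return False
```

Requiring every CPU sample to exceed the limit means a short spike does not trigger a restart, while a single memory reading above mem_limit does.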