* Use keep-alive http agent to reuse open connections
* Decrease max batch size and flush interval so the payload sent is smaller
* Add debug logging for fail/success/retry events on flush
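
A minimal sketch of the keep-alive approach, assuming an axios-style HTTP client that accepts a custom agent (client and option names here are illustrative, not the exact implementation):

```typescript
import https from 'https';
import axios from 'axios'; // assumption: any HTTP client that accepts a custom agent works the same way

// Reuse open TLS connections across flushes instead of opening a new socket every time
const keepAliveAgent = new https.Agent({ keepAlive: true, maxSockets: 10 });

const metricsClient = axios.create({
    httpsAgent: keepAliveAgent,
    timeout: 5000,
});
```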
* Use "modActions" authorIs filtering to check note prior to adding note using "existingNoteCheck" property
* Refactors concept of "allowDuplicates" to allow any arbitrary modActions test to be used
* Provide convenience ModLogCriteria generation for "existingNoteCheck" based on boolean (emulates allowDuplicates functionality)
* Implement filtering activityType by "false" in order to return actions/notes not added to a specific activity
* Implement "referencesCurrentActivity" property to allow filtering by actions/notes that are associated with the activity being processed
* Implement using "count" for "current" search to enable criteria condition based on presence or non-presence of current action/note passing
* Tie loading indicator to live stream status and display error if one occurs
* Add a manual restart action at the end of the error display
* Restart stream automatically if reader ends, up to 3 retries
* Implement a DTO class for activity source to make using its parts (type, identifier) and matching easier
* Implement regex to parse type and identifier from activity source string
* Refactor activity source interface/types to better distinguish as string, data, and class
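
A rough sketch of the DTO plus regex parsing described above; the recognized source type names and string shapes are assumptions for illustration:

```typescript
// Assumed source string shapes like 'poll:unmoderated' or 'dispatch:myIdentifier'
const ACTIVITY_SOURCE_REGEX = /^\s*(?<type>poll|dispatch|user)(?::(?<identifier>[\w-]+))?\s*$/i;

class ActivitySourceEntity {
    constructor(public readonly type: string, public readonly identifier?: string) { }

    static fromString(val: string): ActivitySourceEntity {
        const match = val.match(ACTIVITY_SOURCE_REGEX);
        if (match === null || match.groups === undefined) {
            throw new Error(`'${val}' is not a recognized activity source`);
        }
        return new ActivitySourceEntity(match.groups.type.toLowerCase(), match.groups.identifier?.toLowerCase());
    }

    matches(other: ActivitySourceEntity): boolean {
        // identifier only needs to match when this (filter) source specifies one
        return this.type === other.type
            && (this.identifier === undefined || this.identifier === other.identifier);
    }
}
```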
* Add mhs rule type and MHS credentials interface
* Implement MHS rule with similar criteria options to sentiment
* Allow testing against author history content
* Refactor write/read into separate functions
* Improve error hinting for wiki read/write/permissions WRT mod/oauth permissions
* Remove superfluous error wrapping to reduce logging length for wiki errors
* More debug logging for onboarding process
* Don't return error if manager fails to parse after all onboarding complete (not critical)
* run initHeartbeat on any GET route that renders a page so that the user doesn't get access denied on initial app load
* Force client refresh if no invite found on initial check for onboarding landing
Reddit returns 403 if the subreddit exists but is private. Using this function wraps the error so we can just get a boolean back along with the subreddit object, if successful
* Add "default" hint to force val to a url or wiki key if neither is detected but know it should be one of them
* Refactor wiki/url fetching into own functions for better reuseability
* Implement mod permission getter function to check for valid permissions on wiki page error
* Improve error hints on wiki page read failure
* May be a string or array of strings. Passes if any expression matches
* Value may be a convenience day-of-week value (mon, tues, wed...)
* or a cron expression
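
A hedged sketch of how the convenience values could normalize to cron; the accepted shorthands and mapping are assumptions:

```typescript
// Day-of-week shorthands accepted as a convenience alongside raw cron expressions (assumed list)
const DAY_SHORTHANDS = ['sun', 'mon', 'tue', 'tues', 'wed', 'thu', 'thurs', 'fri', 'sat'];

const toCronExpression = (val: string): string => {
    const cleaned = val.trim().toLowerCase();
    if (DAY_SHORTHANDS.includes(cleaned)) {
        // cron accepts three-letter day names, so 'tues' becomes '* * * * tue' (any time that day)
        return `* * * * ${cleaned.slice(0, 3)}`;
    }
    // otherwise treat the value as a user-supplied cron expression and pass it through as-is
    return cleaned;
};
```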
* Refactor some manager/bot methods to be more accessible for invites/wiki
* Add routes and authentication for getting invite information and checking user is moderator of subreddit
* Add route for accept invite and completing onboarding guest/config
* Refactor to use db instead of cache for persisting invites
* Implement subreddit invite helper page
* Add initial config and guests as optional data for invite
* Refactor bot to use db subreddit invite and auto-accept when no config/guests
TODO: subreddit accept page, mod authorization, initial config usage, and documentation
* Refactor/remove websockets functionality for relaying opstats from server with direct polling by client
* Implement delta responses initially introduced in #91 to reduce bandwidth
since 'to' can now be templated a user can configure a message to send to the subreddit the action is being processed from using `to: 'r/{{item.subreddit}}'` (see the sketch after this list)
* Add top-level 'actionSummary' template variable that renders a markdown list of action results
* Add individual action result/data, in the same structure as rules, under the top-level 'actions' template variable
* Implement interfaces for template parts
* Refactor Action constructor to take an object for runtime options (cleaner, more extensible)
* Refactor subreddit resource and snoowraputils content rendering and organization to use objects of optional data rather than required arguments
* Make almost all data optional and only parse/render if included
* Move rule results parsing/formatting into own function
* Refactor footer render to use renderContent (DRY)
* Add some requested template data #104
* {{item.subreddit}} #87
* {{check}} #87
* Updated templating documentation
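
A minimal sketch of how the templated `to` value and the template variables above could resolve, using a mustache-style renderer; the data shape is assumed for illustration:

```typescript
import Mustache from 'mustache'; // assumption: any mustache-style renderer behaves the same for this example

// Hypothetical template data roughly matching the variables described above
const templateData = {
    item: { subreddit: 'mealtimevideos' },
    check: 'selfPromotionCheck',
    actionSummary: '* Removed activity\n* Added usernote',
};

const to = Mustache.render('r/{{item.subreddit}}', templateData); // => 'r/mealtimevideos'
const body = Mustache.render('{{check}} triggered:\n\n{{actionSummary}}', templateData);
```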
* Create self/link submissions, no reddit media (upload) yet
* Submission can use user modifiers (nsfw, spoiler) and mod modifiers (sticky, distinguish, lock)
* Can create submission in current subreddit (default) or arbitrary subreddit defined in configuration (targets)
* Can create multiple submissions by defining multiple targets
* Check for no/null guest data during invite creation/usage for both client and server side
* Add instances tab on invite page
* Replace instance select on invite page with query string usage
* Redirect from status page to auth helper when instance has no bots
* Rename to Guest since this is a more accurate description -- "mod" is confusing since user doesn't have any actual mod power
* Add guests as an option to invite creation and display on invite page
It makes more sense for the CM instance that will actually have a bot added to it to own the invite for that bot.
* (BC) Move Invite into server entity mappings and rename to BotInvite
* Add guests and initialConfig (future use)
* BC -- Existing invites need to have instance defined to be used
* BC -- Cache-based invite storage has been REMOVED
* BC -- Removed inviteMaxAge config property (for now)
* Add SubredditInvite entity for future use
* Force/implement server (api) having a defined friendly name. This is used to determine which invites in a DB belong to which server instance
* If not defined in config it is generated at random and then written to config
* Refactor how invites are retrieved and parsed client-side -- CRUD using server api
* BC -- Invite URL structure has changed from ?invite=id to /invite/id...
* Added better UI for migration redirect
* Show all reachable instances in redirect page header
* Error page also shows reachable instances in page header
* Cache subreddit removal reasons and display in popup helper in authenticated config editor
* Add fields for removal reason note/reason in remove action
* Update remove action documentation to include new fields
Using undocumented endpoints pulled from praw documentation/code. Snoowrap can now:
* add removal reason/mod note on a removed activity
* get subreddit removal reasons (used for getting ids for use with removal reason endpoint)
* Trim image to remove arbitrary borders
* Convert image to greyscale in order to reduce effect of saturation differences
* Implement "mirrored" (y-axis flip) hash calculations
* Implement local file fetch and image processing test suite
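
A sketch of the preprocessing and mirrored-hash idea using sharp and a blockhash implementation; the exact blockhash import and signature here are assumptions:

```typescript
import sharp from 'sharp';
// assumption: a blockhash implementation exposing bmvbhash({ width, height, data }, bits)
import { bmvbhash } from 'blockhash-core';

// Trim arbitrary borders and drop color information, then hash both the original
// and a horizontally mirrored (y-axis flipped) version of the image
const hashImage = async (buffer: Buffer, bits = 16): Promise<{ hash: string, mirroredHash: string }> => {
    const normalized = sharp(buffer).trim().greyscale();

    const toImageData = async (img: sharp.Sharp) => {
        const { data, info } = await img.ensureAlpha().raw().toBuffer({ resolveWithObject: true });
        return { width: info.width, height: info.height, data: new Uint8ClampedArray(data) };
    };

    const hash = bmvbhash(await toImageData(normalized.clone()), bits);
    const mirroredHash = bmvbhash(await toImageData(normalized.clone().flop()), bits);
    return { hash, mirroredHash };
};
```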
* Use BotInstance class on client side for easier state tracking and simplifying ACL functions (moved from user to instance)
* Implement and use same interface for client/server Bot representations
* Implement mod/guest context for status and live stats data
* Restrict guest mod CRUD to mods only
Stop or restart real-time data (logs, stats) based on the page visibility API so that the client does not continue to consume data when the page is in the background/in a non-visible tab
* Document memory management approaches
* Change node args for docker to a better name (NODE_ARGS)
* Implement default node arg for docker `--max_old_space_size=512`
* Include reddit thing id as 'id'
* Include 'title' -- for submission this is submission title. For comment this is the first 50 characters of the comment truncated with '...'
* Include 'shortTitle' -- same as above but truncated to 15 characters
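
Illustrative helpers for the truncation behavior described above (the property names and fallbacks are assumptions):

```typescript
const truncate = (val: string, length: number): string =>
    val.length <= length ? val : `${val.slice(0, length)}...`;

// Submissions use their title; comments fall back to their body text
const activityTitles = (activity: { title?: string, body?: string }) => {
    const raw = activity.title ?? activity.body ?? '';
    return {
        title: activity.title ?? truncate(raw, 50),
        shortTitle: truncate(raw, 15),
    };
};
```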
* More granular delta for nested objects
* Custom delta structure for delayedItems
* Fix abort controller overwrite
* Enforce a maximum of two unfocused log streams, cancelling immediately on visibility change, to reduce the number of concurrent requests from the browser
* Refactor time parsing and comparison utils to consolidate logic and be more flexible
* Tests for the above
* Fix indexes for ActivityReport (should be unique)
* Fix missing eager load for Activity-Report relationship
* Implement time filtering for reports in itemIs
Snoowrap doesn't add fetch property to objects from listings so even though subreddit may already be fetched it always re-fetches, using an api call. Do a dirty fetch check here to prevent wasting calls when we are sure the subreddit is fetched.
* Potentially fix #30
* Break out api sampling and output into own function
* Add influx metrics for used calls and sample faster
* Fix default tags inheritance
* Refactor config fragment parsing function to always return an array for simpler usage in hydrate function
* Add error handling to all fragment calls so user can get a clear description of which fragment failed, contextually
* Fix incorrect variable used for cache key that prevented cache from working at all (oops)
* Don't add subreddit context to ext url cache key so it can be re-used from any subreddit
* Refactor manager build/init from bot to be independent of bot init process
* Remove/destroy managers for subreddits no longer moderated by bot account or not in operator list
* Add/create managers for subreddits that should be moderated
* Run manager build function during heartbeat to sync managers
* Improve the action criteria-to-FullCriteria function
* Don't include undefined properties
* Iterate entries with switch to simplify property transformations
* Fix mod action test switch case matching to be case-insensitive (same as key)
* Fix missing/bad assignments for filtering mod actions
* Fix typo usage of foundNoteResult in modActions case block
* Throw error if mod action criteria isn't recognized as note/log instead of silently falling back to log
* Test props in order of least likely to use an API call
* Enables simplifying shadowbanned test and allows testing for more properties on shadowbanned user
* Fix existing bot removal
* Return response to client after testing client rather than after managers build to avoid long response time if bot has many subreddits
* Normalize (depopulate from snoowrap) mod note raw data so it can be constructed agnostic of source (cache or api)
* Implement cache GET for modnotes with default TTL of 60 seconds
* Refactor mod note action and implement cache PUT when new notes are added
I was under the impression primary keys were always indexed but that is not the case, at least for postgres. This migration explicitly creates unique indexes for all tables that use random ids and adds other indexes to filter/premise/result tables on other FK strings. Improves event retrieval timing dramatically.
* Implement separate language detection functionality
* Clearer/simpler sentiment processing
* Add languageHints to help coerce low confidence language detection
* Add test cases for lang detection, sentiment detection, and sentiment tests
* Fix neutral range -- was not using normalized score range
* Build NLP container ad-hoc so only supported languages are included from npm
* Use vader/wink as heuristics for detecting language when content is very short
* Add languageHint option for sentiment config to make coercing a confident sentiment easier
* Refactor lang processing to fail the sentiment test rather than throwing an error when language is not supported/not confident -- provides more insight into outcome
typeorm depends on ts-node as an *optional peer* dependency -- ts-node can be used in the typeorm cli to parse entities and run migrations.
However CM doesn't use the typeorm CLI in production so it's not needed. And ts-node depends on typescript so npm install --production always installs both, at about ~30MB.
I couldn't find a good way to remove peer deps ONLY for typeorm so instead just manually remove the folders from the prod install in the final layer of the docker.
* Refactor to use interval in browser to call to api proxy endpoint and get live stats directly instead of using websockets. Generally simplifies things.
* Remove empty/superfluous data from cache stats returned for live data
Prevents CM from iterating through n+1 pages of polling sources (mostly unmoderated) due to a source-of-truth change
See comments for scenario this helps avoid
* Get streamed logs directly from api (through proxy) in browser using streaming apis instead of through client websockets
* Use observer visibility to determine which logs to stream
* Timeout and abort any streaming logs if tab hasn't been visible for more than 15 seconds
TODO streaming system logs
* Use browserify to include logform functions and triple-beam symbols in client js
* Implement default log transform function in client js
* Remove formatted message and transport data from non-streaming log data sent to client
* Fix missing assignment for filtered activities after removing activity
* Remove 'processing' state from delayed activity lifecycle
* Allows delayed queue to immediately remove activity after pushing to firehose -- simplifies lifecycle since another function (queue) doesn't have to handle this
* Removing delayed activity function call from queue logic reduces calls to database
* Likelihood an activity is cancelled while also processing is small (i hope...)
* Re-order item refresh logic in activity handling so delayed items are fetched before any proxy properties are accessed
* Catch errors on adding delayed activities from DB and just log -- not essential function
Use 'name' since this should *always* be present on a submission/comment -- when snoowrap creates an empty proxy it only includes 'name'.
* Potentially fixes facet of #64
* Fixes shouldRefresh undefined when activity is from delayed activities in database (empty proxy)
* Refactor from async while depending on queue state into interval that always runs and just checks paused status of queue -- eliminates need to restart while loop if queue state is not running
* Add canary debug output for delayed activities to be able to know if it is actually running
* Use duration field as SECONDS and remove additional time-based field (not necessary)
* Refactor dayjs usage in UI to parse action delay duration and correctly display "time until dispatch" to show if duration is negative
* Implement english-only scoring with wink https://github.com/winkjs/wink-sentiment (AFINN, emojis)
* Implement english-only scoring with NLP.js https://github.com/axa-group/nlp.js/blob/master/docs/v3/sentiment-analysis.md (AFINN, Senticon, Pattern)
* Refactor language processing into standalone functions for future use
* Add limited multi-language support
* Can run sentiment with NLP.js on english, german, spanish, and french
* Normalize all scores to range between -1 and +1
* Improve score accuracy by averaging all scores
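
A simplified sketch of the normalization and averaging described above, assuming each analyzer reports its own theoretical score range:

```typescript
interface AnalyzerScore {
    score: number;
    // the theoretical [min, max] the analyzer can produce; assumed to be known per analyzer
    range: [number, number];
}

// Linearly rescale a raw score onto [-1, +1]
const normalizeScore = ({ score, range: [min, max] }: AnalyzerScore): number =>
    ((score - min) / (max - min)) * 2 - 1;

// Average all normalized scores into one sentiment value
const combinedSentiment = (results: AnalyzerScore[]): number =>
    results.map(normalizeScore).reduce((sum, val) => sum + val, 0) / results.length;
```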
* Add deprecation warnings to rules when building if properties should be migrated to window
* Add `debug` option to window to increase verbosity of filter logging. Default to false.
* Fix object assignment when building hash key for window filters
* Further cleanup for circular dependencies by moving some filter and logging functions into respective files
* Fix list function passed from author activities convenience method
* Move author history caching into main activity fetching function
* Do a better job at rehydrating snoowrap objects from cache data -- set as fetched, substitute relationships for non-fetching objects, and remove listing related objects
* Cache key for results based on window and pre-filter only -- post filter can be done after fetching cached results (Should save api calls!)
* Move some interfaces and types into own files to breakup huge interfaces file
* Refactor window shape for config and "full" usage in app to support subreddit/item filtering
* Refactor author activities into resources class so we can take advantage of caching on subreddit/item filter results
* subreddit filtering on window can now use include or exclude
* subreddit resources uses batch/cache retrieval
* temp fix to keep string subreddit name parity check in recent activity self inclusion logic
Fixes scenario where a dispatched activity does not inherit DR state from currently processing activity
* Add dryrun state to dispatch activity data in app and database
* Use general DR state for dispatched activity rather than DR for dispatched action
* Defer to explicitly defined DR in Task data when manager/queue handles task
* Implement a "not serious" property for these errors so we don't report as an error to manager since they not an actual problem with the api or CM
* Check for [deleted] user name before trying to fetch an author
* Break out documentation into more standalone docs and reorganize into an operator folder
* Remove outdated information on adding bot
* Add additional information on docker install
* Make configuration more opinionated for "recommended" approach
* Add docs on database and caching
* Rewrite operator getting started guide to be more concise
* user-configurable retention period (number of events OR duration) at operator, bot, subreddit override, and subreddit config level
* run database cleanup using retention policy on startup and every 30 minutes
* show retention policy in UI on manager overview
* Extends postBehavior interface to allow specifying different record output options (database, influx)
* Can specify for *either* post behavior which enables storing events that were not triggered
Instead of "skipping" the rule will now fail. This aligns Rule behavior with how filters work through the rest of CM which should reduce cognitive load and development effort.
If the skipping behavior is still desired a user can use a RuleSet with OR condition to achieve the same effect.
* Collect same stats as all time but at a specified frequency
* Frequency is configurable at operator, bot, and subreddit level
* Operator and bot level frequency can have an enforced minimum
* Query for non-hydrated events to get ids then get fully hydrated objects using only ids -- dramatically improves performance
* Remove typeorm-pagination due to non-optimized count/select approach (should use typeorm getManyAndCount)
* Also removes dup typeorm dependency
* Make title a link to "default" events view
* Fix event link
* Always return first page when fetching events by permalink to reset any existing pagination state
When the reference submission is a self post the identifier may vary slightly since it considers both title and body. Use string matching on identifiers to find "close" matches for the reference submission
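
A rough idea of the "close match" comparison using a simple Dice-coefficient similarity; the threshold and implementation are illustrative rather than the exact algorithm used:

```typescript
const bigrams = (val: string): Map<string, number> => {
    const counts = new Map<string, number>();
    const cleaned = val.toLowerCase();
    for (let i = 0; i < cleaned.length - 1; i++) {
        const gram = cleaned.substring(i, i + 2);
        counts.set(gram, (counts.get(gram) ?? 0) + 1);
    }
    return counts;
};

// Dice coefficient: 2 * |shared bigrams| / total bigrams
const similarity = (a: string, b: string): number => {
    if (a.length < 2 || b.length < 2) {
        return a === b ? 1 : 0;
    }
    const aGrams = bigrams(a);
    const bGrams = bigrams(b);
    let intersection = 0;
    for (const [gram, count] of aGrams) {
        intersection += Math.min(count, bGrams.get(gram) ?? 0);
    }
    return (2 * intersection) / ((a.length - 1) + (b.length - 1));
};

// treat identifiers as referring to the same reference submission when they are "close enough"
const isCloseMatch = (a: string, b: string, threshold = 0.85): boolean => similarity(a, b) >= threshold;
```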
* Separate rule results from check result ownership since results can be re-used
* Implement ruleset data structure for better representation of results
* Add ruleset association to check result
* Partially refactor check caching (TODO)
* Same behavior/structure as authorIs
* Refactor interfaces to use a generic FilterOption type
* Allow simplified data for item/author filters -- when prop data is an array of criteria treat as "include OR"
* Switching to lsio as base means inclusion of s6-overlay for improved init, handling of default permissions, and a maintained base for security/bug fixes
* Also means we can use PUID and PGID as variables instead of requiring --user in run command (more friendly to unraid, portainer, etc...)
* BREAKING: mount directory for configuration, etc. has been changed to /config by default to align with lsio
* Clear npm cache after final production install to reclaim 100MB on image size
* More comprehensive dockerignore to prevent accidentally including development files
* Use correct property from entity metadata which has all real db prefixes added (tablePath)
* Look for specified metadata table name instead of generic 'migrations'
* Simplify conditions (DRY)
Don't rely on default sort chosen by reddit as comment/overview are sorted by new but submitted is sorted by hot (why??). Make sure requests always sort by new to prevent inconsistencies.
To prevent any accidental referenced object property re-assignment only use subreddit name to construct subreddit for usernotes. Want to make sure notes don't get read/written to different subs by mistake...this shouldn't be happening but just to be double extra sure.
Since the web DB needs to be built separately from the server to allow migrations anyway, might as well provide *all* the flexibility
* Reverts breaking change removing caching from web
* Add database config override in web config
To make persistence align with client-server architecture
* Create separate datasource instances for client and server
* Refactor migration logic into service so code can be reused between client/server
* Implement endpoints for backup/migrate on client
* Move migrations into separate folders and move client sessions, invites, and web settings into own migration
* Defer session secret generation until client instantiation and save to database when possible
* BREAKING: Replace web.caching with web.storage and restrict options to specifying either top-level database or cache
* Simplifies init but provides options for different use cases
* Default to database
* Implement generic storage provider for web persistence. Uses either cache or database
* Refactor session/invites to use storage provider
* Implement invite entity for database
* Allow specifying different storage provider for sessions
* Remove action/rule as standalone entities because they depend too much on premise data to make sense (not normalized). Instead, consolidate into premise entities.
* Breakout premise config into config/itemIs/authorIs data to make future querying easier
* Improve errors when checking if file is readable or directory/file is writable
* Implement custom winston-typeorm logging mappings so migrations are output at INFO level
* Add typeorm logging options to database config and default to displaying error, warning, and migrations
* Ensure directory is writeable before using log rotation transport for winston
* Move init logger into index so its always available
* Simplify printing stack for SimpleError when it's a cause
* Better handling for access permissions when reading operator config file
* Add hint to errors relating to permissions access if app detects its running in docker
* Change project location to /home/node to avoid root permissions when creating WORKDIR
* Create default config directory with correct permissions in the event no volume is mounted
* Set default db driver to better-sqlite3 since we control environment -- it has better performance and we know we can use a pre-built binary b/c of base docker image
* Add DATA_DIR env to allow specifying a "base directory" for all other config paths. Defaults to project root (same as existing behavior)
* OPERATOR_ENV OPERATOR_CONFIG LOG_DIR and database locations can now be specified as absolute OR relative (from DATA_DIR)
* Look for config.yaml and config.json if no OPERATOR_CONFIG is present
* millisecond timestamp using bigint doesn't translate well between databases typeorm/typeorm#2400 so avoid this
* regular int/unix epoch doesn't include milliseconds which we need for ordering results and required resolution
* sqlite and mysql/mariadb support datetime with millisecond precision so settle for that, for now
* Persist events, queue, and manager states to database on change
* Use invokee/state from database on startup to determine if manager should be run after building
* Refactor database init from backup and migration attempt so they can be invoked separately
* Implement more generic db backup function to make additional backup strategies possible in the future
* Add operator config option to try migrations if database backup succeeds
* Implement client web migration runner and database backup flows
* Determine what the actual criteria object is based on object shape (could be from app or api)
* Initialize filters as undefined so they overwrite null values from api when transforming for ui
* Simplify/refactor stats to only track all-time in database...
* remove "last reload" stats from app and ui
* simplify tracked stats to counts only (no maps-of-types counts)
* Add migration function in stat init to convert any all-time stats in cache to database, when database is empty
* Implement a generic activity source to track where an event was retrieved from and some initial parameters
* Consolidate dispatch action data into generic source
* Refactor delayUtil logic for event handling to prevent blocking worker for a non-trivial amount of time by dispatching event
* Implement own createdAt functionality using hooks and dayjs so we can use unix timestamp to eliminate TZ ambiguity and increase granularity to milliseconds
* hook insert/load to convert timestamp between dayjs and int
* Set value on class instantiation rather than object save to preserve actual creation time
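
A minimal sketch of the hook-based conversion, assuming createdAt is stored as a unix-millisecond integer column (column type and names are illustrative):

```typescript
import { AfterLoad, BeforeInsert, Column } from 'typeorm';
import dayjs, { Dayjs } from 'dayjs';

export abstract class TimeAwareBaseEntity {
    // stored as a plain integer (unix epoch in milliseconds) to avoid TZ ambiguity
    @Column({ name: 'createdAt', type: 'integer' })
    protected createdAtTimestamp!: number;

    // set at instantiation, not at save, so it reflects the actual creation time
    public createdAt: Dayjs = dayjs();

    @BeforeInsert()
    protected serializeCreatedAt() {
        this.createdAtTimestamp = this.createdAt.valueOf();
    }

    @AfterLoad()
    protected hydrateCreatedAt() {
        this.createdAt = dayjs(this.createdAtTimestamp);
    }
}
```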
* CM data is highly relational so it doesn't make sense to support both rdbms and document dbs
* repository usage is different for mongo, not supporting both
* Use setter/getter to parse id/name to make it easier to use fullname or short id interchangeably
* rename title to content to reflect property usage better
* Reworked many mappings to fix cascades and relationships
* Add constructors to entities where useful
* Renamed many db entities classes to be different than interface names
* Re-generated initial migration and added convenience npm script for running it
* Added ormconfig for typeorm 3 cli
* Create all run/check/rule/action premises (static info) on manager creation
Makes CM less sensitive to random blips in reddit API by enabling snoowrap to retry some network-issue related error codes alongside status codes
* Add timeoutCodes as an operator configurable array of error codes that is passed to extended snoowrap
* Patch snoowrap to check for timeout codes on request error and use retry logic if found
* Additionally, add retryErrorCodes (status codes) to operator configuration
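
The relevant operator config slice might look roughly like this; the key nesting is an assumption, while the values shown are standard Node error codes and HTTP statuses:

```typescript
// Illustrative operator configuration fragment, not the exact schema
const operatorConfigFragment = {
    snoowrap: {
        // network-level error codes the patched snoowrap should treat as retryable
        timeoutCodes: ['ETIMEDOUT', 'ESOCKETTIMEDOUT', 'ECONNRESET'],
        // HTTP status codes that should also trigger snoowrap's retry logic
        retryErrorCodes: [500, 502, 503, 504],
    },
};
```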
* Refactor main author filter logic into subredditresources to take advantage of cache provider
* Implement methods to retrieve and cache subreddit moderators and author information
* Refactor live stats to work for "All Subreddits" as well as individual subs
* Refactor live stats to take place of opStats and update almost all bot stats live now (only cache breakdown TODO)
* Refactor opstats to return status of bots/subreddits only for ui indicators in tabs
* Move activity source normalization and verification into own function (throws on invalid source string)
* Correct source-filter comparison by comparing source to filter rather than other way around to make sure inclusive filter is passed
* Also rename item filter from 'dispatch' to 'dispatched' to match verb tense of other state properties
* Simplify identifier property name in config to just 'identifier' -- there's enough context for what it is already
* Correctly render system logs to html
* Simplify websocket logging so it matches how logs are received on the browser from the server
* Fix instance redirect name when no friendly name is set for api config
Using mocha, chai, and nyc
* tests for parsing string for numeric value comparison
* tests for parsing string for durations and duration comparisons
* tests for parsing reddit entity (subreddit/user) from string
* tests for parsing submission/comment id from reddit permalink string
* tests for initial config parsing/merging
Still can't get nyc to get coverage for everything in src using "all" -- causes reporting to show 0 for everything??
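
A small example of what these mocha/chai tests look like, using a hypothetical stand-in for the numeric comparison parser:

```typescript
import { assert } from 'chai';

// hypothetical stand-in with the same rough shape as the parser under test
const parseGenericValueComparison = (val: string): { operator: string, value: number } => {
    const match = val.match(/^\s*(>|>=|<|<=)\s*(\d+(?:\.\d+)?)\s*$/);
    if (match === null) {
        throw new Error(`'${val}' is not a valid comparison`);
    }
    return { operator: match[1], value: Number(match[2]) };
};

describe('numeric value comparison parsing', () => {
    it('parses operator and value', () => {
        assert.deepEqual(parseGenericValueComparison('>= 3'), { operator: '>=', value: 3 });
    });

    it('throws on malformed input', () => {
        assert.throws(() => parseGenericValueComparison('three or more'));
    });
});
```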
* Add Runs to main docs readme and concepts
* Add high level diagram in main docs readme to show CM lifecycle
* Refactor subreddit/rule examples to use runs syntax
* Was causing uncaught promise rejection in userflairaction because it should have been accessing name instead of id
* Wrap all as/is utility functions where value may be from cache (plain object) or proxy (from snoowrap) with try-catch to prevent any more uncaught promise rejections -- would rather swallow silently (for now) than crash the entire application
* Implement rerun configuration that satisfies requirements from #72
* rerun as action
* optional, user-defined identifier
* cancel rerun as action
* cancel based on re-queued sources
* on existing behavior
* can specify initial goto
* filter item by source (where item was retrieved from for non-cached items)
* filter item by rerun state/identifier
* Add rerun label to event logging
* Add rerun data to actioned event data
* Add id and activity type to event activity data
* Include parent submission activity data if activity is a comment
* Refactor event page ui to simplify headers and move content into collapsible
* Add context to content by including submission context for comments
* Allow FilterResult as a property result
* Remove pre-item testing cache optimization for submissionState to simplify flow
* Helped reduce key count but not worth the cost of overly complex code for returning filter results
* Remove expected prop from results and instead use criteria in filter results to generate this for logs/events
* Refactor log/event generation to handle FilterResult in filter property result
Fixes edge case where a usernote was created by a moderator that no longer mods the sub
* Store mod index so we can recreate note even if moderator is missing
* Refactor moderator hydration on usernote from raw data to just warn if moderator cannot be found
* Only return logs for "default viewed" subreddit/bot when fetching instance status, when specified from QS
* Greatly reduces amount of data fetched and response time
* Return logs with formatted property for non-streaming response
* Implement server live stats endpoint to return subreddit/all stats based on QS
* Use client websocket connection to return stats for currently viewed subreddit
* Move "sorting" log objects into lists for retrieval from server and into bot/managers for each log object type
* Refactor log filtering and aggregation under status/log endpoints to use logs from each entity rather than pulling from server
Reduces complexity in historical log data structures at the expense of slightly more runtime data crunching. The trade-off is well worth it and paves the way for easier retrieval of single/subsets of logs
* Fixes an issue where the cached notes for a user only contain the last added note instead of all notes + new
* Also reduced api calls by caching moderator adding new note instead of calling each time
If the item is not actually removed (it's hard to tell from reddit api) we don't want to prematurely end remove action. Just warn and try to remove anyway
* Refactor events view to show checks within runs
* Build cohesive runs server-side before rendering so user can see all checks in a run together
* Add collapse/expand behavior for activity/run/check with ability to toggle based on triggered state
* Default to collapsing all non-triggered states
* Build check summary on-the-fly instead of storing in event result data
* Store migration state in cache instance
* Migrate on default cache init or private cache init
* Implement first migration to deal with run structure in actioned events
* Add Run and postCheckBehavior config structures to schema and interfaces
* Implement parsing from config and initial flow logic for running on activities in manager
* Implement glob pattern or regex as argument
* Implement scan search for redis for efficiency otherwise iterate keys using generic function
* Implement cache reset based on passed item from action -- reset item crit for activities, author crit for users, and overwrite any cached activity
* Makes error cause easier to see in the stack and fixes errors not logging during action failure
* Use error with cause for logging action error for clearer stack
* Add Run and postCheckBehavior config structures to schema and interfaces
* Implement parsing from config and initial flow logic for running on activities in manager
* Use same technique as repost rule which has high accuracy and low false-positives
* Implement ability to see similarity score, case sensitivity, and text transformations
* Set running to false when error is caught. Was not caught on last stream refactor which changed polling behavior to end if any error is caught rather than waiting for external source to clear interval
* Add debugging/error messages on polling start/stop
* Use ErrorWithCause so we can get and print a chain of error causes
* Make reddit error response in stack trace more readable by replacing them with a "translated" parent response and add them as the cause
* Properly handle error formatting for winston by looking at shape of log object for error rather than testing instanceof (see comments in errorAwareFormat)
* Fix formatting in web interface for log lines with white-space pre css and properly splitting timestamp from rest of the message
* Implement declaration file for snoowrap errors so they can be imported directly
* Implement logging function to handle boilerplate for known error responses (reddit HTTP response, rate limit, etc.)
* Add properties for file, console, and stream in logging object of operator config
* Each property inherits a (useful) subset of winston transport options
Since snoowrap's WikiPage isn't a "real" object setting it as a property on the class means if it rejects the whole application crashes. Fix this by building wiki proxy every time we need it before awaiting promise for edit/retrieval so that promise scope is bound to the function we are in (that has try-catch)
* Use interface for comparison results at both criteria property level and criteria level
* Implement summary functions to build string results of comparisons
* Output all comparisons to debug and provide summaries to verbose (when applicable)
* Don't just overwrite (duh)
* Drop any default filters that include object keys that are also present in user-defined filters -- this way user-defined always takes precedence on merge
* Add options for /logs endpoint to stream objects instead of strings
* Always return log objects from /status endpoint -- fixes bug where all bots/subreddits got lines from logs that had newlines
* Return context-aware, formatted log lines to client to reduce line length, i.e. if returning to botA -> subA then we do not need to include labels for botA, subA #40
* Shorten timestamp to just time and wrap full timestamp in tooltip #40
* Emit log objects to client to reduce parsing complexity (don't have to regex for bot/subreddit name)
* write to config when bot is added
* replace/add based on existing bot
* implement specify instance from instances user is operator of
* implement specify subreddits to run on using comma-separated list
* rewrite invite flow ending to be more clear on results and next steps
* use node-comment and yaml@next to keep comment information intact
* store ast/source version of parsed config for operator
* implement generic yaml/json operator config classes to keep everything organized and simplify marshalling source to js/string
* refactor file parsing and json/yaml parsing to have better single responsibility
* Add excludeCondition to control how exclude sets are tested (and/or)
* Refactor authorIs logic from check/rule/action into standalone function (DRY)
* Simplify filter defaults -- don't need to specify automoderator since it is always a mod
* Add default behavior config to operator and manager config
* Implement configurable behavior when filter is present on check
* Add defaults to exclude mods and automoderator from checks
* Remove unused clearProcessing code
* Use same data structures (Map) for storing polling objects in both Manager and Bot to reduce cognitive load and re-use some logic
* Rename "mod" streams to "shared" streams
* Implement detection and updating of polling when manager config changes
* Implement detection and updating of shared streams on manager config update
* Use shared retry handler for manager polling to better handle general reddit api issues (all polling stops faster)
* Move initial polling buffer into polling object (instead of in manager) for better logic encapsulation and add debug logging for it
* Add more debug logging for manager/bot poll building
* Refactor polling config to use new 'shared' string list of polling sources and deprecate 'sharedMod' property
* Refactor how shared sources are built to look for shared intention in manager polling options before creating
* Implement continuity check for comment/submission polling to ensure no activities are missed
* Add debug logging to polling
* Add abstract user class with auth methods with implementations for client/server
* Refactor client/server logic to use class methods instead of inline auth checks
Closes #71
* Support fetching from reddit wiki
* Support fetching from raw URL
* Support parsing and fetching from gist, github blob, and regexr (very experimental)
* Change manager acquisition so all managers belong to a bot before they start logging so all logs are captured correctly
* Fix log capture logic that prevented all subreddits from being populated
Cache reported items or new comments made by bot for a short time (default to twice polling interval, 1 minute) to prevent bot from running on things it did itself
* Reduce retry for snoowrap to 2 since we do our own error handling in-app and 2 is enough for the occasional, non-systemic blip
* Reduce manager retries
* Fix polling timeout to actually stop on error by simplifying timeout and waiting until response is OK to recreate next timeout call
* Use "unexpected exception" retry count for all non well-known "reddit blip" responses in retry handler rather than failing immediately AND log this distinction
* Fix managers not emitting errors from checks
* Fix bot not awaiting retry handler on manager error emit
* Increase nanny loop delay on error to reduce api pressure when there are many bots running
* (unrelated) set bot as running before starting managers so UI is available earlier
* Implement all entities for actioned events
* Pass global database to subreddit resources
* Read/write actioned events using database, constraining to bot/subreddit
* Add typeorm dependency with backend drivers for sqljs and, optionally, postgres/mongodb/mysql/mariadb
* Add operator configuration structure for global database connection
* Implement config parsing and defaults for sqljs db location or in-memory fallback
#66
* Can flair user on comment/submission
* fix dryrun if-else block (maybe a debugging artifact?)
* allow all properties to be undefined/null/empty and use as intention to unflair user
For some reason providing the data directly to a new model doesn't trigger validation and also had some other weird effects. Instead, use an empty string as the initial value and then set the model value to the data afterwards -- which fixes everything, though I don't know why.
* Default format to yaml
* Add detected config format as property to manager
* When neither format is valid use starting character to (naively) detect json or not
* Reduce config error noise by only showing one format error based on likely type and print the other to debug
Reddit users can have hyphens in their names. The slight tradeoff of allowing hyphens in subreddit names (where they are non-existent) in order to allow all valid reddit user names is worth it.
Found and corrected by @prometheus-22
* Store subreddits to try to accept invites from in bot's default cache
* Handle invitation scenarios (none, modself missing, accepted) and starting manager after invitation
* Implement basic invitation acceptance list control in UI for bot operators
* Only try page creation if response error is a 404
* Improve permission error descriptions
* Only create if it can also set page permissions to improve security
* Use stored scope to determine if user can save
* Only show save action if loaded from a subreddit
* Implement re-authorization flow through popup window and sockets to update status in editor
* Implement wiki location endpoint for server and wiki save endpoint for client
* monaco-yaml can also do json validation since its just normal monaco
* simplifies config.ejs greatly by not having to maintain two different monaco implementations, at the expense of a larger project
* Use named entry files for webpack so we can access them from config.ejs
* Move some of the validation logic to config.ejs so it can be used by json editor as well
* Use same controls for loading files/schemas/urls for both editors
* Add validation errors box for both editors
* Use 3 different matching algorithms using the highest score out of the three
* Weight score based on length of the sentence
* Increase minimum number of words to 3
* Enforce min word count on external (youtube) comments
* use --no-cache so repository index isn't included in image
* use FROM instructions to build project with dev dependencies, then copy built project to new stage and install production-only libraries
* properly ignore docs and node_modules with dockerignore
* depended on and always downloaded an entire, older typescript version (even with production install) -- was not necessary for one function
* refactor project to use newer TS version (specify any type for catch blocks to fix compiler errors)
* Add convenience property for filtering subreddits by user profile prefix (u_)
* Do some smart property comparison to check if SubredditState name is regex or not -- use additional user prof check if it is so we don't clobber existing name regex
Closes #56
* Simplify usage b/w comment-submission by removing "criteria" in main criteria property -- comment checks can just use additional properties
* Consolidated occurrence count/time into one property to allow and/or operands on both conditions (more powerful!)
* Added documentation describing repost configuration
* Added repost configuration examples
* Turn off snoowrap request queuing
* Aggregate errors from all managers at bot-level and force all to stop (and clear queue/polling) if a small threshold is met
* Add activity refresh on check into try-catch so delayed activities in queue don't cause a loop if they fail due to api issues
* User-defined set of comparisons for testing how many reposts were found
* User-defined set of comparisons for testing when a slice/all of reposts were created
* checks for both submission/comment
* search by submission title, url, dups/crossposts, and external (just youtube for now)
* user-defined string sameness for all search facets
* user-defined case-sensitivity and regex-based transformations for activity/repost item values
* cache comment checks
* implemented youtube client for retrieving video comments
The user profile for the bot account is returned as a moderated "subreddit" (e.g. /u/ContextMotBot) but it's not usable by CM because it does not have a wiki and users don't use this as a sub. Filter it out of moderated subreddits when using the default behavior.
One of these properties must be present for the rule to be valid. This is enforced at runtime. Change the schema so that both are optional from the config-side though.
* Apply changes made in ac409dce to the rest of the application
* Remove dark/light mode -- now always dark mode (easier to maintain this way)
* Remove dependency on tailwind dark css
* Use different logging messages when criteria is not available due to mod permissions (property not available to non-mods)
* Change logging level for missing/unavailable criteria to reduce logging noise. On unavailable use debug, on missing use warn
* Improve activity removed/deleted detection based on whether activity is moddable by current user
* Generate perceptual hashes using blockhash-js of images that can be cache/stored
* Take advantage of reddit thumbnail code ImageData to hash lower-res to begin with (but represent full url)
* Refactor imageDetection config so hash and pixel approaches have different configs
* Cache phash results to reduce reddit traffic and speed up performance
Addresses cpu/memory issues with pixel comparison. Allow pixel for finer comparisons if needed using tiered thresholds. Closes #26
* Refactor image comparison to use pixelmatch only so resemblejs can be removed (too much memory usage)
* Heavier usage of sharp to get images into same dimensions prior to pixelmatch
* Refactor image conversion into ImageData to clean up utils/recent activity rule
* Implement staggered startup for bots (reddit accounts, top-level)
* Implement staggered startup for managers (subreddits) and subreddit polling
* Introduce random -1/+1 second to polling interval for every stream to ensure none are synced so there is no instantaneous spike in cpu/traffic/memory on host/reddit
* Add user-configurable stagger interval for shared mod polling
* Implement second image comparison approach with pixelmatch for reduced memory usage when image dimensions are exactly the same
* Use sharp to resize images to 400 width max when using resemblejs to reduce memory usage
* Refactor image acquisition/parsing to use on-demand fetching and track different resolutions
* Try to use smaller previews (under 1000px), when possible, for comparing images and downloading
* Do image comparisons in parallel
* Change 'after' type to string duration for friendlier configuration
* Decrease list size trigger to === limit instead of 2x (not necessary to have a list that big for polling new)
* Increase initial shared mod polling to max limit (100)
The list of processed activities SnooStorm uses to ensure only new activities are emitted when polling is never cleared. MayorMonty/Snoostorm#35
To mitigate the memory bloat this creates when RCB runs for a long time on high-volume subreddits implement user-configurable (with defaults) behavior for clearing the processed activity list. Default values ensure clearing the list does not interfere with checking for new activities.
When using full criteria for subreddit state we can save a ton of api calls by getting info for all uncached subreddits at the same time rather than individually
* Don't store subreddit state cache results for now since nothing computationally expensive or requires api requests
* Return early on item state check if there is nothing to check so we don't store an empty result in cache
* allow both over_18 and over18 criteria in case user accidentally used name from sub/comm state
* correctly determine if subreddit property exists when testing
* fix cache hit subreddit name logging
* Trim value before parsing
* If not a valid regex string then when generating regex from simple string add qualifiers for beginning/end of string so any matches must be exact
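
Roughly, the value handling could look like this sketch; the `/pattern/flags` detection and escaping are illustrative:

```typescript
const REGEX_ESCAPE = /[.*+?^${}()|[\]\\]/g;

const asRegex = (val: string): RegExp => {
    const trimmed = val.trim();
    // already a /pattern/flags regex literal? use it as-is
    const literal = trimmed.match(/^\/(.+)\/([a-z]*)$/i);
    if (literal !== null) {
        return new RegExp(literal[1], literal[2]);
    }
    // plain string: escape regex characters and anchor so matches must be exact
    return new RegExp(`^${trimmed.replace(REGEX_ESCAPE, '\\$&')}$`, 'i');
};
```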
* By dynamically importing the module any user not using image comparison will not be affected by a lack of node-canvas dependency
* try-catch on import and provide a helpful error message about node-canvas dep
Majority of mods that have used this rule assume it does not aggregate on reddit domains by default (only external links), which is reasonable.
So update the default to follow this assumption.
* Add `redditMedia` as distinct domain type from `self` for more granular aggregation
* Use `redditMedia` to fix bug where video and galleries were being counted as `media`
* Allow checking if user is shadowbanned via authorIs (AuthorCriteria)
* try-catch on history get or author criteria to try to detect shadowbanned user for a more descriptive error
* Can send message to any entity (user/subreddit) using 'to' property, or leave unspecified to send to author of activity
* Parse entity type (user or subreddit) from to value and ensure its in a valid format we can understand with regex
Previously CM would process the same activity multiple times if it was ingested from two different polling sources (modqueue and unmoderated/newSub). Introduce queue control flow to ensure activity is de-duped or refreshed before processing if this scenario occurs.
* Use a queue (firehose) to bottleneck all activities from different sources before pushing to worker queues
* Keep track of items currently ingested but not completely processed and use firehose to de-dupe queued items (flag to refresh) or re-queue if currently processing (and flag to refresh)
* Implement total threshold to compare filtered activities against window activities
BREAKING CHANGE: include/exclude now filters POST activity window and all comparisons are done on those filtered activities against window activities
* SubredditState can be used to check some subreddit attributes alongside, or in place of, a subreddit name
* Regex parsing for subreddit name string in recent activity
Reduce regex complexity in config by parsing a normal regex straight from config string value (including flags)
BREAKING CHANGE: regex must now be enclosed in forward slashes, flags must be on regex value, and regexFlags property has been removed
* Default to global flag if none specified so that all matches per activity are found
* Improve result message section ordering and display a sample of up to 4 found matches
* Refactor api to get all accessible events, sorted by time, when subreddit is not specified
* Add subreddit name to actioned event data to differentiate between events
* Show actioned events link in "All" subreddit view
* Remove user-select css style (left over from config template)
* Format timestamp to be more human friendly
* Remove success/triggered text and just use checkmarks (same as log)
* refactor actioned events into bot-configured cache so they can be persisted between restarts
* add config params for actionedEventsMax and actionedEventsDefault to allow defining defaults at operator/bot/subreddit level
* Missed heartbeat during client-server refactor somehow...oops. Re-add heartbeat behavior
* Refactor nanny functionality to use date check rather than loop -- behaves same as heartbeat now
* use http retry handling in nanny to handle reddit outages
* try-catch on nanny and heartbeat for better exception handling at bot-level
* await health loop so we can catch bot-level exceptions in app to prevent entire app from crashing
* Make keys more readable by using plaintext for unique values and only hashing objects (see the sketch after this list)
* Improve author criteria caching by excluding item identifier from hash since result should be same at subreddit-level
* Add special SHARED namespace for subreddits using default cache -- remove ns when cache is dedicated
* Check for redis cache type and include prefix pattern when getting key count
* Re-add operator-level caching config so a global default cache config can be defined
* Expand provider options with index property so additional, redis specific, params can be provided
* namespace (prefix) bot and subreddit-level (When not shared) redis connections
* refactor subreddit and author name usage since it differs when objects are deserialized
* as type guard for submission based on instance type OR object shape hint since deserialized activities are plain objects
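
A sketch of the key construction mentioned above: human-meaningful parts stay plaintext and only the criteria object is hashed (names and hash choice are illustrative):

```typescript
import { createHash } from 'crypto';

// JSON.stringify is used for brevity; a key-order-stable object hash would be preferable in practice
const hashObject = (val: object): string =>
    createHash('sha1').update(JSON.stringify(val)).digest('hex').slice(0, 12);

// item identifier intentionally excluded -- author criteria results are the same subreddit-wide
const authorCriteriaCacheKey = (subreddit: string, authorName: string, criteria: object): string =>
    `authorCrit-${subreddit}-${authorName}-${hashObject(criteria)}`;
```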
* When check is triggered also store rule results in cache -- makes actioned events more complete when actions run from cached result
* Add config option to toggle run actions on/off from cached check (Defaults to on)
* Use checks triggered display instead of own events actions stats (since it already exists and is the same)
* More visual separation between action events using shadowed boxes
* Move timestamp and title into a header (more visual distinction) and use short hyperlink
* Move rule summary up into check field
* Refactor action return value to return results/success/error
* Store action event data after triggered from manager
* Display last 25 actioned events in ui
* Update "no access" render to reflect current architecture
* Get rid of defaultBot middleware and replace with client-side parsing
* Do not reveal instance information when user does not have access to a route
* Fix any authenticated user being able to access dashboard (condition flipped)
* User can specify invite code so urls are friendly and can be recreated on instance restart
* Replace json view with monaco-editor with schema loaded based on url param
* Add route for unauthenticated config editing
* Auto-load subreddit config when "view" is clicked from subreddit view
* Use autolinker to deal with url parsing in logs and sanitize html
* Fix missing mergeArr arguments on child loggers
* Implement DuplexTransport to make streaming from winston less verbose and allow access to log object
* Refactor log parsing on server/client side to deal with log objects or strings
* Integrate into the main client
* Do more checks for correct credentials before auth code flow init
* Implement special bypass token to help with authorization flow with account other than operator/logged in
* Add more credential options to helper page (can provide separate id/secret) and permission options
* Max 5 retries using got built-in behavior, with logged warnings/errors
* Reload client window after log setting update as it's easier to set up logs from a fresh load than in-situ
* Cleanup stream usage for winston and simplify adding to logger
* Use pipeline with delimiter stream for parsing logs from client
* End log streaming with promise abort (cleaner)
* Move Bot webhook listeners for crash/shutdown into Bot code itself
* Simplify running modes
* Parse web operator into user data
* Add basic status tooltip for bots in ui
* timeout returns no argument but we have bound the function so just reassign this to self
* fix erroneous save left over from previous changes
* keep track of number of usernotes to save and log to debug
* log to debug when executing save immediately due to cache miss
* enable bot to send private messages as self or modmail
* add necessary permissions to oauth helper to make this possible
* update schema with message action structure
* Lodash has issues iterating over properties because the items can be a `proxy`
* We're not modifying the items at any point anyway so cloning to preserve state isn't necessary
* We want to get proxy objects back (potentially) since we'll still be able to use them for retrieving more data later
They don't like having reddit as part of the name unless its "for reddit" -- so just remove reddit from name altogether and also drop "bot" as its superfluous. Shorter name is better anyway
* Reorganize some main readme contents into separate docs
* Add full bot authentication guide
* Add screenshots and web interface section
Some link fixes and clarifications
Fix another link
Fix another link
More doc cleanup
More doc cleanup
More doc cleanup
Add link to docs in main readme summary
* Refactor server so app is passed back to main index.js so we can handle SIGTERM in a central location and determine if exit was based on uncaught error or not
* Fix missing assignment of default wikiLocation to manager
* await discord notifier so on app exit the notification is actually sent before exit
* Check for in-use web port before starting so we can display a useful error message
* Remove 'web' command and instead make it an optional argument for `run` (default to web)
* Update dockerfile to use run command
* Remove remaining commander option defaults since they are defined in config builder now
* Add more caching information
* Add more visual distinction for selected subreddit
* Add querystring based on shown subreddit so view persists between reload
* Use cache manager instance in UserNotes and callback to report stats
* Simplify sub resource config interface and setter
* Use cache manager for express session
* Add session cache provider to operator config
* use node-cache-manager so operator has a choice of memory or redis
* BC update TTL values to be in seconds instead of milliseconds
* Count requests and misses for cache
* display cache stats in ui
* Implement a weak/strong interface for operator (app) config
* Consolidate config parsing code into ConfigBuilder
* Implement parsing configuration from a json file
* Refactor configuration parsing so there is a clear hierarchy to where and how config is overridden
* Fuzzy match on all identifiers to enable detecting small variations in title/body
* For reddit media submissions (image/video) check title instead of url
* For submissions with external url also check title
* Move theme toggle to title to make room in the subreddit tabs when there are many, many subreddits
* Implement check permalink with subreddit context -- can run any permalink on a subreddit's config
If operator is running subreddits for lop-sided traffic burdens then /r/mod may be over the 100 activity limit, all from subreddit X while sub Y and Z have few activities. In this scenario Y and Z would never run since X would take up all results. Therefore default to individual modqueues and make shared modqueue opt-in by operator
For some reason using client *again* here was causing the primary client usage to lose response data. I think perhaps it needs to be cloned or need to create a new instance all-together rather than trying to re-use.
* Use async/queue for handling activity processing on each manager
* Refactor polling to push activities into activity queue
* Refactor manager state to control activity processing queue and event polling independently
* Pause activity processing and wait until no work is being done to do config update
* Add way more logging for new systems
* Add basic ui controls/view for new systems
* For self submissions use "title + body" to identify repeats instead of just body -- body could potentially be empty
* Add "minWordCount" with default of 1 so that blank comments are ignored
Erroneously returning proxy object instead of actual submission. Don't really need the title since it's in the permalink so remove it entirely to avoid another api call
* Makes it easier to deal with never-successful subs (from startup) and invalid configs during run
* Paves the way for adding managers in-situ
* Add validConfig property to manager to track this and in UI
* Track if user manually stopped manager so we don't try to restart on heartbeat
If reddit has a general outage this is where we want to wait the longest before giving up entirely since polling is infrequent (relative to running checks)
* fix other retry compare statement
* check request for status code missing (timeout, socket timeout, address unavailable, etc.) or valid status code
* clarify wording in retry handler logging
* Add api limit, reset, and heartbeat interval to All overview when operator is viewing
* Stream op stats on log emit
* Add env/arg for setting "Operated by"
* Refactor program options to allow running as web
* Implement authentication using reddit oauth
* Use session in memory to store identification and accessible subreddits
* Implement socket.io with shared session to enable streaming logs
* Implement log streaming with per-subreddit views
* Re-implement all polling streams with extended poll class for catching errors
* Increase default polling interval to 30 seconds and limit to 50
* When using default interval/limit for mod polling use a shared stream between all managers and poll to /r/mod -- reduces api calls for polling mod streams to 1 regardless of how many subreddits are managed
* Separate manager instantiation from configuration flow so config can be reloaded
* Move wiki page parsing into manager for better encapsulation
* Check for wiki revision date on heartbeat and on checks if older than one minute
* Catch config parsing issues and retry on next heartbeat
* Allow whitespace on either side of regex value to parse since it's automatically trimmed by getting the capture group
* Implement sub name parsing everywhere subreddits can be specified and update documentation to remove prefix restrictions
* Add missing include/exclude behavior for counting repeat submissions
* Add parameter to enable user to specify if removed activities should be included
* Fix how removed check is performed since there are different behaviors for submission/comment
* Add filtered and deleted states for item check
* Add subreddit filters (include/exclude) on window criteria
**Context Mod** (CM) is an event-based, [reddit](https://reddit.com) moderation bot built on top of [snoowrap](https://github.com/not-an-aardvark/snoowrap) and written in [typescript](https://www.typescriptlang.org/).
It is designed to help fill in the gaps for [automoderator](https://www.reddit.com/wiki/automoderator/full-documentation) in regard to more complex behavior with a focus on **user-history based moderation.**
An example of the above that Context Bot can do:
> * On a new submission, check if the user has also posted the same link in **N** number of other subreddits within a timeframe/# of posts
> * On a new submission or comment, check if the user has had any activity (sub/comment) in **N** set of subreddits within a timeframe/# of posts
>
>In either instance Context Bot can then perform any action a moderator can (comment, report, remove, lock, etc...) against that user, comment, or submission.
Some feature highlights:
* Simple rule-action behavior can be combined to create any level of complexity in behavior
* One instance can handle managing many subreddits (as many as it has moderator permissions in!)
* Per-subreddit configuration is handled by JSON stored in the subreddit wiki
* Any text-based actions (comment, submission, message, usernotes, etc...) can be configured via a wiki page or raw text in JSON
* All text-based actions support [mustache](https://mustache.github.io) templating
* History-based rules support multiple "valid window" types -- [ISO 8601 Durations](https://en.wikipedia.org/wiki/ISO_8601#Durations), [Day.js Durations](https://day.js.org/docs/en/durations/creating), and submission/comment count limits.
* Checks/Rules support skipping behavior based on:
  * Author criteria (name, css flair/text, moderator status, and [Toolbox User Notes](https://www.reddit.com/r/toolbox/wiki/docs/usernotes))
  * Activity state (removed, locked, distinguished, etc.)
* Rules and Actions support named references so you write rules/actions once and reference them anywhere
* User-configurable global/subreddit-level API caching
* Support for [Toolbox User Notes](https://www.reddit.com/r/toolbox/wiki/docs/usernotes) as criteria or Actions (writing notes)
* Docker container support
Feature Highlights for **Moderators:**
* Complete bot **autonomy**. YAML config is [stored in your subreddit's wiki](/docs/subreddit/gettingStarted.md#setup-wiki-page) (like automoderator)
* Simple rule-action behavior can be combined to create **complex behavior detection**
* Support for Activity filtering based on:
  * [Author criteria](/docs/subreddit/components/README.md#author-filter) (name, css flair/text, age, karma, moderator status, [Toolbox User Notes](https://www.reddit.com/r/toolbox/wiki/docs/usernotes), and more!)
* [Delay/re-process activities](/docs/subreddit/components/README.md#dispatch) using arbitrary rules
* [**Image Comparisons**](/docs/imageComparison.md) via fingerprinting and/or pixel differences
* [**Repost detection**](/docs/subreddit/components/repost) with support for external services (youtube, etc...)
* Event notification via Discord
* [**Web interface**](#web-ui-and-screenshots) for monitoring, administration, and oauth bot authentication
* [**Placeholders**](/docs/subreddit/actionTemplating.md) (like automoderator) can be configured via a wiki page or raw text and supports [mustache](https://mustache.github.io) templating
* [**Partial Configurations**](/docs/subreddit/components/README.md#partial-configurations) -- offload parts of your configuration to shared locations to consolidate logic between multiple subreddits
* [Guest Access](/docs/subreddit/README.md#guest-access) enables collaboration and easier setup by allowing temporary access
* [Toxic content prediction](/docs/subreddit/components/README.md#moderatehatespeechcom-predictions) using [moderatehatespeech.com](https://moderatehatespeech.com) machine learning model
Feature highlights for **Developers and Hosting (Operators):**
* Easy, UI-based [OAuth authentication](/docs/operator/addingBot.md) for adding Bots and moderator dashboard
* Integration with [InfluxDB](https://www.influxdata.com) for detailed [time-series metrics](/docs/operator/database.md#influx) and a pre-built [Grafana](https://grafana.com) [dashboard](/docs/operator/database.md#grafana)
# Table of Contents
* [How It Works](#how-it-works)
* [Installation](#installation)
* [Configuration](#configuration)
* [Examples](#examples)
* [Usage](#usage)
* [Getting Started](#getting-started)
* [Configuration And Documentation](#configuration-and-documentation)
* [Web UI and Screenshots](#web-ui-and-screenshots)
### How It Works
Each subreddit using the RCB bot configures its behavior via its own wiki page.
When a monitored **Activity** (new comment/submission, new modqueue item, etc.) is detected the bot runs through a list of [**Checks**](/docs/subreddit/components/README.md#checks) to determine what to do with the **Activity** from that Event. Each **Check** consists of:
#### Kind
Is this check for a submission or comment?
#### Rules
A list of [**Rules**](/docs/subreddit/components/README.md#rules) to run against the **Activity**. Triggered Rules can cause the whole Check to trigger and run its **Actions**
#### Actions
A list of [**Actions**](/docs/subreddit/components/README.md#actions) that describe what the bot should do with the **Activity** or **Author** of the activity (comment, remove, approve, etc.). The bot will run **all** Actions in this list.
___
The **Checks** for a subreddit are split up into **Submission Checks** and **Comment Checks** based on their **kind**. Each list of checks is run independently based on when events happen (submission or comment).
When an Event occurs all Checks of that type are run in the order they were listed in the configuration. When one check is triggered (an Action is performed) the remaining checks will not be run.
___

[Learn more about the RCB lifecycle and core concepts in the docs.](/docs/README.md#how-it-works)

## Installation

### Locally

Clone this repository somewhere and then install from the working directory

## Getting Started

See the [Moderator's Getting Started Guide](/docs/subreddit/gettingStarted.md)

## Configuration and Documentation

Context Bot's configuration can be written in YAML (like automoderator) or [JSON5](https://json5.org/). Its schema conforms to [JSON Schema Draft 7](https://json-schema.org/). Additionally, many **operator** settings can be passed via command line or environmental variables.

* For **operators** (running the bot instance) see the [Operator Configuration](/docs/operator/configuration.md) guide
* For **moderators** consult the [app schema and examples folder](/docs/README.md#configuration-and-usage)

I suggest using [Atlassian JSON Schema Viewer](https://json-schema.app/start) ([direct link](https://json-schema.app/view/%23?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Freddit-context-bot%2Fmaster%2Fsrc%2FSchema%2FApp.json)) so you can view all documentation while also interactively writing and validating your config! From there you can drill down into any object, see its requirements, view an example JSON document, and live-edit your configuration on the right-hand side.

[**Check the full docs for in-depth explanations of all concepts and examples**](/docs)
### Examples
## Web UI and Screenshots
Read through the [Examples](/examples) section for a thorough introduction to all the **Rules**, in-depth concepts, and sample configuration files.
### Dashboard
### Action Templating
CM comes equipped with a dashboard designed for use by both moderators and bot operators.
Actions that can submit text (Report, Comment) will have their `content` values run through a [Mustache Template](https://mustache.github.io/). This means you can insert data generated by Rules into your text before the Action is performed.
* Authentication via Reddit OAuth -- only accessible if you are the bot operator or a moderator of a subreddit the bot moderates
* Connect to multiple ContextMod instances (specified in configuration)
* Monitor API usage/rates
* Monitoring and administration **per subreddit:**
* Start/stop/pause various bot components
* View statistics on bot usage (# of events, checks run, actions performed) and cache usage
* View various parts of your subreddit's configuration and manually update configuration
* View **real-time logs** of what the bot is doing on your subreddit
* **Run bot on any permalink**
See here for a [cheatsheet](https://gist.github.com/FoxxMD/d365707cf99fdb526a504b8b833a5b78) and [here](https://www.tsmean.com/articles/mustache/the-ultimate-mustache-tutorial/) for a more thorough tutorial.
All Actions with `content` have access to this data:
### Bot Setup/Authentication
```json5
{
item: {
kind: 'string', // the type of item (comment/submission)
author: 'string', // name of the item author (reddit user)
permalink: 'string', // a url to the item
url: 'string', // if the item is a Submission then its URL (external for link type submission, reddit link for self-posts)
title: 'string', // if the item is a Submission, then the title of the Submission,
botLink: 'string' // a link to the bot's FAQ
},
rules: {
// contains all rules that were run and are accessible using the name, lowercased, with all spaces/dashes/underscores removed
}
}
```

A bot oauth helper allows operators to define oauth credentials/permissions and then generate unique, one-time invite links that allow moderators to authenticate their own bots without operator assistance. [Learn more about using the oauth helper.](docs/operator/addingBot.md#cm-oauth-helper-recommended)

Operator view/invite link generation:
The properties of `rules` are accessible using the name, lower-cased, with all spaces/dashes/underscores removed. If no name is given then `kind` is used as the `name`. Example:

```
"rules": [
{
"name": "My Custom-Recent Activity Rule", // mycustomrecentactivityrule
"kind": "recentActivity"
},
{
// name = repeatactivity
"kind": "repeatActivity",
}
]
```
Moderator view/invite and authorization:
**To see what data is available for individual Rules [consult the schema](#configuration) for each Rule.**

#### Quick Templating Tutorial
A similar helper and invitation experience is available for adding **subreddits to an existing bot.**
As a quick example for how you will most likely be using templating -- wrapping a variable in curly brackets, `{{variable}}`, will cause the variable value to be rendered instead of the brackets:
```
myVariable = 50;
myOtherVariable = "a text fragment"
template = "This is my template, the variable is {{myVariable}}, my other variable is {{myOtherVariable}}, and that's it!";
"This is my template, the variable is 50, my other variable is a text fragment, and that's it!";
```
A built-in editor using [monaco-editor](https://microsoft.github.io/monaco-editor/) makes editing configurations easy:
**Note: When accessing an object or its properties you must use dot notation**
```
const item = {
aProperty: 'something',
anotherObject: {
bProperty: 'something else'
}
}
const content = "My content will render the property {{item.aProperty}} like this, and another nested property {{item.anotherObject.bProperty}} like this."
```
* Automatic JSON or YAML syntax validation and formatting
* Automatic Schema (subreddit or operator) validation
* All properties are annotated via hover popups
* Unauthenticated view via `yourdomain.com/config`
* Authenticated view loads subreddit configurations by simple link found on the subreddit dashboard
* Switch schemas to edit either subreddit or operator configurations
```
-c, --clientId <id> Client ID for your Reddit application (default: process.env.CLIENT_ID)
-e, --clientSecret <secret> Client Secret for your Reddit application (default: process.env.CLIENT_SECRET)
-a, --accessToken <token> Access token retrieved from authenticating an account with your Reddit Application (default: process.env.ACCESS_TOKEN)
-r, --refreshToken <token> Refresh token retrieved from authenticating an account with your Reddit Application (default: process.env.REFRESH_TOKEN)
-s, --subreddits <list...> List of subreddits to run on. Bot will run on all subs it has access to if not defined (default: process.env.SUBREDDITS (comma-separated))
-d, --logDir <dir> Absolute path to directory to store rotated logs in (default: process.env.LOG_DIR || process.cwd()/logs)
-w, --wikiConfig <path> Relative url to contextbot wiki page EX https://reddit.com/r/subreddit/wiki/<path> (default: process.env.WIKI_CONFIG || 'botconfig/contextbot')
--snooDebug Set Snoowrap to debug (default: process.env.SNOO_DEBUG || false)
--authorTTL <ms> Set the TTL (ms) for the Author Activities shared cache (default: process.env.AUTHOR_TTL || 10000)
--heartbeat <s> Interval, in seconds, between heartbeat logs. Set to 0 to disable (default: process.env.HEARTBEAT || 300)
--apiLimitWarning <remaining> When API limit remaining (600/10min) is lower than this value log statements for limit will be raised to WARN level (default: process.env.API_REMAINING || 250)
--dryRun Set dryRun=true for all checks/actions on all subreddits (overrides any existing) (default: process.env.DRYRUN)
--disableCache Disable caching for all subreddits (default: process.env.DISABLE_CACHE || false)
-h, --help display help for command
Commands:
  run                                          Runs bot normally
  check [options] <activityIdentifier> [type]  Run check(s) on a specific activity
  unmoderated [options] <subreddits...>        Run checks on all unmoderated activity in the modqueue
  help [command]                               display help for command
```

* Overall stats (active bots/subreddits, api calls, per second/hour/minute activity ingest)
* Over time graphs for events, per subreddit, and for individual rules/check/actions
### Logging
### Reddit App
To use this bot you must do two things:
* Create a reddit application
* Authenticate that application to act as a user (login to the application with an account)
#### Create Application
Visit [your reddit preferences](https://www.reddit.com/prefs/apps) and at the bottom of the page go through the **create an(other) app** process.
* Choose **script**
* For redirect uri use https://not-an-aardvark.github.io/reddit-oauth-helper/
* Write down your **Client ID** and **Client Secret** somewhere
* Input your **Client ID** and **Client Secret** in the text boxes with those names.
* Choose scopes. **It is very important you check everything on this list or Context Bot will not work correctly**
* edit
* flair
* history
* identity
* modcontributors
* modflair
* modposts
* modself
* mysubreddits
* read
* report
* submit
* wikiread
* wikiedit (if you are using Toolbox User Notes)
* Click **Generate tokens**. You will get a popup asking you to approve access (or login) -- **the account you approve access with is the account the Bot will control.**
* After approving, an **Access Token** and **Refresh Token** will be shown at the bottom of the page. Write these down.
You should now have all the information you need to start the bot.
"description":"Refresh token retrieved from authenticating an account with your Reddit Application",
"value":"",
"required":true
"required":false
},
"ACCESS_TOKEN":{
"description":"Access token retrieved from authenticating an account with your Reddit Application",
"value":"",
"required":true
"required":false
},
"REDIRECT_URI":{
"description":"Redirect URI you specified when creating your Reddit Application. Required if you want to use the web interface. In the provided example replace 'your-heroku-app-name' with the name of your HEROKU app.",
* [Configuration and Usage](#configuration-and-usage)
* FAQ
## Getting Started
Review **at least** the **How It Works** and **Concepts** below, then:
* For **Operators** (running a bot instance) refer to [**Operator Getting Started**](/docs/operator/gettingStarted.md) guide
* For **Moderators** (configuring an existing bot for your subreddit) refer to the [**Moderator Getting Started**](/docs/subreddit/gettingStarted.md) guide
## How It Works
Where possible Context Mod (CM) uses the same terminology as, and emulates the behavior of, **automoderator** so if you are familiar with that then much of this may seem familiar to you.
### Diagram
Expand the section below for a simplified flow diagram of how CM processes an incoming Activity. Then refer to the text description of the diagram below, as well as [Concepts](#Concepts), for descriptions of individual components.
#### 1) A new Activity in your subreddit is received by CM
The Activities CM watches for are configured by you. These can be new modqueue/unmoderated items, submissions, or comments.
#### 2) CM sequentially processes each Run in your configuration
A [**Run**](#Runs) is made up of a set of [**Checks**](#Checks)
#### 3) CM sequentially processes each Check in the current Run
A **Check** is a set of:
* One or more [**Rules**](#Rule) that define what conditions should **trigger** this Check
* One or more [**Actions**](#Action) that define what the bot should do once the Check is **triggered**
#### 4) Each Check is processed, *in order*, until a Check is **triggered**
In CM's default configuration, once a Check is **triggered** no more Checks will be processed. This means all subsequent Checks in this Run (in the order you listed them) are skipped.
#### 5) All Actions from the triggered Check are executed
After all **Actions** from the triggered **Check** are executed CM begins processing the next **Run**
#### 6) Rinse and Repeat from #3
Until all Runs have been processed.
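As a rough sketch of how this hierarchy maps onto a subreddit configuration (the structure below is illustrative only -- see the component documentation for complete, real examples):

```yaml
# Illustrative only -- property names/values are examples, not a complete working config
runs:
  - name: exampleRun
    checks:
      - name: flagSelfPromotion
        kind: submission          # this Check only processes Submissions
        rules:
          - kind: recentActivity  # the Check triggers if its Rules are satisfied
            # ...rule-specific criteria
        actions:
          - kind: report          # every listed Action runs once the Check triggers
            # ...action-specific options
```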
## Concepts
Core, high-level concepts regarding how CM works.
### Event
An **Event** refers to the [Activity](#activity) (Comment or Submission) CM receives to process as well as the results of processing that Activity.
### Activity
An Activity is a Comment or Submission from Reddit.
### Runs
A **Run** is made up of a set of **Checks** that represent a group of related behaviors the bot should check for or perform, independent of any other behaviors the Bot should perform.
An example of Runs:
* A group of Checks that look for missing flairs on a user or a new submission and flair accordingly
* A group of Checks that detect spam or self-promotion and then remove those activities
Both groups of Checks are independent of each other (don't have any patterns or actions in common).
[Full Documentation for Runs](/docs/subreddit/components/README.md#runs)
### Checks
A **Check** is the main logical unit of behavior for the bot. It is equivalent to "if X then Y". A Check is composed of:
* One or more **Rules** that are tested against an **Activity**
* One or more **Actions** that are performed when the **Rules** are satisfied
A Run can be made up of one or more **Checks** that are processed **in the order they are listed in the Run.**
Once a Check is **triggered** (its Rules are satisfied and Actions performed) all subsequent Checks are skipped.
[Full Documentation for Checks](/docs/subreddit/components/README.md#checks)
### Rule
A **Rule** is some set of **criteria** (conditions) that are tested against an Activity (comment/submission), a User, or a User's history. A Rule is considered **triggered** when the **criteria** for that rule are found to be **true** for whatever is being tested against.
CM has different **Rules** that can test against different types of behavior and aspects of a User, their history, and the Activity (submission/comment) being checked.
[Full Documentation for Rules](/docs/subreddit/components/README.md#rules)
#### Available Rules
All available rules can be found in the [components documentation](/docs/subreddit/components/README.md#rules)
### Rule Set
A **Rule Set** is a "grouped" set of `Rules` with a **trigger condition** specified.
Rule Sets can be used interchangeably with other **Rules** and **Rule Sets** in the `rules` list of a **Check**.
They allow you to create more complex trigger behavior by combining multiple rules into one "parent rule".
### Action

An **Action** is some action the bot can take against the checked Activity (comment/submission) or Author of the Activity. CM has Actions for most things a normal reddit user or moderator can do.
### Filters

**Runs, Checks, Rules, and Actions** all have two additional (optional) criteria "pre-tests". These tests are different from rules/checks in these ways:
* Filters test against the **current state** of the Activity, the Author of the Activity, or the Subreddit of the Activity -- rather than looking at history/context/etc...
* Filter test results only determine if the Run, Check, Rule, or Action **should run** -- rather than triggering it
* When the filter test **passes** the thing being tested continues to process as usual
* When the filter test **fails** the thing being tested **fails**.
[Full Documentation for Filters](/docs/subreddit/components/README.md#filters)
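As a rough sketch (property names here are illustrative -- consult the Filters documentation above for the actual schema), a Check with an author filter that skips moderators might look like:

```yaml
# Illustrative only -- check the Filters documentation for real property names
checks:
  - name: exampleCheck
    kind: submission
    # filter: only run this Check when the Author is NOT a moderator
    authorIs:
      exclude:
        - isMod: true
    rules:
      # ...Rules are only evaluated if the filter passes
    actions:
      # ...Actions only run if the Check triggers
```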
## Configuration And Usage
* For **Operator/Bot maintainers** see **[Operation Guide](/docs/operator/README.md)**
* For **Moderators**
  * Start with the [Subreddit/Moderator docs](/docs/subreddit/README.md) or [Moderator Getting Started guide](/docs/subreddit/gettingStarted.md)
  * Refer to the [Subreddit Components Documentation](/docs/subreddit/components) or the [subreddit-ready examples](/docs/subreddit/components/subredditReady)
  * as well as the [schema](https://json-schema.app/view/%23?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FApp.json)
Use [act](https://github.com/nektos/act) to run Github actions locally.
An example secrets file can be found in the project working directory at [act.env.example](act.env.example)
Modify [push-hook-sample.json](.github/push-hook-sample.json) to point to the local branch you want to run a `push` event trigger on, then run this command from the project working directory:
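The exact command is not reproduced here, but a typical `act` invocation for this setup might look like the following (flags may vary between `act` versions):

```bash
# assumes act.env was created from act.env.example
act push --eventpath .github/push-hook-sample.json --secret-file act.env
```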
Easiest way is to install the [docker container](https://www.mock-server.com/mock_server/running_mock_server.html#pull_docker_image) ([from here](https://hub.docker.com/r/mockserver/mockserver))
Map port `1080:1080` -- acts as both the proxy port and the UI endpoint with the below URL:
```
http(s)://localhost:1080/mockserver/dashboard
```
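For example, a minimal way to stand up MockServer with Docker (assuming Docker is installed; image tag left at latest):

```bash
docker run -d --rm -p 1080:1080 mockserver/mockserver
```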
In your [operator configuration](/docs/operator/operatorConfiguration.md) define a proxy for snoowrap at the top-level:
```yaml
snoowrap:
  proxy: 'http://localhost:8010'
  #debug: true # optionally set debug to true to make snoowrap requests output to log
```
## Usage
### Forwarding Requests (Monitoring Behavior)
This is what will make MockServer act as an actual **proxy server**. In this state CM will operate normally. In the MockServer UI you will be able to monitor all requests/responses made.
```HTTP
PUT /mockserver/expectation HTTP/1.1
Host: localhost:8010
Content-Type: application/json
Content-Length: 155
```
<details>
<summary>CURL</summary>
```bash
curl --location --request PUT 'http://localhost:8010/mockserver/expectation' \
--header 'Content-Type: application/json' \
--data-raw '{
"httpRequest": {},
"priority": 0,
"httpForward": {
"host": "oauth.reddit.com",
"port": 443,
"scheme": "HTTPS"
}
}'
```
</details>
### Mocking Network Issues
MockServer is a bit confusing and regex matching for specific paths doesn't work well (at least for me).
The lifecycle I use for a mock call:
* Make sure [forwarding](#forwarding-requests-monitoring-behavior) is set, to begin with
* Breakpoint before the code you want to test with mocking
* [Mock the network issue](#create-network-issue-behavior)
* Once the mock behavior should be "done" then
* [Clear all exceptions](#clearing-behavior)
* Set [forwarding behavior](#forwarding-requests-monitoring-behavior) again
### Create Network Issue Behavior
#### All Responses return 403
<details>
<summary>HTTP</summary>
```HTTP
PUT /mockserver/expectation HTTP/1.1
Host: localhost:8010
Content-Type: application/json
Content-Length: 1757
```
</details>
<details>
<summary>CURL</summary>
```bash
curl --location --request PUT 'http://localhost:8010/mockserver/expectation' \
```
ContextMod supports comparing image content, for the purpose of detecting duplicates, with two different but complementary systems. Image comparison behavior is available for the following rules:
To enable comparisons reference the example below (at the top-level of your rule) and configure as needed:
JSON
```json5
{
"name": "ruleWithImageDetection",
"kind": "recentActivity",
// Add block below...
//
"imageDetection": {
// enables image comparison
"enable": true,
// The difference, in percentage, between the reference submission and the submissions being checked
// must be less than this number to consider the images "the same"
"threshold": 5,
// optional
// set the behavior for determining if image comparison should occur on a URL:
//
// "extension" => try image detection if URL ends in a known image extension (jpeg, gif, png, bmp, etc.)
// "unknown" => try image detection if URL ends in known image extension OR there is no extension OR the extension is unknown (not video, html, doc, etc...)
// if fetchBehavior is not defined then "extension" is the default
"fetchBehavior": "extension",
},
//
// And above ^^^
//...
}
```
YAML
```yaml
name: ruleWithImageDetection
kind: recentActivity
imageDetection:
  enable: true
  threshold: 5
  fetchBehavior: extension
```
**Perceptual Hashing** (`hash`) and **Pixel Comparisons** (`pixel`) may be used at the same time. Refer to the documentation below to see how they interact.
**Note:** Regardless of `fetchBehavior`, if the response from the URL does not indicate it is an image then image detection will not occur. IE Response `Content-Type` must contain `image`
## Prerequisites
Both image comparison systems require [Sharp](https://sharp.pixelplumbing.com/) as a dependency. Most modern operating systems running Node.js >= 12.13.0 do not require installing additional dependencies in order to use Sharp.
If you are using the docker image for ContextMod (`foxxmd/context-mod`) Sharp is built-in.
If you are installing ContextMod using npm then **Sharp should be installed automatically as an optional dependency.**
**If you do not want to install it automatically** install ContextMod with the following command:
```
npm install --no-optional
```
If you are using ContextMod as part of a larger project you may want to require Sharp in your own package:
```
npm install sharp@0.29.1 --save
```
# Comparison Systems
## Perceptual Hashing
[Perceptual Hashing](https://en.wikipedia.org/wiki/Perceptual_hashing) creates a text fingerprint of an image by:
* Dividing up the image into a grid
* Using an algorithm to derive a value from the pixels in each grid
* Adding up all the values to create a unique string (the "fingerprint")
An example of how a perceptual hash can work [can be found here.](https://www.hackerfactor.com/blog/?/archives/432-Looks-Like-It.html)
ContextMod uses [blockhash-js](https://github.com/commonsmachinery/blockhash-js) which is a javascript implementation of the algorithm described in the paper [Block Mean Value Based Image Perceptual Hashing by Bian Yang, Fan Gu and Xiamu Niu.](https://ieeexplore.ieee.org/document/4041692)
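As a rough illustration of the general idea only (this is a toy average-hash sketch, **not** the blockhash algorithm CM actually uses):

```typescript
// Toy "average hash" sketch -- grid the image, reduce each cell to a value,
// and concatenate the results into a fingerprint. Illustrative only.
function toyHash(pixels: number[][], gridSize = 8): string {
  const cellH = Math.floor(pixels.length / gridSize);
  const cellW = Math.floor(pixels[0].length / gridSize);
  const means: number[] = [];

  // mean brightness of each grid cell
  for (let gy = 0; gy < gridSize; gy++) {
    for (let gx = 0; gx < gridSize; gx++) {
      let sum = 0;
      for (let y = gy * cellH; y < (gy + 1) * cellH; y++) {
        for (let x = gx * cellW; x < (gx + 1) * cellW; x++) {
          sum += pixels[y][x];
        }
      }
      means.push(sum / (cellH * cellW));
    }
  }

  // each cell contributes one bit: brighter or darker than the median cell
  const median = [...means].sort((a, b) => a - b)[Math.floor(means.length / 2)];
  return means.map(m => (m > median ? '1' : '0')).join('');
}

// "difference" between two fingerprints = percentage of differing bits (Hamming distance)
function hashDifference(a: string, b: string): number {
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) diff++;
  }
  return (diff / a.length) * 100;
}
```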
**Advantages**
* Low memory requirements and not CPU intensive
* Does not require any image transformations
* Hash results can be stored to make future comparisons even faster and skip downloading images (cached by url)
* Resolution-independent
**Disadvantages**
* Hash is weak when image differences are based only on color
JSON
```json5
{
  "name": "ruleWithImageDetectionAndConfiguredHashing",
  "kind": "recentActivity",
  "imageDetection": {
    "enable": true,
    "hash": {
      // enable or disable hash comparisons (enabled by default)
      "enable": true,
      // determines accuracy of hash and granularity of hash comparison (comparison to other hashes)
      // the higher the bits the more accurate the comparison
      //
      // NOTE: Hashes of different sizes (bits) cannot be compared. If you are caching hashes make sure all rules where results may be shared use the same bit count to ensure hashes can be compared. Otherwise hashes will be recomputed.
      "bits": 32, // default is 32 if not defined
      //
      // number of seconds to cache an image hash
      "ttl": 60, // default is 60 if not defined
      //
      // "High Confidence" Threshold
      // If the difference in comparison is equal to or less than this number the images are considered the same and pixel comparison WILL NOT occur
      //
      // Defaults to the parent-level `threshold` value if not present
      //
      // Use null if you want pixel comparison to ALWAYS occur (softThreshold must be present)
      "hardThreshold": 5,
      //
      // "Low Confidence" Threshold -- only used if `pixel` is enabled
      // If the difference in comparison is:
      //
      // 1) equal to or less than this value and
      // 2) the value is greater than `hardThreshold`
      //
      // the images will be compared using the `pixel` method
      "softThreshold": 0,
    },
    //
    // And above ^^^
    //"pixel": {...}
  }
  //...
}
```
YAML
```yaml
name: ruleWithImageDetectionAndConfiguredHashing
kind: recentActivity
imageDetection:
  enable: true
  hash:
    enable: true
    bits: 32
    ttl: 60
    hardThreshold: 5
    softThreshold: 0
```
## Pixel Comparison
This approach is as straightforward as it sounds. Both images are compared, pixel by pixel, to determine the difference between the two. ContextMod uses [pixelmatch](https://github.com/mapbox/pixelmatch) to do the comparison.
**Advantages**
* Extremely accurate, high-confidence on difference percentage
* Strong when comparing text-based images or color-only differences
**Disadvantages**
* High memory requirements (10-30MB per comparison) and CPU intensive
* Weak against similar images with different aspect ratios
* Requires image transformations (resize, crop) before comparison
* Can only store image-to-image results (no single image fingerprints)
**When should I use it?**
* Require very high accuracy in comparison results
* Comparing mostly text-based images or subtle color/detail differences
* As a secondary, high-confidence confirmation of comparison result after hashing
### How To Use
By default pixel comparisons **are not enabled.** They must be explicitly enabled in configuration.
Pixel comparisons will be performed in either of these scenarios:
* pixel is enabled, hashing is enabled, and `hash.softThreshold` is defined
  * When a comparison occurs that is less different than `softThreshold` but more different than `hardThreshold` (or `"hardThreshold": null`), then pixel comparison will occur as a high-confidence check
  * Example
    * hash comparison => 7% difference
    * `"softThreshold": 10`
    * `"hardThreshold": 4`
* `hash.enable` is `false` and `pixel.enable` is `true`
  * hashing is skipped entirely and only pixel comparisons are performed
To configure pixel comparisons refer to this code block:
```json5
{
"name": "ruleWithImageDetectionAndPixelEnabled",
"kind": "recentActivity",
"imageDetection": {
//"hash": {...}
"pixel": {
// enable or disable pixel comparisons (disabled by default)
"enable": true,
// if the comparison difference percentage is equal to or less than this value the images are considered the same
//
// if not defined the value from imageDetection.threshold will be used
CM is composed of two applications that operate independently but are packaged together such that they act as one piece of software:
* **Server** -- Responsible for **running the bot(s)** and providing an API to retrieve information on and interact with them EX start/stop bot, reload config, retrieve operational status, etc.
* **Client** -- Responsible for serving the **web interface** and handling the bot oauth authentication flow between operators and subreddits/bots.
Both applications authenticate, and are primarily operated, by using [Reddit's API through OAuth.](https://github.com/reddit-archive/reddit/wiki/OAuth2) The **Client** uses OAuth to verify the identity of moderators logging into the web interface. The **Server** uses oauth tokens to interact with Reddit's API and operate all the configured bots.
In its default mode of operation CM takes care of all the interaction between **Server** and **Client** for you so that you can effectively treat it as a monolithic application. Learn more about CM's architecture and other operation modes in the [Server-Client Architecture documentation.](/docs/serverClientArchitecture.md)
The [Getting Started](/docs/operator/gettingStarted.md) guide serves as a straight-forward "how-to" for standing up a CM server from scratch with minimal explanation.
# [Installation](/docs/operator/installation.md)
CM has many installation options:
* Locally, from source, as a typescript project
* Built/pulled from a Docker image hosted on Dockerhub
* Deployed to Heroku with a Quick Deploy template (experimental)
Refer to the [Installation](/docs/operator/installation.md) docs for more information.
# Provisioning A Reddit Client
As mentioned in the [Overview](#overview), CM operates primarily using Reddit's API through OAuth. You must create a [Reddit Client](https://github.com/reddit-archive/reddit/wiki/OAuth2#getting-started) in order to interact with the API.
## Create Application
Visit [your reddit preferences](https://www.reddit.com/prefs/apps) and at the bottom of the page go through the **create an(other) app** process.
* Give it a **name**
* Choose **web app**
* If you know what you will use for **redirect uri** go ahead and use it, otherwise use `http://localhost:8085/callback`
Click **create app**.
Then write down your **Client ID, Client Secret, and Redirect Uri** somewhere
* A normal **reddit account** like `u/MyRedditAccount`
* Software that performs actions **on behalf of that reddit account** using Reddit's API
There is nothing special about the account! What's special is how it's used -- through the API *with bot software* like ContextMod.
# Prerequisites
These things need to be done before a Bot can be added to CM:
* [Provisioned a Reddit Client](/docs/operator/README.md#provisioning-a-reddit-client)
* You or the person who controls the Bot account must have account credentials (username/password). Logging in to reddit is part of the setup process.
* If the bot does not exist **create a reddit account for it.**
* If the bot does exist make sure you are in communication with the owner of the account.
# Adding A Bot to CM
## CM OAuth Helper (Recommended)
This method will use CM's built-in oauth flow. It is recommended because:
* It's easy!
* Will ensure your bot is authenticated with the correct oauth permissions
### Start CM with the Minimum Configuration (Initial Setup)
If this is your **first time adding a bot** you must make sure you have
* done the [prerequisites](#prerequisites)
* created a [minimum operator configuration](/docs/operator/configuration.md#minimum-config)
  * that specifies the client id/secret from provisioning your reddit client
* specified **Operator Name** in the configuration
It is important you define **Operator Name** because the auth route is **protected.** You must login to CM's web interface in order to access the route.
### Create A Bot Invite
Open the CM web interface (default is [http://localhost:8085](http://localhost:8085)) and login with the reddit account specified in **Operator Name.**
If this is your first time setting up a bot you should be automatically redirected to the auth page. Otherwise, visit [http://localhost:8085/auth/helper](http://localhost:8085/auth/helper)
Follow the directions in the helper to create a **Bot Invite Link.**
### Onboard the Bot
Visit the **Bot Invite Link** while **logged in to reddit as the bot account** to begin the onboarding process. Refer to the [Onboarding Your Bot]() subreddit documentation for more information on this process.
At the end of the onboarding process the bot should be automatically added to your operator configuration. If there is an issue with automatically adding it then the oauth credentials will be displayed at the end of onboarding and can be [manually added to the configuration.](/docs/operator/configuration.md#manually-adding-a-bot)
## Aardvark OAuth Helper
This method should only be used if you cannot use the [CM OAuth Helper method.](#cm-oauth-helper-recommended)
* Visit [https://not-an-aardvark.github.io/reddit-oauth-helper/](https://not-an-aardvark.github.io/reddit-oauth-helper/) and follow the instructions given.
* **Note:** You will need to update the **redirect uri** you set when [provisioning your reddit client.](/docs/operator/README.md#provisioning-a-reddit-client)
* Input your **Client ID** and **Client Secret** in the text boxes with those names.
* Choose scopes. **It is very important you check everything on this list or CM may not work correctly**
* edit
* flair
* history
* identity
* modcontributors
* modflair
* modposts
* modself
* modnote
* mysubreddits
* read
* report
* submit
* wikiread
* wikiedit (if you are using Toolbox User Notes)
* Click **Generate tokens**. You will get a popup asking you to approve access (or login) -- **the account you approve access with is the account the Bot will control.**
* After approving, an **Access Token** and **Refresh Token** will be shown at the bottom of the page. Use these to [manually add a bot to your operator configuration.](/docs/operator/configuration.md#manually-adding-a-bot)
* After adding the bot you will need to restart CM.
**Caching** is a major factor in CM's performance and optimization of Reddit API usage. Leveraging caching effectively in your operator configuration and in individual subreddit configurations can make or break your CM instance.
### What Is Cache?
A **Cache** is a storage medium with high **write** and **read** speed that is generally used to store **temporary, but frequently accessed data.**
## How CM Uses Caching
CM primarily **caches** two types of data:
### Reddit API Calls
#### How Reddit's API Works
In order to communicate with Reddit to retrieve posts, comments, user information, etc... CM uses API calls. Each API call is composed of:
* **request** -- CM asks Reddit for certain information
* **response** -- Reddit responds with the requested information
[Reddit imposes an **api quota**](https://github.com/reddit-archive/reddit/wiki/API#rules) on every **individual account** using the API through an application. This quota is **600 requests per 10 minutes.** At the end of the 10-minute period the quota is reset.
Additionally, some API calls have limits on how much data they can return. The most relevant of these is that **user history can only be retrieved 100 activities (submissions/comments) at a time per API call**. EX if you want to get **500** activities from a user's history you will need to make **5** api calls.
#### Caching API Responses
In order to most effectively use the limited quota of API calls CM will **automatically cache API responses based on the "fingerprint" of the request sent.**
On an individual "item" basis that means these resources are always cached:
* General user information (name, karma, age, profile description, etc..)
* General subreddit information (name, nsfw, quarantined, etc...)
Additionally (and most importantly), responses for **user history** are cached **based on what was requested**. Example "fingerprint":
* username
* type of activities to retrieve (overview, only submissions, only comments)
* range of activities to retrieve (last 100, last 6 months, etc...)
If the above "fingerprint" is used in three different Rules then:
* First fingerprint appearance -> CM makes an API call and caches the response
* Second fingerprint appearance -> CM uses cached response
* Third fingerprint appearance -> CM uses cached response
So only **one** API call is made even though the history is used three times.
It is therefore **important to re-use window criteria** wherever possible to take advantage of this caching.
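For example, two rules that request the author's history with an identical window will share one API fetch (a sketch -- the rule-specific properties are illustrative, see the Rules documentation for real options):

```yaml
# Illustrative only -- the point is that both rules use the same window,
# so the author's history is fetched from the API once and served from cache the second time
rules:
  - name: recentFreeKarma
    kind: recentActivity
    window: 100   # last 100 activities
    # ...rule-specific criteria
  - name: repeatedContent
    kind: repeatActivity
    window: 100   # identical window => identical "fingerprint" => cached response re-used
    # ...rule-specific criteria
```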
### Rules and Filters
Once CM has processed a Rule or Filter (`itemIs` or `authorIs`) the results of that component are stored in cache. For Rules the result is stored for the lifecycle of the Activity being processed and then discarded. For Filters the result is stored for a short time in cache and can be re-used by other Activities.
Re-using Rules and Filters by either using the exact same configuration or by using **names** provides:
* A major performance benefit since these do not need to be re-calculated
* A low-to-medium optimization on API caching, depending on what criteria are being tested.
In general re-use should always be a goal.
# Configuration
## Cache Provider
CM supports two cache **providers**. By default all caches use the `memory` provider:
* `memory` -- in-memory (non-persistent) backend
  * Cache will be lost when CM is restarted/exits
* `redis` -- [Redis](https://redis.io/) backend
Each `provider` object in configuration can be specified as:
* one of the above **strings** to use the **defaults settings** or
* an **object** with keys to override default settings
[Refer to full documentation on cache providers in the schema](https://json-schema.app/view/%23/%23%2Fdefinitions%2FOperatorCacheConfig/%23%2Fdefinitions%2FCacheOptions?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FOperatorConfig.json)
Some examples:
```json5
{
"provider": {
"store": "memory", // one of "memory" or "redis"
"ttl": 60, // the default max age of a key in seconds
"max": 500, // the maximum number of keys in the cache (for "memory" only)
// the below properties only apply to 'redis' provider
"host": 'localhost',
"port": 6379,
"auth_pass": null,
"db": 0,
}
}
```
YAML
```yaml
provider:
  store: redis
  ttl: 60
  max: 500
  host: localhost
  port: 6379
  auth_pass: null
  db: 0
```
Providers can be specified in multiple locations, with each more specific location overriding the parent-level config:
* top-level config
* in individual bot configurations
* in the web config
```yaml
operator:
  name: example
# top level
caching:
  provider:
    ...
bots:
  - name: u/MyBot
    # overrides top level
    caching:
      provider:
        ...
web:
  # overrides top level
  caching:
    provider:
      ...
```
## Cache TTL
The **Time To Live (TTL)** -- how long data may live in the cache before "expiring" -- can be controlled independently for different data types. Sane defaults are provided for all types but tweaking these can improve API caching optimization depending on the subreddit's configuration (use case).
Each of these can be specified in the `caching` property. TTL is measured in seconds.
* `authorTTL` (default 60) -- user activity (overview, submissions, comments)
* `filterCriteriaTTL` (default 60) -- filter results (`itemIs` and `authorIs`)
* `selfTTL` (default 60) -- actions performed by the bot (creating comment reply, report). If an action is found in the cache during polling the bot ignores the related activity.
  * This helps prevent the bot from reacting to things it did itself IE you don't want it to remove a comment because it reported the comment itself
* `subredditTTL` (default 60) -- general information on fetched subreddits
* `userNotesTTL` (default 300) -- amount of time [Toolbox User Notes](https://www.reddit.com/r/toolbox/wiki/docs/usernotes) are cached
* `wikiTTL` (default 300) -- wiki pages used for content (in messages, reports, bans, etc...) are cached for this amount of time
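Putting these together, a `caching` block that simply restates the documented defaults would look something like this (a sketch using the property names listed above):

```yaml
caching:
  provider: memory
  authorTTL: 60
  filterCriteriaTTL: 60
  selfTTL: 60
  subredditTTL: 60
  userNotesTTL: 300
  wikiTTL: 300
```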
The **Operator** configuration refers to the configuration used to configure the actual application/bot. This is different
from the **Subreddit** configuration that is defined in each Subreddit's wiki and determines the rules/actions for
activities the Bot runs on.
**The full documentation** for all options in the operator configuration can be found [**here in the operator schema.**](https://json-schema.app/view/%23?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FOperatorConfig.json)
CM can be configured using **any or all** of the approaches below. **It is recommended to use FILE ([File Configuration](#file-configuration-recommended))**
Any values defined at a **lower-listed** level of configuration will override any values from a higher-listed
configuration.
* **ENV** -- Environment variables loaded from an [`.env`](https://github.com/toddbluhm/env-cmd) file (path may be specified with `--file` cli argument)
* **ENV** -- Any already existing environment variables (exported on command line/terminal profile/etc.)
* **FILE** -- Values specified in a YAML/JSON configuration file using the structure [in the schema](https://json-schema.app/view/%23?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FOperatorConfig.json)
  * When reading the **schema**, if the variable is available at a level of configuration other than **FILE** it will be noted with the same symbol as above. The value shown is the default.
* **ARG** -- Values specified as CLI arguments to the program (see [CLI Usage](#cli-usage) below)
## File Configuration (Recommended)
Using a file has many benefits over using ARG or ENV:
* CM can automatically update your configuration
* CM can automatically add bots via the [CM OAuth Helper](/docs/operator/addingBot.md#cm-oauth-helper-recommended)
* CM has a built-in configuration editor that can help you build and validate your configuration file
* File config is **required** if adding multiple bots to CM
### Specify File Location
By default CM will look for `config.yaml` or `config.json` in the `DATA_DIR` directory:
* [Local installation](/docs/operator/installation.md#locally) -- `DATA_DIR` is the root of your installation directory (same folder as `package.json`)
* [Docker](/docs/operator/installation.md#docker-recommended) -- `DATA_DIR` is at `/config` in the container
The `DATA_DIR` directory can be changed by passing `DATA_DIR` as an environmental variable EX `DATA_DIR=/path/to/directory`
The name of the config file can be changed by passing `OPERATOR_CONFIG` as an environmental variable:
* As filename -- `OPERATOR_CONFIG=myConfig.yaml` -> CM looks for `/path/to/directory/myConfig.yaml`
* As absolute path -- `OPERATOR_CONFIG=/a/path/myConfig.yaml` -> CM looks for `/a/path/myConfig.yaml`
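For example, a local installation could be pointed at a custom config location like this (paths are illustrative):

```bash
# CM will look for /srv/context-mod/myConfig.yaml
DATA_DIR=/srv/context-mod OPERATOR_CONFIG=myConfig.yaml node src/index.js run
```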
[**Refer to the Operator Config File Schema for full documentation**](https://json-schema.app/view/%23?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FOperatorConfig.json)
### Defining Multiple Bots or CM Instances
One ContextMod instance can
* Run multiple bots (multiple reddit accounts -- each as a bot)
* Connect to many other, independent, ContextMod instances
However, the default configuration (using **ENV/ARG**) assumes your intention is to run one bot (one reddit account) on one CM instance without these additional features. This is to make this mode of operation easier for users with this intention.
To take advantage of these additional features you **must** use a **FILE** configuration. Learn about how this works and how to configure this scenario in the [Architecture Documentation.](/docs/serverClientArchitecture.md)
## CLI Usage
Running CM from the command line is accomplished with the following command:
```bash
node src/index.js run
```
Run `node src/index.js run help` to get a list of available command line options (denoted by **ARG** above):
<details>
```
Usage: index [options] [command]
Options:
-h, --help display help for command
Commands:
run [options] [interface] Monitor new activities from configured subreddits.
check [options] <activityIdentifier> [type] Run check(s) on a specific activity
unmoderated [options] <subreddits...> Run checks on all unmoderated activity in the modqueue
help [command] display help for command
Options:
-c, --operatorConfig <path> An absolute path to a JSON file to load all parameters from (default: process.env.OPERATOR_CONFIG)
-i, --clientId <id> Client ID for your Reddit application (default: process.env.CLIENT_ID)
-e, --clientSecret <secret> Client Secret for your Reddit application (default: process.env.CLIENT_SECRET)
-a, --accessToken <token> Access token retrieved from authenticating an account with your Reddit Application (default: process.env.ACCESS_TOKEN)
-r, --refreshToken <token> Refresh token retrieved from authenticating an account with your Reddit Application (default: process.env.REFRESH_TOKEN)
-u, --redirectUri <uri> Redirect URI for your Reddit application (default: process.env.REDIRECT_URI)
-t, --sessionSecret <secret> Secret used to encrypt session id/data (default: process.env.SESSION_SECRET || a random string)
-s, --subreddits <list...> List of subreddits to run on. Bot will run on all subs it has access to if not defined (default: process.env.SUBREDDITS)
-d, --logDir [dir] Absolute path to directory to store rotated logs in. Leaving undefined disables rotating logs (default: process.env.LOG_DIR)
-l, --logLevel <level> Minimum level to log at (default: process.env.LOG_LEVEL || verbose)
-w, --wikiConfig <path> Relative url to contextbot wiki page EX https://reddit.com/r/subreddit/wiki/<path> (default: process.env.WIKI_CONFIG || 'botconfig/contextbot')
--snooDebug Set Snoowrap to debug. If undefined will be on if logLevel='debug' (default: process.env.SNOO_DEBUG)
--authorTTL <s> Set the TTL (seconds) for the Author Activities shared cache (default: process.env.AUTHOR_TTL || 60)
--heartbeat <s> Interval, in seconds, between heartbeat checks. (default: process.env.HEARTBEAT || 300)
--softLimit <limit> When API limit remaining (600/10min) is lower than this subreddits will have SLOW MODE enabled (default: process.env.SOFT_LIMIT || 250)
--hardLimit <limit> When API limit remaining (600/10min) is lower than this all subreddit polling will be paused until api limit reset (default: process.env.SOFT_LIMIT || 250)
--dryRun Set all subreddits in dry run mode, overriding configurations (default: process.env.DRYRUN || false)
--proxy <proxyEndpoint> Proxy Snoowrap requests through this endpoint (default: process.env.PROXY)
--operator <name...> Username(s) of the reddit user(s) operating this application, used for displaying OP level info/actions in UI (default: process.env.OPERATOR)
--operatorDisplay <name> An optional name to display who is operating this application in the UI (default: process.env.OPERATOR_DISPLAY || Anonymous)
-p, --port <port> Port for web server to listen on (default: process.env.PORT || 8085)
-q, --shareMod If enabled then all subreddits using the default settings to poll "unmoderated" or "modqueue" will retrieve results from a shared request to /r/mod (default: process.env.SHARE_MOD || false)
-h, --help display help for command
```
</details>
# Minimum Configuration
The minimum configuration required to run CM assumes you have no bots and want to use CM to [add your first bot.](/docs/operator/addingBot.md#cm-oauth-helper-recommended)
You will need to have this information available:
* From [provisioning a reddit client](/docs/operator/README.md#provisioning-a-reddit-client)
* Client ID
* Client Secret
* Redirect URI (if different from default `http://localhost:8085/callback`)
* Operator Name -- username of the reddit account you want to use to administer CM with
See the [**example minimum configuration** below.](#minimum-config)
This configuration can also be **generated** by CM if you start CM with **no configuration defined** and visit the web interface.
# Bots
Configured using the `bots` top-level property. Bot configuration can override and specify many more options than are available at the operator-level. Many of these can also set the defaults for each subreddit the bot runs on:
* Of the subreddits this bot moderates, specify a subset of subreddits to run or exclude from running
* default caching behavior
* control the soft/hard api usage limits
* Flow Control defaults
* Filter Criteria defaults
* default Polling behavior
[Full documentation for all bot instance options can be found in the schema.](https://json-schema.app/view/%23/%23%2Fdefinitions%2FBotInstanceJsonConfig?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FOperatorConfig.json)
## Adding A Bot
If you use the [CM OAuth Helper](/docs/operator/addingBot.md#cm-oauth-helper-recommended) and it works successfully then the configuration for the Bot will be automatically added.
### Manually Adding a Bot
Add a new *object* to the `bots` property at the top-level of your configuration. If `bots` does not exist create it now.
Minimum information required for a valid bot:
* Client Id
* Client Secret
* Refresh Token
* Access Token
<details>
<summary>Example</summary>
```yaml
operator:
  name: YourRedditUsername
bots:
  - name: u/MyRedditBot # name is optional but highly recommended for readability in both config and web interface
    credentials:
      reddit:
        clientId: f4b4df1c7b2
        clientSecret: 34v5q1c56ub
        accessToken: 34_f1w1v4
        refreshToken: p75_1c467b2
web:
  credentials:
    clientId: f4b4df1c7b2
    clientSecret: 34v5q1c56ub
    redirectUri: 'http://localhost:8085/callback'
```
</details>
# Web Client
Configured using the `web` top-level property. Allows specifying settings related to:
* UI port
* Database and caching connection, if different from global settings
* Session max age and secret
* Invite max age
* Connections to CM API instances (if using multiple)
[Full documentation for all web settings can be found in the schema.](https://json-schema.app/view/%23/%23%2Fproperties%2Fweb?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FOperatorConfig.json)
# Example Configurations
## Minimum Config
Below are examples of the minimum required config to run the application using all three config approaches independently.
Using **FILE**
<details>
See [Specify File Location](#specify-file-location) for where this file would be located.
YAML (`config.yaml`)
```yaml
operator:
  name: YourRedditUsername
web:
  credentials:
    clientId: f4b4df1c7b2
    clientSecret: 34v5q1c56ub
    redirectUri: 'http://localhost:8085/callback'
```
JSON (`config.json5`)
```json5
{
"operator": {
"name": "YourRedditUsername"
},
"web": {
"credentials": {
"clientId": "f4b4df1c7b2",
"clientSecret": "34v5q1c56ub",
"redirectUri": "http://localhost:8085/callback"
}
}
}
```
</details>
Using **ENV** (`.env`)
<details>
```
OPERATOR=YourRedditUsername
CLIENT_ID=f4b4df1c7b2
CLIENT_SECRET=34v5q1c56ub
REDIRECT_URI=http://localhost:8085/callback
```
</details>
Using **ARG**
<details>
```
node src/index.js run --clientId=f4b4df1c7b2 --clientSecret=34v5q1c56ub --redirectUri=http://localhost:8085/callback
```
</details>
## Using Config Overrides
An example of using multiple configuration levels together, i.e. all are provided to the application:
**FILE**
<details>
```json
{
"logging":{
"level":"debug"
}
}
```
YAML
```yaml
logging:
  level: debug
```
</details>
**ENV** (`.env`)
<details>
```
CLIENT_SECRET=34v5q1c56ub
SUBREDDITS=sub1,sub2,sub3
PORT=9008
```
</details>
**ARG**
<details>
```
node src/index.js run --subreddits=sub1 --clientId=34v5q1c56ub
```
</details>
When all three are used together they produce these variables at runtime for the application:
```
clientId: f4b4df1c7b2
clientSecret: 34v5q1c56ub
subreddits: sub1
port: 9008
log level: debug
```
## Configuring Client for Many Instances
See the [Architecture Docs](/docs/serverClientArchitecture.md) for more information.
<details>
YAML
```yaml
bots:
  - credentials:
      clientId: f4b4df1c7b2
      clientSecret: 34v5q1c56ub
      refreshToken: 34_f1w1v4
      accessToken: p75_1c467b2
web:
  credentials:
    clientId: f4b4df1c7b2
    clientSecret: 34v5q1c56ub
    redirectUri: 'http://localhost:8085/callback'
  clients:
    # server application running on this same CM instance
    - host: 'localhost:8095'
      secret: localSecret
    # a server application running somewhere else
    - host: 'mySecondContextMod.com:8095'
      secret: anotherSecret
api:
  secret: localSecret
```
JSON
```json5
{
"bots": [
{
"credentials": {
"clientId": "f4b4df1c7b2",
"clientSecret": "34v5q1c56ub",
"refreshToken": "34_f1w1v4",
"accessToken": "p75_1c467b2"
}
}
],
"web": {
"credentials": {
"clientId": "f4b4df1c7b2",
"clientSecret": "34v5q1c56ub",
"redirectUri": "http://localhost:8085/callback"
},
"clients": [
// server application running on this same CM instance
{
"host": "localhost:8095",
"secret": "localSecret"
},
// a server application running somewhere else
{
// api endpoint and port
"host": "mySecondContextMod.com:8095",
"secret": "anotherSecret"
}
]
},
"api": {
"secret": "localSecret",
}
}
```
</details>
# Cache Configuration
See the [Cache Configuration](/docs/operator/caching.md) documentation.
# Database Configuration
See the [Database Configuration](/docs/operator/database.md) documentation.
* **Recorded Events** -- an "audit" of how CM processed an Activity (Comment/Submission) and what actions it took based on the result of processing it (comment, report, remove, etc.)
* Persistent Settings/Data
* Settings like the last known state of a Subreddit's bot before the application exited
* Web Client sessions and invites -- stuff that should survive a restart
* Statistics
* All-time and time-series high-level statistics like # of events, # of checks run, etc...
CM does NOT store subreddit configurations or any runtime alterations of these configurations. This is to keep configurations **portable** -- on principle, if you (a moderator) choose to use a different CM instance to run your subreddit's bot then it should not function differently.
# Providers
CM uses [TypeORM](https://typeorm.io/) as the database access layer and specifically supports three database types:
* SQLite -- using either [SQL.js](https://sql.js.org) or native SQLite through [better-sqlite3](https://github.com/JoshuaWise/better-sqlite3)
* MySQL/MariaDB
* Postgres
The database configuration is specified in the top-level `databaseConfig.connection` property in the operator configuration. EX:
```yaml
operator:
  name: u/MyRedditAccount
databaseConfig:
  connection:
    ...
```
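As an illustrative sketch only: since CM uses TypeORM as its access layer, `connection` is assumed here to accept standard TypeORM connection options. Verify the exact keys against the schema and treat every value as a placeholder.
```yaml
databaseConfig:
  connection:
    type: 'mariadb'        # one of the supported database types
    host: 'localhost'      # database server host
    port: 3306             # database server port
    username: 'cmUser'     # placeholder credentials
    password: 'changeMe'
    database: 'contextmod' # should already exist (see note below)
```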
## SQLite
When using a [local installation](/docs/operator/installation.md#locally) the default database is `sqljs`, which requires no binary dependencies. When using [docker](/docs/operator/installation.md#docker-recommended) the default is `better-sqlite3`.
**NOTE:** It is **NOT RECOMMENDED** to use `sqljs` in a production environment for performance reasons. You should at least switch to `better-sqlite3` or preferably MySql/Postgres.
For both sqlite types, if no database/location is specified, it will be created in the [`DATA_DIR` directory.](/docs/operator/configuration.md#specify-file-location)
If CM detects it cannot **read and write** to the database files, or directory if no files exist, it will fallback to using an in-memory database that will be lost when CM restarts. If you have trouble with r/w permissions and are using docker make sure [file permissions are correct for your mounted volume.](/docs/operator/installation.md#linux-host)
The `database` and `schema` you specify should exist before using CM.
# Migrations
CM implements database migrations. On startup it will check for any pending migrations. If the database does not exist (SQLite), is empty, or no existing tables conflict, it will automatically execute migrations.
If there is any kind of conflict it will **pause startup** and display a prompt in the user interface to confirm migration execution. **You should always backup your database before running migrations.**
To force CM to always run migrations without confirmation set `force` to `true` in the `migrations` property within `databaseConfig`:
```yaml
databaseConfig:
  migrations:
    force: true # always run migrations
```
### SQLite
When using a SQLite driver CM can create automatic backups for you. Another prompt will be displayed on the migrations page in the web interface to make a copy of your database. You can make CM automatically backup and continue with migrations like so:
```yaml
databaseConfig:
  migrations:
    continueOnAutomatedBackup: true # try to backup sqlite files automatically and continue with migrations if successful
```
# Recorded Event Retention
The **Recorded Events** CM stores in the database can be controlled per subreddit. By default events will be stored indefinitely.
A **Retention Policy** is a metric to determine what "range" of Recorded Events CM should keep in the database. It can be either:
* A **number** -- The last X number of Recorded Events will be kept
* EX `1000` -> Keep the last 1000 events and discard any others.
* A **duration** -- A time period, starting from now until X `duration` in the past, for which events will be kept
* EX `3 months` -> Keep all events created between now and 3 months ago. Anything older than 3 months ago will be discarded.
The Retention Policy can be specified at operator level, bot, subreddit *override*, and subreddit configuration level:
```yaml
operator:
  name: u/MyRedditAccount
databaseConfig:
  retention: '3 months' # each subreddit will retain 3 months of recorded events
bots:
  # all subreddits this bot moderates will have 3 month retention
  - name: u/OneBotAccount
    credentials:
      ...
    subreddits:
      overrides:
        - name: aSpecialSubreddit
          databaseConfig:
            # overrides retention for THIS SUBREDDIT ONLY, will retain last 1000 events
            # -- also overrides any retention set in the subreddit's actual configuration
            retention: 1000
  - name: u/TwoBotAccount
    credentials:
      ...
    databaseConfig:
      retention: '1 month' # overrides top-level retention for all subreddits this bot moderates
```
In a subreddit's config:
```yaml
polling:
  - unmoderated
# will retain last 2000 events
# -- will override top-level operator retention or bot-specific retention
# -- will NOT override a subreddit override specified in bot configuration
retention: 2000
runs:
  ...
```
# Influx
ContextMod supports writing detailed time-series data to [InfluxDB](https://www.influxdata.com/).
This data can be used to monitor the overall health, performance, and metrics for a ContextMod server. Currently, this data can **only be used by an Operator** as it requires access to the operator configuration and CM instance.
CM supports InfluxDB OSS > 2.3 or InfluxDB Cloud.
**Note:** This is an **advanced feature** and assumes you have enough technical knowledge to follow the documentation provided by each application to deploy and configure them. No support is guaranteed for installation, configuration, or use of Influx and Grafana.
### InfluxDB OSS
* [Configure InfluxDB using the UI](https://docs.influxdata.com/influxdb/v2.3/install/#set-up-influxdb-through-the-ui)
* You will need **Username**, **Password**, **Organization Name**, and **Bucket Name** later for Grafana setup so make sure to record them somewhere
* [Create a Token](https://docs.influxdata.com/influxdb/v2.3/security/tokens/create-token/) with enough permissions to write/read to the bucket you configured
* After the token is created **view/copy the token** to clipboard by clicking the token name. You will need this for Grafana setup.
### ContextMod
Add the following block to the top-level of your operator configuration:
```yaml
influxConfig:
  credentials:
    url: 'http://localhost:8086' # URL to your influx DB instance
    token: '9RtZ5YZ6bfEXAMPLENJsTSKg==' # token created in the previous step
    org: MyOrg # organization created in the previous step
    bucket: contextmod # name of the bucket created in the previous step
```
## Grafana
A pre-built dashboard for [Grafana](https://grafana.com) can be imported to display overall metrics/stats using InfluxDB data.

* Create a new Data Source using **InfluxDB** type
* Choose **Flux** for the **Query Language**
* Fill in the details for **URL**, **Basic Auth Details** and **InfluxDB Details** using the data you created in the [Influx Setup step](#influxdb-oss)
* Set **Min time interval** to `60s`
* Click **Save and test**
* Import Dashboard
* **Browse** the Dashboard pane
* Click **Import** and **upload** the [grafana dashboard json file](/docs/operator/grafana.json)
* Choose the data source you created from the **InfluxDB CM** dropdown
* Click **Import**
The dashboard can be filtered by **Bots** and **Subreddits** dropdowns at the top of the page to get more specific details.
This getting started guide is for **Operators** -- that is, someone who wants to run the actual software for a ContextMod bot. If you are a **Moderator** check out the [moderator getting started](/docs/subreddit/gettingStarted.md) guide instead.
# Table of Contents
* [Installation](#installation)
* [Create a Reddit Client](#create-a-reddit-client)
* [Start ContextMod](#start-contextmod)
* [Add a Bot to CM](#add-a-bot-to-cm)
* [Access The Dashboard](#access-the-dashboard)
* [What's Next?](#whats-next)
# Installation
Follow the [installation](/docs/operator/installation.md) documentation. It is recommended to use **Docker** since it is self-contained.
# Create a Reddit Client
[Create a reddit client](/docs/operator/README.md#provisioning-a-reddit-client)
# Start ContextMod
Start CM using the example command from your [installation](#installation) and visit http://localhost:8085
The First Time Setup page will ask you to input:
* Client ID (from [Create a Reddit Client](#create-a-reddit-client))
* Client Secret (from [Create a Reddit Client](#create-a-reddit-client))
* Operator -- this is the username of your main Reddit account.
**Write Config** and then restart CM. You have now created the [minimum configuration](/docs/operator/configuration.md#minimum-configuration) required to run CM.
# Add A Bot to CM
You should automatically be directed to the [Bot Invite Helper](/docs/operator/addingBot.md#cm-oauth-helper-recommended) used to authorize and add a Bot to your CM instance.
Follow the directions here and **create an Authorization Invite** at the bottom of the page.
Next, login to Reddit with the account you will be using as the Bot and then visit the **Authorization Invite** link you created. Follow the steps there to finish adding the Bot to your CM instance.
# Access The Dashboard
Congratulations! You should now have a fully authenticated bot running on a ContextMod instance.
In order for your Bot to operate in a subreddit it **must be a moderator in that subreddit.** This may be your own subreddit or someone else's.
To monitor the behavior of bots running on your instance visit http://localhost:8085.
# What's Next?
As an operator you should familiarize yourself with how the [operator configuration](/docs/operator/configuration.md) you made works. This will help you understand how to get the most out of your CM instance by leveraging the [Cache](/docs/operator/caching.md) and [Database](/docs/operator/database.md) effectively, as well as provide you with all possible options for configuring CM using the [schema.](https://json-schema.app/view/%23?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FOperatorConfig.json)
If you are also the moderator of the subreddit the bot will be running in you should check out the [moderator getting started guide.](/docs/subreddit/gettingStarted.md#setup-wiki-page)
You might also be interested in these [quick tips for using the web interface](/docs/webInterface.md). Additionally, on the dashboard click the **Help** button at the top of the page to get a guided tour of the dashboard.
In order to run a ContextMod instance you must first install it somewhere.
ContextMod can be run on almost any operating system but it is recommended to use Docker due to ease of deployment.
## Docker (Recommended)
PROTIP: Using a container management tool like [Portainer.io CE](https://www.portainer.io/products/community-edition) will help with setup/configuration tremendously.
An example of starting the container using the [minimum configuration](/docs/operator/configuration.md#minimum-config):
* Bind the directory where your config file, logs, and database are located on your host machine into the container's default `DATA_DIR` by using `-v /host/path/folder:/config`
* Note: **You must do this** or else your configuration will be lost next time your container is updated.
* Expose the web interface using the container port `8085`
```
docker run -d -v /host/path/folder:/config -p 8085:8085 ghcr.io/foxxmd/context-mod:latest
```
The location of `DATA_DIR` in the container can be changed by passing it as an environmental variable, EX `-e "DATA_DIR=/home/abc/config"`.
### Linux Host
**NOTE:** If you are using [rootless containers with Podman](https://developers.redhat.com/blog/2020/09/25/rootless-containers-with-podman-the-basics#why_podman_) this DOES NOT apply to you.
If you are running Docker on a **Linux Host** you must specify `user:group` permissions of the user who owns the **configuration directory** on the host to avoid [docker file permission problems.](https://ikriv.com/blog/?p=4698) These can be specified using the [environmental variables **PUID** and **PGID**.](https://docs.linuxserver.io/general/understanding-puid-and-pgid)
To get the UID and GID for the current user run `id -u` (UID) and `id -g` (GID) from a terminal.
An example of running CM using the [minimum configuration](/docs/operator/configuration.md#minimum-config) with a [configuration file](/docs/operator/configuration.md#file-configuration-recommended):
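For instance, a plausible invocation combining the Docker command shown earlier with example PUID/PGID values (substitute the values returned by `id -u`/`id -g` and your actual host path):
```
docker run -d -v /host/path/folder:/config -e PUID=1000 -e PGID=1000 -p 8085:8085 ghcr.io/foxxmd/context-mod:latest
```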
## Heroku
This template provides a **web** and **worker** dyno for heroku.
* **Web** -- Will run the bot **and** the web interface for ContextMod.
* **Worker** -- Will run **just** the bot.
Be aware that Heroku's [free dyno plan](https://devcenter.heroku.com/articles/free-dyno-hours#dyno-sleeping) enacts some limits:
* A **Web** dyno will go to sleep (pause) after 30 minutes without web activity -- so your bot will ALSO go to sleep at this time
* The **Worker** dyno **will not** go to sleep but you will NOT be able to access the web interface. You can, however, still see how CM is running by reading the logs for the dyno.
If you want to use a free dyno it is recommended you perform first-time setup (bot authentication and configuration, testing, etc...) with the **Web** dyno, then SWITCH to a **Worker** dyno so it can run 24/7.
# Memory Management
Node exhibits [lazy GC cleanup](https://github.com/FoxxMD/context-mod/issues/90#issuecomment-1190384006) which can result in memory usage for long-running CM instances increasing to unreasonable levels. This problem does not seem to be an issue with CM itself but with Node's GC approach. The increase does not affect CM's performance and, for systems with less memory, Node *should* limit memory usage based on total available memory.
In practice CM uses ~130MB for a single bot, single subreddit setup. Up to ~350MB for many (10+) bots or many (20+) subreddits.
If you need to rein in CM's memory usage for some reason this can be addressed by setting an upper limit for memory usage with `node` args by using either:
**--max_old_space_size=**
Value is megabytes. This sets an explicit limit on GC memory usage.
This is set by default in the [Docker](#docker-recommended) container using the env `NODE_ARGS` to `--max_old_space_size=512`. It can be disabled by overriding the ENV.
**--optimize_for_size**
Tells Node to optimize for (less) memory usage rather than some performance optimizations. This option is not memory size dependent. In practice performance does not seem to be affected and it reduces (but not entirely prevents) memory increases over long periods.
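As a sketch of passing either flag when running CM directly with node (the entrypoint matches the run command shown earlier in this document):
```
node --max_old_space_size=512 --optimize_for_size src/index.js run
```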
ContextMod's high-level functionality is separated into two **independently run** applications.
Each application consists of an [Express](https://expressjs.com/) web server that executes the core logic for that application; the two communicate via HTTP API calls.
Applications:
* **Server** -- Responsible for **running the bots** and providing an API to retrieve information on and interact with them EX start/stop bot, reload config, retrieve operational status, etc.
* **Client** -- Responsible for serving the **web interface** and handling the bot oauth authentication flow between operators and moderators.
Both applications operate independently and can be run individually. The determination for which is run is made by environmental variables, operator config, or cli arguments.
# Authentication
Communication between the applications is secured using [Json Web Tokens](https://github.com/mikenicholson/passport-jwt) signed/encoded by a **shared secret** (HMAC algorithm). The secret is defined in the operator configuration.
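For illustration, this mirrors the shared-secret layout from the multi-instance example earlier in this document (`localSecret` is a placeholder value):
```yaml
api:
  secret: localSecret # secret the Server uses to sign/verify tokens
web:
  clients:
    - host: 'localhost:8095'
      secret: localSecret # must match the secret of the Server this client connects to
```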
# Configuration
## Default Mode
**ContextMod is designed to operate in a "monolith" mode by default.**
This is done by assuming that when configuration is provided by **environmental variables or CLI arguments** the user's intention is to run the client/server together with only one bot, as if ContextMod is a monolith application. When using these configuration types the same values are passed to both the server/client to ensure interoperability/transparent usage for the operator. Some examples of this in the **operator configuration**:
* The **shared secret** for both client/server cannot be defined using env/cli -- at runtime a random string is generated that is set for the value `secret` on both the `api` and `web` properties.
* The `bots` array cannot be defined using env/cli -- a single entry is generated by the configuration parser using the combined values provided from env/cli
* The `PORT` env/cli argument only applies to the `client` web server to guarantee the default port for the `server` web server is used (so the `client` can connect to `server`)
**The end result of this default behavior is that an operator who does not care about running multiple CM instances does not need to know or understand anything about the client/server architecture.**
## Server
To run a ContextMod instance as **server only (headless):**
The relevant sections of the **operator configuration** for the **Server** are:
* [`operator.name`](https://json-schema.app/view/%23/%23%2Fproperties%2Foperator?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FOperatorConfig.json) -- Define the reddit users who will be able to have full access to this server regardless of moderator status
This section is for **reddit moderators**. It covers how to use a CM bot for your subreddit.
If you are trying to run a ContextMod instance (the actual software) please refer to the [operator section](/docs/operator/README.md).
# Table of Contents
* [Overview](#overview)
* [Your Relationship to CM](#your-relationship-to-cm)
* [Operator](#operator)
* [Your Bot](#your-bot)
* [Getting Started](#getting-started)
* [Accessing The Bot](#accessing-the-bot)
* [Editing The Bot](#editing-the-bot)
* [Configuration](#configuration)
* [Guest Access](#guest-access)
# Overview
The Context Mod **software** can manage multiple **bots** (reddit accounts used as bots, like `/u/MyCMBot`). Each bot can manage (run) multiple **subreddits**, determined by which subreddits the account is a moderator of.
You, the moderator of a subreddit a CM bot runs in, can access/manage the Bot using the CM software's [web interface](/docs/images/subredditStatus.jpg) and control its behavior using the [web editor.](/docs/images/editor.jpg)
## Your Relationship to CM
It is important to understand the relationship between you (the moderator), the bot, and the operator (the person running the CM software).
The easiest way to think about this is in relation to how you use Automoderator and interact with Reddit as a moderator. As an analogy:
### Operator
The operator is the person running the actual server/machine the Context Mod software is on.
They are best thought of as **Reddit:**
* Mostly hands-off when it comes to the bot and interacting with your subreddit
* You must interact with Reddit first before you can use automoderator (login, create a subreddit, etc...)
Unlike reddit, though, there is a greater level of trust required between you and the Operator because what you make the Bot do ultimately affects the Operator since they are the ones actually running your Bot and making API calls to reddit.
### Your Bot
Your bot is like an **invite-only version of Automoderator**:
* Unlike automoderator, you **must** interact with the Operator in order to get the bot working. It is not public for anyone to use.
* Like automoderator, you **must** create a [configuration](/docs/subreddit/components/README.md) for it to do anything.
* The bot does not come pre-configured for you. It is a blank slate and requires user input to be useful.
* Also like automoderator, you are **entirely in control of the bot.**
* You can start, stop, and edit its behavior at any time without needing to communicate with the Operator.
* CM provides you _tools_, different ways the Bot can detect patterns in your subreddit/users as well as actions it can take, and you can decide to use them however you want.
* Your bot is **only accessible to moderators of your subreddit.**
# Getting Started
The [Getting Started](/docs/subreddit/gettingStarted.md) guide lays out the steps needed to go from nothing to a working Bot. If you are a moderator new to Context Mod this is where you want to begin.
# Accessing The Bot
All bot management and editing is done through the [web interface.](/docs/images/subredditStatus.jpg) The URL used for accessing this interface is given to you by the **Operator** once they have agreed to host your bot/subreddit.
NOTE: This interface is **only accessible to moderators of your subreddit** and [guests.](#guest-access) You must login to the web interface **with your moderator account** in order to access it.
A **guided tour** that helps show how to manage the bot at a high-level is available on the web interface by clicking the **Help** button in the top-right of the page.
## Editing The Bot
Find the [editor in the web interface](/docs/webInterface.md#editingupdating-your-config) to access the built-in editor for the bot.
[The editor](/docs/images/editor.jpg) should be your all-in-one location for viewing and editing your bot's behavior. **It is equivalent to Automoderator's editor page.**
The editor features:
* syntax validation and highlighting
* configuration auto-complete and documentation (hover over properties)
* built-in validation using Microsoft Word "squiggly lines" indicators and an error list at the bottom of the window
* built-in saving (at the top of the window)
# Configuration
Use the [Configuration Reference](/docs/subreddit/components/README.md) to learn about all the different components available for building a CM configuration.
Additionally, refer to [How It Works](/docs/README.md#how-it-works) and [Core Concepts](/docs/README.md#concepts) to learn the basics of CM configuration.
After you have the basics under your belt you can use the [subreddit-ready example configurations](/docs/subreddit/components/subredditReady) to familiarize yourself with a complete configuration and ways to use CM.
# Guest Access
CM supports **Guest Access**. Reddit users who are given Guest Access to your bot are allowed to access the web interface even though they are not moderators.
Additionally, they can edit the subreddit's config using the bot. If a Guest edits your config their username will be mentioned in the wiki page edit reason.
Guests can do everything a regular mod can except view/add/remove Guests. They can be removed at any time or given an expiration date after which their access is removed.
**Guests are helpful if you are new to CM and know reddit users that can help you get started.**
[Add guests from the Subreddit tab in the main interface.](/docs/images/guests.jpg)
Actions that can submit text (Report, Comment, UserNote, Message, Ban, Submission) will have their `content` values run through a [Mustache Template](https://mustache.github.io/). This means you can insert data generated by Rules into your text before the Action is performed.
See here for a [cheatsheet](https://gist.github.com/FoxxMD/d365707cf99fdb526a504b8b833a5b78) and [here](https://www.tsmean.com/articles/mustache/the-ultimate-mustache-tutorial/) for a more thorough tutorial.
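For example, a Comment Action's `content` might interpolate data from a Rule that ran in the same Check (a sketch reusing the `freekarmasub` Rule shown in the example configurations later in this document):
```yaml
actions:
  - kind: comment
    content: 'This was posted {{rules.freekarmasub.totalCount}} times in karma subs over {{rules.freekarmasub.window}}'
```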
# Template Data
Some data can always be accessed at the top-level. Example
```
This action was run from {{manager}} in Check {{check}}.
The bot intro post is {{botLink}}
Message the moderators of this subreddit using this [compose link]({{modmailLink}})
```

| Variable | Description | Example |
| --- | --- | --- |
| `manager` | The name of the subreddit the bot is running in | mealtimevideos |
| `check` | The name of the Check that was triggered | myCheck |
| `botLink` | A link to the bot introduction | https://www.reddit.com/r/ContextModBot/comments/otz396/introduction_to_contextmodbot |
| `modmailLink` | A link that opens reddit's DM compose with the subject line as the Activity being processed | https://www.reddit.com/message/compose?to=/r/mealtimevideos&message=https://www.reddit.com/r/ContextModBot/comments/otz396/introduction_to_contextmodbot |
## Activity Data
**Activity data can be accessed using the `item` variable.** Example
```
This activity is a {{item.kind}} with {{item.votes}} votes, created {{item.age}} ago.
```
Produces:
> This activity is a submission with 10 votes, created 5 minutes ago.
### Common
All Actions with `content` have access to this data:

| Variable | Description | Example |
| --- | --- | --- |
| `reports` | The number of reports received | 1 |
| `modReports` | The number of reports by moderators | 1 |
| `userReports` | The number of reports by users | 1 |
## Rule Data
### Summary
A summary of what rules were processed and which were triggered, with results, is available using the `ruleSummary` variable. Example:
```
A summary of rules processed for this activity:
{{ruleSummary}}
```
Would produce:
> A summary of rules processed for this activity:
>
> * namedRegexRule - ✘
> * nameAttributionRule - ✓ - 1 Attribution(s) met the threshold of < 20%, with 1 (3%) of 32 Total -- window: 6 months
> * noXPost - ✓ - 1 of 1 unique items repeated <= 3 times, largest repeat: 1
### Individual
Individual **Rules** can be accessed using the name of the rule, **lower-cased, with all spaces/dashes/underscores removed.** Example:
```
Submission was repeated {{rules.noxpost.largestRepeat}} times
```
Produces
> Submission was repeated 7 times
## Action Data
### Summary
A summary of what actions have already been run **when the template is rendered** is available using the `actionSummary` variable. It is therefore important that the Action you want to produce the summary is run **after** any other Actions you want to get a summary for.
Example:
```
A summary of actions processed for this activity, so far:
{{actionSummary}}
```
Would produce:
> A summary of actions processed for this activity, so far:
>
> * approve - ✘ - Item is already approved??
> * lock - ✓
> * modnote - ✓ - (SOLID_CONTRIBUTOR) User is good
### Individual
Individual **Actions** can be accessed using the name of the action, **lower-cased, with all spaces/dashes/underscores removed.** Example:
```
User was banned for {{actions.exampleban.duration}} for {{actions.exampleban.reason}}
```
Produces
> User was banned for 4 days for toxic behavior
# Quick Templating Tutorial
As a quick example for how you will most likely be using templating -- wrapping a variable in curly brackets, `{{variable}}`, will cause the variable value to be rendered instead of the brackets:
```
myVariable = 50;
myOtherVariable = "a text fragment"
template = "This is my template, the variable is {{myVariable}}, my other variable is {{myOtherVariable}}, and that's it!";
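// rendering the template produces the string below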
"This is my template, the variable is 50, my other variable is a text fragment, and that's it!";
```
**Note: When accessing an object or its properties you must use dot notation**
```
const item = {
aProperty: 'something',
anotherObject: {
bProperty: 'something else'
}
}
const content = "My content will render the property {{item.aProperty}} like this, and another nested property {{item.anotherObject.bProperty}} like this."
```
* [Count And Duration](#count-and-duration-examples)
* [Activity Types](#activity-type-examples)
* [Filter](#filter-examples)
# Overview
An **Activity Window** (`window`) is a group of properties that describe a **range** of [**Activities**](/docs/README.md#activity) to retrieve from Reddit and how to **filter** them.
The main components of an Activity Window:
* **Range** -- How many Activities ([`count`](#count)) or what time period ([`duration`](#duration)) of Activities to fetch
* **Type of Activities** -- When **fetching** from an Author's history, should it return overview (any Activities), just Submissions, or just Comments?
* **Filters** -- How the retrieved Activities should be [filtered](/docs/subreddit/components/README.md#filters) before returning them to a Rule
As an example, if you want to run a **Recent Activity Rule** to check if a user has had activity in /r/mealtimevideos you also need to define what range of activities you want to look at from that user's history.
## Lifecycle
Understanding the lifecycle for how CM uses the Activity Window to retrieve Activities is important to effectively using it.
```mermaid
graph TD
RangReq["Determine Range requirements (Parse `count` and `duration` values)"] --> EmptyList>Create empty Activities List]
EmptyList --> Fetch>"Fetch Activities chunk (1 API Call)"]
Fetch --> PreFilter>Invoke `pre` filter on Activities from chunk]
PreFilter --> Add[Add filtered chunk to Activities list]
Add --> CheckRange>Check Range Satisfaction]
CheckRange -->|`count` Range| CountReq[Are there `count` # of Activities?]
CheckRange -->|`duration` Range| DurationReq[Is the oldest Activity `duration` old?]
CheckRange -->|`count` and `duration` Range| CountDurReq[Is either `count` and/or `duration` satisfied?]
CountReq -->|No| Fetch
DurationReq -->|No| Fetch
CountDurReq -->|No| Fetch
CountReq -->|Yes| FetchDone
DurationReq -->|Yes| FetchDone
CountDurReq -->|Yes| FetchDone
FetchDone[Fetch Complete] --> PostFilter>Invoke `post` filter on all Activities]
PostFilter --> Return>Return Activities to Rule]
click PreFilter href "#pre-filter"
click PostFilter href "#post-filter"
click CountReq href "#count"
click DurationReq href "#duration"
click CountDurReq href "#specifying-range"
```
# Range
There are two types of values that can be used when defining a range:
## Count
This is the **number** of activities you want to retrieve. It's straightforward -- if you want to look at the last 100 activities for a user you can use `100` as the value.
### Shorthand
If **count** is the only property you want to define (leaving everything else as default) then `window` can be defined with just this value:
```yaml
window: 70 # retrieve 70 activities
```
Otherwise, use the `count` property on a full `window` object:
```yaml
window:
  count: 70 # retrieve 70 activities
  ...
```
## Duration
A **duration of time** between which all activities will be retrieved. This is a **relative value** that calculates the actual range based on **the duration of time subtracted from when the rule is run.**
For example:
* Today is **July 15th**
* You define a duration of **10 days**
Then the range of activities to be retrieved will be between **July 5th and July 15th** (10 days).
### Shorthand
If a Duration string is the only property you want to define (leaving everything else as default) then `window` can be defined with just this value:
```yaml
window: '9 days' # retrieve last 9 days of activities
```
Otherwise, use the `duration` property on a full `window` object:
```yaml
window:
  duration: '9 days'
  ...
```
### Duration Types
The value used to define the duration can be **one of these two types**:
#### Duration String (recommended)
A string consisting of
* A [Dayjs unit of time](https://day.js.org/docs/en/durations/creating#list-of-all-available-units)
* The value of that unit of time
Examples:
* `9 days`
* `14 hours`
* `80 seconds`
You can ensure your string is valid by testing it [here.](https://regexr.com/61em3)
##### As ISO 8601 duration string
If you're a real nerd you can also use a [standard duration](https://en.wikipedia.org/wiki/ISO_8601#Durations) string.
Examples
* `PT15M` (15 minutes)
Ensure your string is valid by testing it [here.](https://regexr.com/61em9)
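For example, a sketch using an ISO duration string in the `duration` property:
```yaml
window:
  duration: 'PT15M' # equivalent to '15 minutes'
```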
#### Duration Object
If you need to specify multiple units of time for your duration you can instead provide a [Dayjs duration **object**](https://day.js.org/docs/en/durations/creating#list-of-all-available-units) consisting of Dayjs unit-values.
```yaml
window:
  duration:
    days: 4
    hours: 6
    minutes: 20
```
## Specifying Range
You may use **one or both range properties.**
If both range properties are specified then the `satisfyOn` value determines how the final range is calculated.
### Using `satisfyOn: any` (default)
If **any** then Activities will be retrieved until one of the range properties is met, **whichever occurs first.**
```yaml
window:
  count: 80
  duration: '90 days'
  satisfyOn: any
```
Activities are retrieved in chunks of 100 (or `count`, whichever is smaller)
* If 90 days of activities returns only 40 activities => returns 40 activities
* If 80 activities covers only 20 days of range => returns 80 activities
### Using `satisfyOn: all`
If **all** then both ranges must be satisfied. Effectively, whichever range produces the most Activities will be the one that is used.
```yaml
window:
  count: 100
  duration: '90 days'
  satisfyOn: all
```
Activities are retrieved in chunks of 100 (or `count`, whichever is smaller)
* If at 90 days of activities => 40 activities retrieved
* continue retrieving results until 100 activities
* so range is >90 days of activities
* If at 100 activities => 20 days of activities retrieved
* continue retrieving results until 90 days of range
* so results in >100 activities
# Types of Activities
When retrieving an Author's history the window can specify if it should return all Activities (overview), just Comments, or just Submissions, using the `fetch` property:
```yaml
window:
  # defaults to 'overview'
  fetch: 'submission' # must be one of: overview, comment, submission
```
# Filters
Activity Window can also specify [Item and Subreddit Filters](/docs/subreddit/components/README.md#filters) to filter the Activities retrieved from Reddit before they are returned to a Rule.
Activities can be filtered **during** (`pre`) retrieval or **after** (`post`) retrieval. **When**, during the window **lifecycle**, the Activities are filtered can change the set of Activities returned to a Rule drastically.
## Filter Properties
Regardless of when you are filtering Activities the shape of the filter is the same. Filter properties:
* `subreddits` -- A [Filter Shape](/docs/subreddit/components/README.md#filter-shapes) for filtering by the [Subreddit Criteria](/docs/subreddit/components/README.md#subreddit-filter) of each Activity
* `submissionState` -- A [Filter Shape](/docs/subreddit/components/README.md#filter-shapes) for [Submission Criteria](/docs/subreddit/components/README.md#item-filter). Will run only if filtering a Submission.
* `commentState` -- A [Filter Shape](/docs/subreddit/components/README.md#filter-shapes) for [Comment Criteria](/docs/subreddit/components/README.md#item-filter). Will run only if filtering a Comment.
* `activityState` -- A [Filter Shape](/docs/subreddit/components/README.md#filter-shapes) for either [Submission or Comment Criteria](/docs/subreddit/components/README.md#item-filter). Will run only if `submissionState` or `commentState` is not defined for their respective Activity types.
In this example the filter only returns Activities:
* With a subreddit that
* Is from the subreddit /r/mealtimevideos OR
* Has a name that matches the regex `/ask.*/i` (starts with `ask`) OR
* Is marked as NSFW
* AND if the Activity is a Submission:
* must be self text
* AND if the Activity is a Comment (because `activityState` is defined and `commentState` is not):
* Comment is NOT removed
```yaml
subreddits:
  include:
    - 'mealtimevideos'
    - '/ask.*/i'
    - over18: true
submissionState:
  include:
    - is_self: true
activityState:
  exclude:
    - removed: false
```
## Filter Lifecycle
Filters can be independently specified for two different "locations" during the window lifecycle using the `filterOn` property.
### `pre` Filter
`pre` will filter Activities from **each API call, before** they are added to the full list of retrieved Activities and, most importantly, **before CM checks if Range conditions have been satisfied.** See the [Lifecycle Diagram](#lifecycle).
This is useful when you want the Range conditions to be applied to an "already filtered" list of Activities -- as opposed to afterwards (like `post`).
#### `max` restriction
However, `pre` introduces the possibility of **near impossible range conditions.**
For example, if you want 200 activities from a subreddit a user has never created activities in then CM would fetch the user's **entire history** before finishing since each chunk of Activities would be filtered to 0.
To prevent this scenario all `pre` filters must also specify a `max` [range](#range) that tells CM:
> if X range of **non-filtered** Activities is reached stop immediately
#### `pre` Example
Let's follow an example scenario where you want the last 200 activities a user has in /r/mealtimevideos
```yaml
window:
  count: 200
  filterOn:
    pre:
      subreddits:
        include:
          - mealtimevideos
      max: 400
```
* CM fetches the first chunk of 100 Activities (100 total unfiltered)
* `pre` Filter finds 70 of those 100 are from /r/mealtimevideos => Total Filtered Activities 70
* CM Checks range condition => 70 is less than 200
* CM Checks max condition => 100 unfiltered is less than 400
* CM fetches second chunk of 100 Activities (200 total unfiltered)
* `pre` Filter finds another 70 of those 100 are from /r/mealtimevideos => Total Filtered Activities 140
* CM Checks range condition => 140 is less than 200
* CM Checks max condition => 200 unfiltered is less than 400
* CM fetches third chunk of 100 activities (300 total unfiltered)
* `pre` Filter finds 90 of those 100 are from /r/mealtimevideos => Total Filtered Activities 230
* CM checks range condition => 230 is more than 200
* CM Checks max condition => 300 unfiltered is less than 400
**CM is done fetching activities with 230 Filtered Activities**
Now let's look at the same scenario but `max` is hit:
* CM fetches the first chunk of 100 Activities (100 total unfiltered)
* `pre` Filter finds 10 of those 100 are from /r/mealtimevideos => Total Filtered Activities 10
* CM Checks range condition => 10 is less than 200
* CM Checks max condition => 100 unfiltered is less than 400
* CM fetches second chunk of 100 Activities (200 total unfiltered)
* `pre` Filter finds another 15 of those 100 are from /r/mealtimevideos => Total Filtered Activities 25
* CM Checks range condition => 25 is less than 200
* CM Checks max condition => 200 unfiltered is less than 400
* CM fetches third chunk of 100 activities (300 total unfiltered)
* `pre` Filter finds 5 of those 100 are from /r/mealtimevideos => Total Filtered Activities 30
* CM checks range condition => 30 is less than 200
* CM Checks max condition => 300 unfiltered is less than 400
* CM fetches fourth chunk of 100 activities (400 total unfiltered)
* `pre` Filter finds 0 of those 100 are from /r/mealtimevideos => Total Filtered Activities 30
* CM checks range condition => 30 is less than 200
* CM Checks max condition => 400 unfiltered is NOT less than 400
* CM stops fetching due to max condition hit
**CM is done fetching activities with 30 Filtered Activities**
### `post` Filter
`post` will filter the full list of Activities **after range conditions are satisfied**. This also means it receives the list of Activities filtered by the [`pre` filter](#pre-filter), if one is defined.
The list returned by `post` is what the Rule receives.
#### Example
Let's follow an example scenario where you want to get the last 200 activities from a User's history **and then** return any of those 200 that were made in /r/mealtimevideos -- contrast this wording to the [`pre` example](#pre-example)
```yaml
window:
  count: 200
  filterOn:
    post:
      subreddits:
        include:
          - mealtimevideos
```
* CM fetches the first chunk of 100 Activities (100 total unfiltered)
* CM checks range condition => 100 is less than 200
* CM fetches the second chunk of 100 Activities (200 total unfiltered)
* CM checks range condition => 200 is equal to 200
* CM is done fetching activities with 200 unfiltered Activities
* `post` filter finds 10 of those 200 are from /r/mealtimevideos
* CM returns 10 Activities to the Rule
# More Examples
### Count Examples
#### Get last 100 activities in a User's history
```yaml
window: 100
```
or
```yaml
window:
  count: 100
```
### Duration Examples
#### Get last 2 weeks of a User's history
```yaml
window: '2 weeks'
```
or
```yaml
window:
  duration: '2 weeks'
```
#### Get last 24 hours and 30 minutes of User's history
```yaml
window:
  duration:
    hours: 24
    minutes: 30
```
### Count and Duration Examples
#### Get EITHER last 6 months or last 200 activities of User's history
Whichever is [satisfied first](#using-satisfyon-any-default)
```yaml
window:
  count: 200
  duration: '6 months'
```
#### Get AT LEAST the last 6 months and last 200 activities of User's history
Both must be [satisfied](#using-satisfyon-all)
```yaml
window:
  count: 200
  duration: '6 months'
  satisfyOn: 'all'
```
### Activity Type Examples
#### Get the last 200 submissions from User's history
```yaml
window:
  count: 200
  fetch: 'submission'
```
#### Get the last 200 comments from User's history
```yaml
window:
  count: 200
  fetch: 'comment'
```
#### Get the last 200 activities (submissions or comments) from User's history
```yaml
window:
  count: 200
  fetch: 'overview' # or do not include 'fetch' at all, this is the default
```
### Filter Examples
#### Get all activities from NSFW subreddits in the User's last 200 activities
```yaml
window:
  count: 200
  filterOn:
    post:
      subreddits:
        include:
          - over18: true
```
#### Get all comments from NSFW subreddits where user is OP, in the User's last 200 activities
```yaml
window:
  count: 200
  fetch: 'comment'
  filterOn:
    post:
      subreddits:
        include:
          - over18: true
      commentState:
        include:
          - op: true
```
#### Get the last 200 self-text submissions from a User's history and return any from ask* subreddits
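One way this could be expressed, as a sketch combining the `fetch`, `pre`, and `post` options covered above (the `max` value of 400 is an arbitrary choice):
```yaml
window:
  count: 200
  fetch: 'submission'
  filterOn:
    pre:
      # only count self-text submissions toward the 200
      submissionState:
        include:
          - is_self: true
      max: 400
    post:
      # of those 200, return only submissions made in ask* subreddits
      subreddits:
        include:
          - '/ask.*/i'
```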
Context Mod's behavior after a **Check** has been processed can be configured by a user. This allows a subreddit to control exactly what Runs/Checks will be processed based on the outcome (triggered or not) of a Check.
When a Check is finished processing it can be in one of two states:
* **Triggered** -- The **Rules** defined in the Check were **triggered** which caused the **Actions** for the Check to be run
* **Failure** -- The **Rules** defined in the check were **not triggered**, based on the conditions that were set (either from the [Check condition](/docs/README.md#Checks) or [Rule Sets](/docs/subreddit/components/advancedConcepts/README.md#Rule-Sets)), and no **Actions** were run
The behavior CM follows is based on which state it is in. The behavior can be specified **by one or both** of these **state properties** on the Check configuration:
* `postTrigger` -- Specifies what behavior to take when the check is **triggered**
* `postFail` -- Specifies what behavior to take when the check is **not triggered**
## Behavior
There are **four** behaviors CM can take. Both/either **state properties** can be defined with **any behavior.**
### Next
The **Next** behavior tells CM to continue to whatever comes *after the Check that was just processed.* This could be another Check or, if this is the last Check in a Run, the next Run.
NOTE: `next` is the **default behavior** for the `postFail` state
Example
```yaml
- name: MyCheck
  # ...
  postFail: next # if Check is not triggered then CM will start processing AnotherCheck
- name: AnotherCheck
  # ...
```
### Next Run
The **Next Run** behavior tells CM to **skip all remaining Checks in the current Run and start processing the next Run in order.**
NOTE: `nextRun` is the **default behavior** for the `postTrigger` state
Example
```yaml
runs:
  - name: MyFirstRun
    checks:
      - name: MyCheck
        # ...
        postTrigger: nextRun # if Check is triggered then CM will SKIP MySecondCheck and instead start processing MySecondRun
      - name: MySecondCheck
        # ...
  - name: MySecondRun
    checks:
      - name: FooCheck
        # ...
```
### Stop
The **Stop** behavior tells CM to **stop processing the Activity entirely.** This means all remaining Checks and Runs will not be processed.
Example
```yaml
runs:
  - name: MyFirstRun
    checks:
      - name: MyCheck
        # ...
        postTrigger: stop # if Check is triggered CM will NOT process MySecondCheck OR MySecondRun. The activity is "done" being processed at this point
      - name: MySecondCheck
        # ...
  - name: MySecondRun
    checks:
      - name: FooCheck
        # ...
```
### Goto
The **Goto** behavior is an **advanced** behavior that allows you to specify that CM should "jump to" a specific place in your configuration, regardless of order/location, and continue processing the Activity from there. It can be used to do things like:
* create a loop/iteration to have CM re-process the Activity on an earlier executed part of your configuration because a later part modified the Activity (flaired, etc...)
* use a Check as a simplified *switch statement*
**Goto should be used with care.** If you do not fully understand how this mechanism works you should avoid using it, as **most** behaviors can be accomplished using the other behaviors.
As an additional protection **goto depth is limited to 1 by default** which means if a `goto` would be executed more than once during an Activity's lifecycle CM will automatically stop processing that Activity. The `maxGotoDepth` can be raised by the [**Bot Operator**](/docs/operator/gettingStarted.md) per subreddit.
#### Goto Syntax
Location to "jump to" can be specified as:
* **Run** -- `goto:myRunName`
* **Check inside a different Run** -- `goto:myRunName.aCheckInsideTheRun`
* **Check inside the current Run** -- `goto:.myCheck`
Example
```yaml
runs:
  - name: MyFirstRun
    checks:
      - name: FirstCheck
        # ...
      - name: MyCheck
        # ...
        postTrigger: 'goto:MyThirdRun' # jump to the run MyThirdRun
        postFail: 'goto:MySecondRun.BuzzCheck' # jump to the Check BuzzCheck inside the Run MySecondRun
  - name: MySecondRun
    checks:
      - name: FooCheck
        # ...
      - name: BuzzCheck
        # ...
  - name: MyThirdRun
    checks:
      - name: BarCheck
        # ...
```
# Default Behaviors
It is **not required** to define post-Check behavior. CM uses sane defaults to mimic the behavior of automoderator as well as what is "intuitive" when reading a configuration -- that logic flows from top-to-bottom in the order it was defined. For each Check like this:
```yaml
- name: MyCheck
  kind: comment
  rules:
    # ...
  actions:
    # ...
```
`postTrigger` and `postFail` have default behaviors (mentioned in the sections above) that make the Check end up working like this:
```yaml
- name: MyCheck
  kind: comment
  rules:
    # ...
  actions:
    # ...
  postTrigger: nextRun # check is triggered and actions were performed, skip remaining checks and go to the next Run
  postFail: next # check is not triggered and no actions performed, continue to the next check in this Run
```
**So if you are fine with all Checks running in order until one is triggered there is no need to define post-Check behaviors at all.**
## Defining Default Behaviors
Defining `postTrigger` and/or `postFail` on a **Run** will set the default behavior for any **Checks** in the Run that **do not have an explicit behavior set.**
```yaml
runs:
  - name: MyFirstRun
    postTrigger: stop # all Checks without postTrigger defined will have 'stop' as their behavior
    checks:
      - name: FooCheck # postTrigger is 'stop' since it is not defined
        # ...
      - name: BarCheck
        # ...
        postTrigger: next # overrides default behavior
```
# Examples
One **Run** with **default behavior** (no post-Check behavior explicitly defined)
```yaml
# Runs inherit the same filters as checks/rules/actions
# If these filters fail the Run is skipped and CM processes the next run in order
# authorIs:
# itemIs:
# Set the default behavior for check trigger/fail
# postTrigger:
# postFail:
# Defaults can also be set for check authorIs/itemIs
# same as at operator/subreddit level - any defined here will override "higher" defaults
# filterCriteriaDefaults:
checks:
  - name: goodUserFlair
    description: flair user if they have decent history in sub
    kind: submission
    authorIs:
      exclude:
        - flairText: 'Good User'
    rules:
      - kind: recentActivity
        thresholds:
          - threshold: '> 5'
            karma: '> 10'
            subreddits:
              - mySubreddit
    actions:
      - kind: userflair
        text: 'Good User'
    # post-behavior after a check has run. Either the check is TRIGGERED or FAIL
    # there are 4 possible behaviors for each post-behavior type:
    #
    # 'next' => Continue to next check in order
    # 'nextRun' => Exit the current Run (skip all remaining Checks) and go to the next Run in order
    # 'stop' => Exit the current Run and finish activity processing immediately (skip all remaining Runs)
    # 'goto:run[.check]' => Specify a run[.check] to jump to. This can be anywhere in your config. CM will continue to process in order from the specified point.
    #
    # GOTO syntax --
    # 'goto:normalFilters' => go to run "normalFilters"
    # 'goto:normalFilters.myCheck' => go to run "normalFilters" and start at check "myCheck"
    # 'goto:.goodUserFlair' => go to check 'goodUserFlair' IN THE SAME RUN currently processing
    #
    # this means if the check triggers then continue to 'good submission flair'
    postTrigger: next # default is 'nextRun'
    # postFail: # default is 'next'
  - name: good submission flair
    description: flair submission if from good user
    kind: submission
    authorIs:
      include:
        - flairText: 'Good User'
    actions:
      - kind: flair
        text: 'Trusted Source'
      - kind: approve
    # this means if the check is triggered then stop processing the activity entirely
    postTrigger: stop
- name: Determine Suspect
  checks:
    - name: is suspect
      kind: submission
      rules:
        - kind: recentActivity
          thresholds:
            - subreddits:
                - over_18: true
      actions:
        # do some actions
      # if check is triggered then go to run 'suspectFilters'
      postTrigger: 'goto:suspectFilters'
      # if check is not triggered then go to run 'normalFilters'
      postFail: 'goto:normalFilters'
```
"description": "Remove submission because author has self-promo >10% and posted in karma subs recently",
"kind": "submission",
"rules": [
// named rules can be referenced at any point in the configuration (where they occur does not matter)
// and can be used in any Check
// Note: rules do not transfer between subreddit configurations
"freekarmasub",
{
"name": "attr10all",
"kind": "attribution",
"criteria": [
{
"threshold": "> 10%",
"window": "90 days"
},
{
"threshold": "> 10%",
"window": 100
}
],
}
],
"actions": [
{
"kind": "remove"
},
{
"kind": "comment",
"content": "Your submission was removed because you are over reddit's threshold for self-promotion and recently posted this content in a karma sub"
}
]
},
{
"name": "Free Karma On Submission Alert",
"description": "Check if author has posted this submission in 'freekarma' subreddits",
"kind": "submission",
"rules": [
{
// rules can be re-used throughout a configuration by referencing them by name
//
// The rule name itself can only contain spaces, hyphens and underscores
// The value used to reference it will have all of these removed, and lower-cased
//
// so to reference this rule use the value 'freekarmasub'
"name": "Free_Karma-SUB",
"kind": "recentActivity",
"lookAt": "submissions",
"useSubmissionAsReference":true,
"thresholds": [
{
"threshold": ">= 1",
"subreddits": [
"DeFreeKarma",
"FreeKarma4U",
"FreeKarma4You",
"upvote"
]
}
],
"window": "7 days"
}
],
"actions": [
{
"kind": "report",
"content": "Submission posted {{rules.freekarmasub.totalCount}} times in karma {{rules.freekarmasub.subCount}} subs over {{rules.freekarmasub.window}}: {{rules.freekarmasub.subSummary}}"
The **Attribution** rule will aggregate an Author's content Attribution (youtube channels, twitter, website domains, etc.) and can check on their totals or percentages of all Activities over a time period:
* Total # of attributions
* As percentage of all Activity or only Submissions
* Look at all domains or only media (youtube, vimeo, etc.)
* Include self posts (by reddit domain) or not
Consult the [schema](https://json-schema.app/view/%23/%23%2Fdefinitions%2FCheckJson/%23%2Fdefinitions%2FAttributionJSONConfig?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FApp.json) for a complete reference of the rule's properties.
### Examples
* Self Promotion as percentage of all Activities [YAML](/docs/subreddit/components/attribution/redditSelfPromoAll.yaml) | [JSON](/docs/subreddit/components/attribution/redditSelfPromoAll.json5) - Check if any of Author's aggregated submission origins are >10% of all their activities
* Self Promotion as percentage of Submissions [YAML](/docs/subreddit/components/attribution/redditSelfPromoSubmissionsOnly.yaml) | [JSON](/docs/subreddit/components/attribution/redditSelfPromoSubmissionsOnly.json5) - Check if any of Author's aggregated submission origins are >10% of their submissions
The **Author** rule triggers if any [AuthorCriteria](https://json-schema.app/view/%23%2Fdefinitions%2FAuthorCriteria?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FApp.json) from a list are either **included** or **excluded**, depending on which property you put them in.
**AuthorCriteria** that can be checked:
* name (u/userName)
* author's subreddit flair text
* author's subreddit flair css
* author's subreddit mod status
* [Toolbox User Notes](/docs/subreddit/components/userNotes)
The Author **Rule** is best used in conjunction with other Rules to short-circuit a Check based on who the Author is. It is easier to use a Rule to do this than to write **author filters** for every Rule (and makes Rules more re-usable).
Consult the [schema](https://json-schema.app/view/%23%2Fdefinitions%2FAuthorRuleJSONConfig?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FApp.json) for a complete reference of the rule's properties.
### Examples
* Basic examples
* Flair new user Submission [YAML](/docs/subreddit/components/author/flairNewUserSubmission.yaml) | [JSON](/docs/subreddit/components/author/flairNewUserSubmission.json5) - If the Author does not have the `vet` flair then flair the Submission with `New User`
* Flair vetted user Submission [YAML](/docs/subreddit/components/author/flairNewUserSubmission.yaml) | [JSON](/docs/subreddit/components/author/flairNewUserSubmission.json5) - If the Author does have the `vet` flair then flair the Submission with `Vetted`
* Used with other Rules
* Ignore vetted user [YAML](/docs/subreddit/components/author/flairNewUserSubmission.yaml) | [JSON](/docs/subreddit/components/author/flairNewUserSubmission.json5) - Short-circuit the Check if the Author has the `vet` flair
## Filter
All **Rules** and **Checks** have an optional `authorIs` property that takes an [AuthorOptions](https://json-schema.app/view/%23%2Fdefinitions%2FAuthorOptions?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FApp.json) object.
**This property works the same as the Author Rule except that:**
* On **Rules** if all criteria fail the Rule is **skipped.**
* If a Rule is skipped **it does not fail or pass** and so does not affect the outcome of the Check.
* However, if all Rules on a Check are skipped the Check will fail.
* On **Checks** if all criteria fail the Check **fails**.
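For illustration, a hedged sketch of `authorIs` on a Check. The `isMod` criteria name is an assumption -- see the AuthorOptions schema above for exact property names.
```yaml
checks:
  - name: exampleCheck
    kind: submission
    authorIs:
      exclude:
        # assumed criteria name -- if the Author is a moderator this criteria fails and the Check fails
        - isMod: true
    rules:
      # ...
    actions:
      # ...
```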
### Examples
* Skip recent activity check based on author [YAML](/docs/subreddit/components/author/authorFilter.yaml) | [JSON](/docs/subreddit/components/author/authorFilter.json5) - Skip a Recent Activity check for a set of subreddits if the Author of the Submission has any set of flairs.
## Flair users and submissions
Flair users and submissions based on certain keywords from submitter's profile.
Consult [User Flair schema](https://json-schema.app/view/%23%2Fdefinitions%2FUserFlairActionJson?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FApp.json) and [Submission Flair schema](https://json-schema.app/view/%23%2Fdefinitions%2FFlairActionJson?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FApp.json) for a complete reference of the rule's properties.
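A hedged sketch of the two flair actions follows. The `kind` identifiers (`userflair`, `flair`) and the `text` property are assumptions -- consult the schemas above for the exact shapes.
```yaml
actions:
  # flair the Author (User Flair action) -- kind/property names assumed
  - kind: userflair
    text: 'OF Account'
  # flair the Submission (Submission Flair action) -- kind/property names assumed
  - kind: flair
    text: 'OnlyFans'
```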
### Examples
* OnlyFans submissions [YAML](/docs/subreddit/components/author/onlyfansFlair.yaml) | [JSON](/docs/subreddit/components/author/onlyfansFlair.json5) - Check whether submitter has typical OF keywords in their profile and flair both author + submission accordingly.
The **History** rule can check an Author's submission/comment statistics over a time period:
* Submission total or percentage of All Activity
* Comment total or percentage of all Activity
* Comments made as OP (commented in their own Submission) total or percentage of all Comments
* Ratio of activities against another window of activities
Consult the [schema](https://json-schema.app/view/%23%2Fdefinitions%2FHistoryJSONConfig?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FApp.json) for a complete reference of the rule's properties.
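As a rough sketch, a History criteria testing comment engagement could look like the following. The `comment` property name is an assumption for illustration; consult the schema above and the linked examples for the real shape.
```yaml
rules:
  - kind: history
    criteria:
      # assumed property name -- comments make up less than 10% of the Author's last 100 activities
      - window: 100
        comment: '< 10%'
```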
## Ratio
Use the `ratio` property in Criteria to test the [number of activities](/docs/subreddit/activitiesWindow.md) found in the parent criteria against the number of activities from _another_ [activity window](/docs/subreddit/activitiesWindow.md) defined in the ratio.
Example:
```yaml
- kind: history
  criteria:
    # "parent" criteria, returns all activities, in the last 100 from user's history, that occurred in r/mealtimevideos
    - window:
        count: 100
        filterOn:
          post:
            subreddits:
              include:
                - mealtimevideos
      ratio:
        # "ratio" criteria, returns all activities, in the last 100 from user's history, that occurred in r/redditdev
        window:
          count: 100
          filterOn:
            post:
              subreddits:
                include:
                  - redditdev
        # test (number of parent criteria activities) / (number of ratio criteria activities)
        threshold: '> 1.2'
```
`threshold` may be a number or a percentage (`number * 100`)
* EX `> 1.2` => there are more than 1.2 activities from the parent criteria for every 1 ratio activity
* EX `<= 75%` => there are 0.75 or fewer activities from the parent criteria for every 1 ratio activity
### Examples
* Low Comment Engagement [YAML](/docs/subreddit/components/history/lowEngagement.yaml) | [JSON](/docs/subreddit/components/history/lowEngagement.json5) - Check if Author is submitting much more than they comment.
* OP Comment Engagement [YAML](/docs/subreddit/components/history/opOnlyEngagement.yaml) | [JSON](/docs/subreddit/components/history/opOnlyEngagement.json5) - Check if Author is mostly engaging only in their own content
[moderatehatespeech.com](https://moderatehatespeech.com/) (MHS) is a [non-profit initiative](https://moderatehatespeech.com/about/) to identify and fight toxic and hateful content online using programmatic technology such as machine learning models.
They offer a [toxic content prediction model](https://moderatehatespeech.com/framework/) specifically trained on and for [reddit content](https://www.reddit.com/r/redditdev/comments/xdscbo/updated_bot_backed_by_moderationoriented_ml_for/) as well as partnering [directly with subreddits](https://moderatehatespeech.com/research/subreddit-program/).
ContextMod leverages their [API](https://moderatehatespeech.com/docs/) for toxic content predictions in the **MHS Rule.**
The **MHS Rule** sends an Activity's content (title or body) to MHS which returns a prediction on whether the content is toxic and actionable by a moderator.
## MHS Predictions
MHS's toxic content predictions return two indicators about the content it analyzed. Both are available as test conditions in ContextMod.
### Flagged
MHS returns a straight "Toxic or Normal" **flag** based on how it classifies the content.
Example
* `Normal` - "I love those pineapples"
* `Toxic` - "why are we having all these people from shithole countries coming here"
### Confidence
MHS returns how **confident** it is of the flag classification on a scale of 0 to 100.
Example
"why are we having all these people from shithole countries coming here"
* Flag = `Toxic`
* Confidence = `97.12` -> The model is 97% confident the content is `Toxic`
# Usage
**An MHS API Key is required to use this Rule**. An API Key can be acquired, for free, by creating an account at [moderatehatespeech.com](https://moderatehatespeech.com).
The Key can be provided by the bot's Operator in the [bot config credentials](https://json-schema.app/view/%23/%23%2Fdefinitions%2FBotInstanceJsonConfig/%23%2Fdefinitions%2FBotCredentialsJsonConfig?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fedge%2Fsrc%2FSchema%2FOperatorConfig.json) or in the subreddit's config in the top-level `credentials` property like this:
```yaml
credentials:
  mhs:
    apiKey: 'myMHSApiKey'

# the rest of your config below
polling:
  # ...
runs:
  # ...
```
### Minimal/Default Config
ContextMod provides a reasonable default configuration for the MHS Rule if you do not wish to configure it yourself. The default configuration will trigger the rule based on these criteria:

| Property | Type | Description | Default |
|---|---|---|---|
| `flagged` | boolean | Test whether content is flagged as toxic (true) or normal (false) | `true` |
| `confidence` | string | Comparison against a number 0 to 100 representing how confident MHS is in the prediction | `>= 90` |
| `testOn` | array | Which parts of the Activity to send to MHS. Options: `title` and/or `body` | `body` |
Example
```yaml
rules:
  - kind: mhs
    criteria:
      flagged: true # triggers if MHS flags the content as toxic AND
      confidence: '> 66' # MHS is more than 66% confident in its prediction
      testOn: # send the body of the activity to the MHS prediction service
        - body
```
#### Historical Matching
Like the [Sentiment](/docs/subreddit/components/sentiment#historical) and [Regex](/docs/subreddit/components/regex#historical) rules, CM can also use MHS predictions to check content from the Author's history.
Example
```yaml
rules:
  - kind: mhs
    # ...same config as above but can include below...
    historical:
      mustMatchCurrent: true # if true then CM will not check author's history unless current Activity matches MHS prediction criteria
      totalMatching: '> 1' # comparison for how many activities in history must match to trigger the rule
      window: 10 # specify the range of activities to check in author's history
      criteria: # ... if specified, overrides parent-level criteria
```
# Examples
Report if MHS flags as toxic
```yaml
rules:
  - kind: mhs
actions:
  - kind: report
    content: 'MHS flagged => {{rules.mhs.summary}}'
```
Report if MHS flags as toxic with 95% confidence
```yaml
rules:
  - kind: mhs
    confidence: '>= 95'
actions:
  - kind: report
    content: 'MHS flagged => {{rules.mhs.summary}}'
```
Report if MHS flags as toxic and at least 3 recent activities in last 10 from author's history are also toxic
```yaml
rules:
  - kind: mhs
    historical:
      window: 10
      mustMatchCurrent: true
      totalMatching: '>= 3'
actions:
  - kind: report
    content: 'MHS flagged => {{rules.mhs.summary}}'
```
Approve if MHS flags as NOT toxic with 95% confidence
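A sketch of what this could look like, using the documented `flagged`/`confidence` criteria and assuming an `approve` action kind:
```yaml
rules:
  - kind: mhs
    criteria:
      flagged: false      # MHS classifies the content as Normal (not toxic) AND
      confidence: '>= 95' # MHS is at least 95% confident in that classification
actions:
  - kind: approve # assumed action kind
```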
[Mod Notes](https://www.reddit.com/r/modnews/comments/t8vafc/announcing_mod_notes/) is a feature for New Reddit that allows moderators to add short, categorizable notes to Users of their subreddit, optionally associating the note with a submission/comment the User made. They are inspired by [Toolbox User Notes](https://www.reddit.com/r/toolbox/wiki/docs/usernotes), which are also [supported by ContextMod](/docs/subreddit/components/userNotes). Reddit's **Mod Notes** also combine [Moderation Log](https://mods.reddithelp.com/hc/en-us/articles/360022402312-Moderation-Log) actions (**Mod Actions**) for the selected User alongside moderator notes, enabling a full "overview" of moderator interactions with a User in their subreddit.
ContextMod supports adding **Mod Notes** to an Author using an [Action](/docs/subreddit/components/README.md#mod-note) and using **Mod Actions/Mod Notes** as criteria in an [Author Filter](/docs/subreddit/components/README.md#author-filter).
* `type` must be one of the [valid note labels](https://www.reddit.com/dev/api#POST_api_mod_notes):
  * BOT_BAN
  * PERMA_BAN
  * BAN
  * ABUSE_WARNING
  * SPAM_WARNING
  * SPAM_WATCH
  * SOLID_CONTRIBUTOR
  * HELPFUL_USER
```yaml
actions:
  - kind: modnote
    type: SPAM_WATCH
    content: 'a note only mods can see message' # optional
    referenceActivity: true # boolean. If true the Note will be linked to the Activity being processed
```
# Mod Action Filter
ContextMod can use **Mod Actions** (from moderation log) and **Mod Notes** in an [Author Filter](/docs/subreddit/components/README.md#author-filter).
## API Usage
Notes/Actions are **not** included in the data Reddit returns for either an Author or an Activity. This means that, in most cases, ContextMod is required to make **one additional API call to Reddit during Activity processing** if Notes/Actions are used as part of an **Author Filter**.
The impact of this additional call is greatest when the Author Filter is used as part of a **Comment Check** or running for **every Activity** such as part of a Run. Take this example:
**No Mod Action filtering**
* CM makes 1 API call to return new comments, finds 10 new comments across 6 users
* Processing each comment, with no other filters, requires 0 additional calls
* At the end of processing 10 comments, CM has used a total of 1 API call
**Mod Action filtering used**
* CM makes 1 API call to return new comments, finds 10 new comments across 6 users
* Processing each comment, with a mod action filter, requires 1 additional API call per user
* At the end of processing 10 comments, CM has used a total of **7 API calls** (1 for polling + 1 per unique user)
### When To Use?
In general, **do not** use Mod Actions in a Filter if:
* The filter is on a [**Comment** Check](/docs/subreddit/components/README.md#checks) and your subreddit has a high volume of Comments
* The filter is on a [Run](/docs/subreddit/components/README.md#runs) and your subreddit has a high volume of Activities
If you need Mod Notes-like functionality for a high volume subreddit consider using [Toolbox UserNotes](/docs/subreddit/components/userNotes) instead.
In general, **do** use Mod Actions in a Filter if:
* The filter is on a [**Submission** Check](/docs/subreddit/components/README.md#checks)
* The filter is part of an [Author **Rule**](/docs/subreddit/components/README.md#author) that is processed as **late as possible in the rule order for a Check**
* Your subreddit has a low volume of Activities (less than 100 combined submissions/comments in a 10 minute period, for example)
* The filter is on an Action
## Usage and Examples
Filtering by Mod Actions/Notes in an Author Filter is done using the `modActions` property:
```yaml
age: '> 1 month'
# ...
modActions:
  - ...
```
There are two valid shapes for the Mod Action criteria: [ModLogCriteria](https://json-schema.app/view/%23%2Fdefinitions%2FModLogCriteria?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Freddit-context-bot%2Fedge%2Fsrc%2FSchema%2FApp.json) and [ModNoteCriteria](https://json-schema.app/view/%23%2Fdefinitions%2FModNoteCriteria?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Freddit-context-bot%2Fedge%2Fsrc%2FSchema%2FApp.json).
### ModLogCriteria
Used for filtering by **Moderation Log** actions *and/or general notes*.
* `activityType` -- Optional. If the Mod Action is associated with an activity, specify the activity type here. A list or one of:
  * `submission`
  * `comment`
* `type` -- Optional. The type of Mod Log Action. A list or one of:
  * `INVITE`
  * `NOTE`
  * `REMOVAL`
  * `SPAM`
  * `APPROVAL`
* `description` -- additional mod log description (string) to filter by -- not documented by reddit. Can be a string or regex string-like `/.* test/i`
* `details` -- additional mod log details (string) to filter by -- not documented by reddit. Can be a string or regex string-like `/.* test/i`
```yaml
activityType: submission
type:
  - REMOVAL
  - SPAM
search: total
count: '> 3 in 1 week'
```
### ModNoteCriteria
Inherits `activityType` from ModLogCriteria. If either of the below properties is included on the criteria then any other ModLogCriteria-specific properties are **ignored**.
* `note` -- the contents of the note to match against. Can be one of, or a list of, strings/regex string-like `/.* test/i`
* `noteType` -- If specified by the note, the note type (see [Mod Note Action](#mod-note-action) type). Can be one of, or a list of, strings/regex string-like `/.* test/i`
```yaml
noteType: SOLID_CONTRIBUTOR
search: total
count: '> 3 in 1 week'
```
### Examples
Author has more than 2 submission approvals in the last month
```yaml
type: APPROVAL
activityType: submission
search: total
count: '> 2 in 1 month'
```
Author has at least 1 BAN note
```yaml
noteType: BAN
search: total
count: '>= 1'
```
Author has at least 3 notes which include the words "self" and "promotion" in the last month
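A sketch using a regex string-like value for `note` (the expression below assumes the two words appear in that order within the note text):
```yaml
note: '/self.*promotion/i'
search: total
count: '>= 3 in 1 month'
```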
Given a list of subreddit criteria, the **Recent Activity** rule finds Activities matching those criteria in the Author's history over a [window](#activities-window) and then allows comparing different facets of the results.
Subreddit criteria can be:
* names
* regular expression for names
* [Subreddit meta properties](https://json-schema.app/view/%23/%23%2Fdefinitions%2FSubmissionCheckJson/%23%2Fdefinitions%2FRecentActivityRuleJSONConfig/%23%2Fdefinitions%2FActivityThreshold/%23%2Fdefinitions%2FSubredditState?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Freddit-context-bot%2Fmaster%2Fsrc%2FSchema%2FApp.json) like NSFW, description, is user profile, is author's profile, etc...
Facets available to compare from analyzed history:
* number of activities found EX `> 3` => more than 3 activities found
* aggregated karma from activities EX `> 50` => more than 50 combined karma from found activities
* number of subreddits found EX `> 5` => more than 5 distinct subreddits matching subreddit criteria found
The above can also be expressed as a percentage instead of a number, IE "more than 10% of activities in the Author's history come from subreddits matching the criteria"
The search can also be modified in a number of ways:
* Filter found activities using an [Item Filter](#item)
* Only return activities that match the Activity from the Event being processed
* Using image detection (pixel or perceptual hash matching)
* Only return certain types of activities (only submission or only comments)
Consult the [schema](https://json-schema.app/view/%23%2Fdefinitions%2FRecentActivityRuleJSONConfig?url=https%3A%2F%2Fraw.githubusercontent.com%2FFoxxMD%2Fcontext-mod%2Fmaster%2Fsrc%2FSchema%2FApp.json) for a complete reference of the rule's properties.
### Examples
* Free Karma Subreddits [YAML](/docs/subreddit/components/recentActivity/freeKarma.yaml) | [JSON](/docs/subreddit/components/recentActivity/freeKarma.json5) - Check if the Author has recently posted in any "free karma" subreddits
* Submission in Free Karma Subreddits [YAML](/docs/subreddit/components/recentActivity/freeKarmaOnSubmission.yaml) | [JSON](/docs/subreddit/components/recentActivity/freeKarmaOnSubmission.json5) - Check if the Author has posted the Submission this check is running on in any "free karma" subreddits recently
"description": "Check if author has posted in 'freekarma' subreddits",
// check will run on a new submission in your subreddit and look at the Author of that submission
"kind": "submission",
"rules": [
{
"name": "freekarma",
"kind": "recentActivity",
"useSubmissionAsReference": false,
// when `lookAt` is not present this rule will look for submissions and comments
// lookAt: "submissions"
// lookAt: "comments"
"thresholds": [
{
// for all subreddits, if the number of activities (sub/comment) is equal to or greater than 1 then the rule is triggered
"threshold": ">= 1",
"subreddits": [
"DeFreeKarma",
"FreeKarma4U",
"FreeKarma4You",
"upvote"
]
}
],
// will look at all of the Author's activities in the last 7 days
"window": "7 days"
}
],
"actions": [
{
"kind": "report",
"content": "{{rules.freekarma.totalCount}} activities in karma {{rules.freekarma.subCount}} subs over {{rules.freekarma.window}}: {{rules.freekarma.subSummary}}"
"description": "Check if author has posted this submission in 'freekarma' subreddits",
// check will run on a new submission in your subreddit and look at the Author of that submission
"kind": "submission",
"rules": [
{
"name": "freekarmasub",
"kind": "recentActivity",
// rule will only look at Author's submissions in these subreddits
"lookAt": "submissions",
// rule will only look at Author's submissions in these subreddits that have the same content (link) as the submission this event was made on
// In simpler terms -- rule will only check to see if the same link the author just posted is also posted in these subreddits
"useSubmissionAsReference":true,
"thresholds": [
{
// for all subreddits, if the number of activities (sub/comment) is equal to or greater than 1 then the rule is triggered
"threshold": ">= 1",
"subreddits": [
"DeFreeKarma",
"FreeKarma4U",
"FreeKarma4You",
"upvote"
]
}
],
// look at all of the Author's submissions in the last 7 days
"window": "7 days"
}
],
"actions": [
{
"kind": "report",
"content": "Submission posted {{rules.freekarmasub.totalCount}} times in karma {{rules.freekarmasub.subCount}} subs over {{rules.freekarmasub.window}}: {{rules.freekarmasub.subSummary}}"
The **Regex** rule matches on text content from a comment or submission in the same way automod uses regex. The rule, however, provides additional functionality automod does not:
* Can set the **number** of matches that trigger the rule (`matchThreshold`)
These can then be used in conjunction with a [`window`](https://github.com/FoxxMD/context-mod/blob/master/docs/activitiesWindow.md) to match against activities from the history of the Author of the Activity being checked (including the Activity being checked):
* Can set the **number of Activities** that meet the `matchThreshold` to trigger the rule (`activityMatchThreshold`)
* Can set the **number of total matches** across all Activities to trigger the rule (`totalMatchThreshold`)
* Can set the **type of Activities** to check (`lookAt`)
* When an Activity is a Submission can **specify which parts of the Submission to match against** IE title, body, and/or url (`testOn`)
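For orientation, a rough sketch of a Regex rule using the thresholds described above. The `criteria`/`regex` property names are assumptions -- see the linked examples below for exact configs.
```yaml
rules:
  - kind: regex
    criteria:
      # assumed property name for the expression to test
      - regex: '/discord\.gg/i'
        window: 10                  # check the Author's last 10 activities
        totalMatchThreshold: '>= 5' # trigger if there are 5 or more matches across them
```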
### Examples
* Trigger if regex matches against the current activity - [YAML](/docs/subreddit/components/regex/matchAnyCurrentActivity.yaml) | [JSON](/docs/subreddit/components/regex/matchAnyCurrentActivity.json5)
* Trigger if regex matches 5 times against the current activity - [YAML](/docs/subreddit/components/regex/matchThresholdCurrentActivity.yaml) | [JSON](/docs/subreddit/components/regex/matchThresholdCurrentActivity.json5)
* Trigger if regex matches against any part of a Submission - [YAML](/docs/subreddit/components/regex/matchSubmissionParts.yaml) | [JSON](/docs/subreddit/components/regex/matchSubmissionParts.json5)
* Trigger if regex matches any of Author's last 10 activities - [YAML](/docs/subreddit/components/regex/matchHistoryActivity.yaml) | [JSON](/docs/subreddit/components/regex/matchHistoryActivity.json5)
* Trigger if regex matches at least 3 of Author's last 10 activities - [YAML](/docs/subreddit/components/regex/matchActivityThresholdHistory.yaml) | [JSON](/docs/subreddit/components/regex/matchActivityThresholdHistory.json5)
* Trigger if there are 5 regex matches in the Author's last 10 activities - [YAML](/docs/subreddit/components/regex/matchTotalHistoryActivity.yaml) | [JSON](/docs/subreddit/components/regex/matchTotalHistoryActivity.json5)
* Trigger if there are 5 regex matches in the Author's last 10 comments - [YAML](/docs/subreddit/components/regex/matchSubsetHistoryActivity.yaml) | [JSON](/docs/subreddit/components/regex/matchSubsetHistoryActivity.json5)
* Remove comments that are spamming discord links - [YAML](/docs/subreddit/components/regex/removeDiscordSpam.yaml) | [JSON](/docs/subreddit/components/regex/removeDiscordSpam.json5)
  * Differs from just using automod because this config allows one-off/organic links from users who DO NOT spam discord links, but will still remove the comment if the user is spamming them
For example, a criteria that triggers when the current activity has more than 5 matches:
```yaml
# triggers if current activity has greater than 5 matches
matchThreshold: '> 5'
```