% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/cache-disk.R
\name{diskCache}
\alias{diskCache}
\title{Create a disk cache object}
\usage{
diskCache(dir = NULL, max_size = 10 * 1024^2, max_age = Inf,
  max_n = Inf, evict = c("lru", "fifo"), destroy_on_finalize = NULL,
  missing = key_missing())
}
\arguments{
\item{dir}{Directory in which to store the cache files. If \code{NULL} (the
default), a temporary directory is created and used.}

\item{max_size}{Maximum size of the cache, in bytes. If the cache exceeds
this size, cached objects will be removed according to the value of
\code{evict}.}

\item{max_age}{Maximum age of files in the cache before they are evicted, in
seconds.}

\item{max_n}{Maximum number of objects in the cache. If the number of objects
exceeds this value, cached objects will be removed according to the value of
\code{evict}.}

\item{evict}{The eviction policy used to decide which objects are removed
when the cache is pruned. Currently, \code{"lru"} and \code{"fifo"} are
supported.}

\item{destroy_on_finalize}{If \code{TRUE}, then when the DiskCache object is
garbage collected, the cache directory and all objects inside it will be
deleted from disk. If \code{FALSE}, nothing happens when the object is
finalized. If \code{NULL} (the default), the behavior depends on the value of
\code{dir}: If \code{destroy_on_finalize=NULL} and \code{dir=NULL}, then a
temporary directory will be created and used for the cache, and it will be
deleted when the DiskCache is finalized. If \code{destroy_on_finalize=NULL}
and \code{dir} is \emph{not} \code{NULL}, then the directory will not be
deleted when the DiskCache is finalized.
In short, when \code{destroy_on_finalize=NULL}, a cache directory that was
created automatically is also deleted automatically, and a cache directory
that was not created automatically is not deleted automatically.}

\item{missing}{A value to return, or a quoted expression to evaluate, when
\code{get()} is called but the key is not present in the cache. The default
is a \code{\link{key_missing}} object. See the section Missing keys for more
information.}
}
\description{
A disk cache object is a key-value store that saves the values as files in a
directory on disk. Objects are stored with the \code{set()} method and
retrieved with the \code{get()} method. Objects are automatically pruned from
the cache according to the parameters \code{max_size}, \code{max_age},
\code{max_n}, and \code{evict}.
}
\section{Missing keys}{

The \code{missing} parameter controls what happens when \code{get()} is
called with a key that is not in the cache (a cache miss). The default
behavior is to return a \code{\link{key_missing}} object. This is a
\emph{sentinel value} representing a missing key. You can test whether the
returned value represents a missing key by using the
\code{\link{is.key_missing}} function. You can also have \code{get()} return
a different sentinel value, like \code{NULL}, or even throw an error on a
cache miss.

When the cache is created, you can supply a value for \code{missing}, which
sets the default value to be returned for missing keys. It can also be
overridden when \code{get()} is called, by supplying a \code{missing}
argument, as in \code{cache$get("mykey", missing = NULL)}.

If your cache is configured so that \code{get()} returns a sentinel value to
represent a cache miss, then \code{set()} will not allow you to store that
sentinel value in the cache. It will throw an error if you attempt to do so.

If \code{missing} is a quoted expression, then that expression will be
evaluated each time \code{get()} encounters a missing key.
If the evaluation of the expression does not throw an error, then
\code{get()} will return the resulting value. However, it is more common for
the expression to throw an error, in which case \code{get()} does not return
a value. For example, you could use \code{quote(stop("Missing key"))}. If you
use this, the code that calls \code{get()} should be wrapped with
\code{\link{tryCatch}()} to gracefully handle missing keys.
}

\section{Cache pruning}{

Cache pruning occurs each time \code{get()} or \code{set()} is called, or it
can be invoked manually by calling \code{prune()}.

If there are any objects that are older than \code{max_age}, they will be
removed when a pruning occurs.

The \code{max_size} and \code{max_n} parameters are applied to the cache as a
whole, in contrast to \code{max_age}, which is applied to each object
individually.

If the number of objects in the cache exceeds \code{max_n}, then objects will
be removed from the cache according to the eviction policy, which is set with
the \code{evict} parameter. Objects will be removed so that the number of
items is \code{max_n}.

If the size of the objects in the cache exceeds \code{max_size}, then objects
will be removed from the cache so that the total size remains under
\code{max_size}. Note that the size is calculated using the size of the
files, not the amount of disk space used by the files. These two values can
differ because files are stored in blocks on disk. For example, if the block
size is 4096 bytes, then a file that is one byte in size will take 4096 bytes
on disk.
}

\section{Eviction policies}{

If \code{max_n} or \code{max_size} is used, then objects will be removed from
the cache according to an eviction policy. The available eviction policies
are:

\describe{
  \item{\code{"lru"}}{
    Least Recently Used. The least recently used objects will be removed.
    This uses the filesystem's atime property.
    Some filesystems do not support atime, or have a very low atime
    resolution. The DiskCache will check for atime support, and if the
    filesystem does not support atime, a warning will be issued and the
    \code{"fifo"} policy will be used instead.
  }
  \item{\code{"fifo"}}{
    First-in-first-out. The oldest objects will be removed.
  }
}
}

\section{Sharing among multiple processes}{

The directory for a DiskCache can be shared among multiple R processes. To do
this, each R process should have a DiskCache object that uses the same
directory. Each DiskCache will do pruning independently of the others, so if
they have different pruning parameters, then one DiskCache may remove cached
objects before another DiskCache would do so.

When multiple processes share a cache directory, there are some potential
race conditions. For example, if your code calls \code{exists(key)} to check
whether an object is in the cache, and then calls \code{get(key)}, the object
may be removed from the cache in between those two calls. In that case
\code{get(key)} reports a cache miss: it returns the \code{missing} value or,
if \code{missing} is an expression that throws, raises an error. Instead of
calling the two functions, it is better to simply call \code{get(key)} and
handle the miss directly, either by testing the result with
\code{is.key_missing()} or by using \code{tryCatch()} if misses are
configured to throw an error. This effectively tests for existence and gets
the object in one operation.

It is also possible for one process to prune objects at the same time that
another process is trying to prune objects. If this happens, you may see a
warning from \code{file.remove()} failing to remove a file that has already
been deleted.
}

\section{Methods}{

A disk cache object has the following methods:

\describe{
  \item{\code{get(key, missing)}}{
    Returns the value associated with \code{key}. If the key is not in the
    cache, then it returns the value specified by \code{missing}. The default
    value for \code{missing} is set when the DiskCache object is created, but
    it can be overridden when \code{get()} is called.
  }
  \item{\code{set(key, value)}}{
    Stores the \code{key}-\code{value} pair in the cache.
  }
  \item{\code{exists(key)}}{
    Returns \code{TRUE} if the cache contains the key, otherwise
    \code{FALSE}.
  }
  \item{\code{size()}}{
    Returns the number of items currently in the cache.
  }
  \item{\code{keys()}}{
    Returns a character vector of all keys currently in the cache.
  }
  \item{\code{reset()}}{
    Clears all objects from the cache.
  }
  \item{\code{destroy()}}{
    Clears all objects in the cache, and removes the cache directory from
    disk.
  }
  \item{\code{prune()}}{
    Prunes the cache, using the parameters specified by \code{max_size},
    \code{max_age}, \code{max_n}, and \code{evict}.
  }
}
}
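
\section{Example}{

The following is a minimal sketch of how the methods and the \code{missing}
parameter described in this topic might be used together. It assumes that the
package exporting \code{diskCache()}, \code{key_missing()}, and
\code{is.key_missing()} is attached. The keys (\code{"a"}, \code{"b"},
\code{"c"}, \code{"nope"}) and the variable names are illustrative only, and
the sketch is not part of the package's own examples.

\preformatted{
# Assumption: the package that exports diskCache() is already attached.
# Create a cache in a temporary directory that holds at most two objects.
cache <- diskCache(max_n = 2, evict = "fifo")

# Store a value and read it back.
cache$set("a", 1:3)
cache$get("a")

# Default cache miss: get() returns a key_missing() sentinel.
miss <- cache$get("nope")
is.key_missing(miss)

# Override the miss value for a single call.
cache$get("nope", missing = NULL)

# Treat a miss as an error, and handle it with tryCatch().
value <- tryCatch(
  cache$get("nope", missing = quote(stop("Missing key"))),
  error = function(e) NULL
)

# Adding more than max_n objects makes older ones eligible for eviction
# the next time the cache is pruned.
cache$set("b", 2)
cache$set("c", 3)
cache$prune()
cache$keys()

# Remove all objects and the cache directory itself.
cache$destroy()
}

Calling \code{get()} once and inspecting the result, rather than calling
\code{exists()} first, also avoids the race condition described in the
section on sharing a cache among multiple processes.
}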