`lpCompare()` in `quicklistCompare()` will call `lpGet()` again, which
would be a waste.
The change will result in a boost for all commands that use
`quicklistCompre()`, including `linsert`, `lpos` and `lrem`.
This PR adds a new section to the `INFO` command output, called
`keysizes`. This section provides detailed statistics on the
distribution of key sizes for each data type (strings, lists, sets,
hashes and zsets) within the dataset. The distribution is tracked using
a base-2 logarithmic histogram.
# Motivation
Currently, Redis lacks a built-in feature to track key sizes and item
sizes per data type at a granular level. Understanding the distribution
of key sizes is critical for monitoring memory usage and optimizing
performance, particularly in large datasets. This enhancement will allow
users to inspect the size distribution of keys directly from the `INFO`
command, assisting with performance analysis and capacity planning.
# Changes
New Section in `INFO` Command: A new section called `keysizes` has been
added to the `INFO` command output. This section reports a per-database,
per-type histogram of key sizes. It provides insights into how many keys
fall into specific size ranges (represented in powers of 2).
**Example output:**
```
127.0.0.1:6379> INFO keysizes
# Keysizes
db0_distrib_strings_sizes:1=19,2=655,512=100899,1K=31,2K=29,4K=23,8K=16,16K=3,32K=2
db0_distrib_lists_items:1=5784492,32=3558,64=1047,128=676,256=533,512=218,4K=1,8K=42
db0_distrib_sets_items:1=735564=50612,8=21462,64=1365,128=974,2K=292,4K=154,8K=89,
db0_distrib_hashes_items:2=1,4=544,32=141169,64=207329,128=4349,256=136226,1K=1
```
## Future Use Cases:
The key size distribution is collected per slot as well, laying the
groundwork for future enhancements related to Redis Cluster.
The crash happens whenever the user isn't accessible, for example, it
isn't set for the context (when it is temporary) or in some other cases
like `notifyKeyspaceEvent`. To properly check for the ACL compliance, we
need to get the user name and the user to invoke other APIs. However, it
is not possible if it crashes, and it is impossible to work that around
in the code since we don't know (and **shouldn't know**!) when it is
available and when it is not.
From 7.4, Redis allows `GET` options in cluster mode when the pattern maps to
the same slot as the key, but GET # pattern that represents key itself is missed.
This commit resolves it, bug report #13607.
---------
Co-authored-by: Yuan Wang <yuan.wang@redis.com>
Nowadays popcnt instruction is almost supported by X86 machine, which is
used to calculate "Hamming weight", it can bring much performance boost
in redis bitcount comand.
---------
Signed-off-by: hanhui365(hanhui@hygon.cn)
Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: oranagra <oran@redislabs.com>
Co-authored-by: Nugine <nugine@foxmail.com>
After running test in local, there will be a file named
`.rediscli_history_test`, and it is not in `.gitignore` file, so this is
considered to have changed the code base. It is a little annoying, this
commit just clean up the temporary file.
We should delete `.rediscli_history_test` in the end since the second
server tests also write somethings into it, to make it corresponding, i
put `set ::env(REDISCLI_HISTFILE) ".rediscli_history_test"` at the
beginning.
Maybe we also can add this file into `.gitignore`
- Add a new 'EXPERIMENTAL' command flag, which causes the command
generator to skip over it and make the command to be unavailable for
execution
- Skip experimental tests by default
- Move the SFLUSH tests from the old framework to the new one
---------
Co-authored-by: YaacovHazan <yaacov.hazan@redislabs.com>
1. `dbRandomKey`: excessive call to `dbFindExpires` (will always return
1 if `allvolatile` + anyway called inside `expireIfNeeded`
2. Add `deleteKeyAndPropagate` that is used by both expiry/eviction
3. Change the order of calls in `expireIfNeeded` to save redundant calls
to `keyIsExpired`
4. `expireIfNeeded`: move `OBJ_STATIC_REFCOUNT` to
`deleteKeyAndPropagate`
5. `performEvictions` now uses `deleteEvictedKeyAndPropagate`
6. active-expire: moved `postExecutionUnitOperations` inside
`activeExpireCycleTryExpire`
7. `activeExpireCycleTryExpire`: less indentation + expire a key if `now
== t`
8. rename `lazy_expire_disabled` to `allow_access_expired`
This PR introduces a new `SFLUSH` command to cluster mode that allows
partial flushing of nodes based on specified slot ranges. Current
implementation is designed to flush all slots of a shard, but future
extensions could allow for more granular flushing.
**Command Usage:**
`SFLUSH <start-slot> <end-slot> [<start-slot> <end-slot>]* [SYNC|ASYNC]`
This command removes all data from the specified slots, either
synchronously or asynchronously depending on the optional SYNC/ASYNC
argument.
**Functionality:**
Current imp of `SFLUSH` command verifies that the provided slot ranges
are valid and cover all of the node's slots before proceeding. If slots
are partially or incorrectly specified, the command will fail and return
an error, ensuring that all slots of a node must be fully covered for
the flush to proceed.
The function supports both synchronous (default) and asynchronous
flushing. In addition, if possible, SFLUSH SYNC will be run as blocking
ASYNC as an optimization.
This PR is based on valkey-io/valkey#829
Previously, ZUNION and ZUNIONSTORE commands used a temporary accumulator dict
and at the end copied it as-is to dstzset->dict. This PR removes accumulator and directly
stores into dstzset->dict, eliminating the extra copy.
Co-authored-by: Rayacoo zisong.cw@alibaba-inc.com
Test 1 - give more time for expiration
Test 2 - Evaluate expiration time boundaries [+1,+2] before setting expiration [+1]
Test 3 - Avoid race on test HFEs propagated to replica
The PR extends `RedisModule_OpenKey`'s flags to include
`REDISMODULE_OPEN_KEY_ACCESS_EXPIRED`, which allows to access expired
keys.
It also allows to access expired subkeys. Currently relevant only for
hash fields
and has its impact on `RM_HashGet` and `RM_Scan`.
This PR introduces the installation of the `musl`-based version of Rust,
in order to support alpine-based runtime environments (Rust is used by
[RedisJSON](https://github.com/RedisJSON/RedisJSON)).
Instead of adding runtime logic to decide which prefix/shared object to
use when doing the reply we can simply use an inline method to avoid runtime
overhead of condition checks, and also keep the code change small.
Preliminary data show improvements on commands that heavily rely on
bulk/mbulk replies (example of LRANGE).
---------
Co-authored-by: debing.sun <debing.sun@redis.com>
Fixes#8825
We're using the fast_float library[1] in our (compiled-in)
floating-point fast_float_strtod implementation for faster and more
portable parsing of 64 decimal strings.
The single file fast_float.h is an amalgamation of the entire library,
which can be (re)generated with the amalgamate.py script (from the
fast_float repository) via the command:
```
python3 ./script/amalgamate.py --license=MIT > $REDIS_SRC/deps/fast_float/fast_float.h
```
[1]: https://github.com/fastfloat/fast_float
The used commit from fast_float library was the one from
https://github.com/fastfloat/fast_float/releases/tag/v3.10.1
---------
Co-authored-by: fcostaoliveira <filipe@redis.com>
Similar to #13530 , applied to HSCAN and ZSCAN in case of listpack
encoding.
**Preliminary benchmark results showcase an improvement of 108% on the
achievable ops/sec for ZSCAN and 65% for HSCAN**.
---------
Co-authored-by: debing.sun <debing.sun@redis.com>
in https://github.com/redis/redis/pull/13519, when `eb` is empty,
`isRax` is not correctly initialized to 0, which can lead to `ebStop()`
potentially entering the wrong rax branch.
On SSCAN, in case of listpack and intset encoding we actually reply the
entire set, and always reply with the cursor 0.
For those cases, we don't need to accumulate the replies in a list and
can completely avoid the overhead of list appending and then iterating
over the list again -- meaning we do N iterations instead of 2N
iterations over the SET and save intermediate memory as well.
Preliminary benchmarks, `SSCAN set:100 0`, showcased an improvement of
60% as visible bellow on a SET with 100 string elements (listpack
encoded).
If the hash previously had HFEs (hash-fields with expiration) but later no longer
does, the key ref in the hash might become outdated after a MOVE, COPY,
RENAME or RESTORE operation. These commands maintain the key ref only
if HFEs are present. That is, we can only be sure that key ref is valid as long as the
hash has HFEs.
When a client in no-touch mode issues a TOUCH command on a key, the
key’s access time should be updated, but in scripts, and module's
RM_Call, it isn’t updated.
Command proc should be matched to the executing client, not the current
client.
Co-authored-by: Udi Ron <udi@speedb.io>
PR #13428 doesn't fully resolve an issue where corruption errors can
still occur on loading of cluster.nodes file - seen on upgrade where
there were no shard_ids (from old Redis), 7.2.5 loading generated new
random ones, and persisted them to the file before gossip/handshake
could propagate the correct ones (or some other nodes unreachable).
This results in a primary/replica having differing shard_id in the
cluster.nodes and then the server cannot startup - reports corruption.
This PR builds on #13428 by simply ignoring the replica's shard_id in
cluster.nodes (if it exists), and uses the replica's primary's shard_id.
Additional handling was necessary to cover the case where the replica
appears before the primary in cluster.nodes, where it will first use a
generated shard_id for the primary, and then correct after it loads the
primary cluster.nodes entry.
---------
Co-authored-by: debing.sun <debing.sun@redis.com>
Found by @oranagra
Currently, when the size of dict becomes 1, we do not check whether
`delta` is positive or negative.
As a result, `non_empty_dicts` is still incremented when the size of
dict changes from 2 to 1.
We should only increment `non_empty_dicts` when `delta` is positive, as
this indicates the first time an element is inserted into the dict.
---------
Co-authored-by: oranagra <oran@redislabs.com>
This is a very easy optimization, that avoids duplicate computation of
the object length for LREM, LPOS, LINSERT na LINDEX.
We can see that sdslen takes 7.7% of the total CPU cycles of the
benchmarks.
Function Stack | CPU Time: Total | CPU Time: Self | Module | Function
(Full) | Source File | Start Address
-- | -- | -- | -- | -- | -- | --
listTypeEqual | 15.50% | 2.346s | redis-server | listTypeEqual |
t_list.c | 0x845dd
sdslen | 7.70% | 2.300s | redis-server | sdslen | sds.h | 0x845e4
Preliminary data showcases 4% improvement in the achievable ops/sec of
LPOS in string elements, and 2% in int elements.
A new BUILD_WITH_MODULES flag was added to the Makefile to control
building the module directory.
The new module directory includes a general Makefile that iterates
over each module, fetch a specific version, and build it.
Co-authored-by: YaacovHazan <yaacov.hazan@redislabs.com>
#13495 introduced a change to reply -LOADING while flushing existing db on a replica. Some of our tests are
sensitive to this change and do no expect -LOADING reply.
Fixing a couple of tests that fail time to time.
This PR is based on the commits from PR
https://github.com/valkey-io/valkey/pull/258,
https://github.com/valkey-io/valkey/pull/593,
https://github.com/valkey-io/valkey/pull/639
This PR optimizes client query buffer handling in Redis by introducing
a reusable query buffer that is used by default for client reads. This
reduces memory usage by ~20KB per client by avoiding allocations for
most clients using short (<16KB) complete commands. For larger or
partial commands, the client still gets its own private buffer.
The primary changes are:
* Adding a reusable query buffer `thread_shared_qb` that clients use by
default.
* Modifying client querybuf initialization and reset logic.
* Freeing idle client query buffers when empty to allow reuse of the
reusable query buffer.
* Master client query buffers are kept private as their contents need to
be preserved for replication stream.
* When nested commands is executed, only the first user uses the reuse
buffer, and subsequent users will still use the private buffer.
In addition to the memory savings, this change shows a 3% improvement in
latency and throughput when running with 1000 active clients.
The memory reduction may also help reduce the need to evict clients when
reaching max memory limit, as the query buffer is the main memory
consumer per client.
This PR is different from https://github.com/valkey-io/valkey/pull/258
1. When a client is in the mid of requiring a reused buffer and
returning it, regardless of whether the query buffer has changed
(expanded), we do not update the reused query buffer in the middle, but
return the reused query buffer (expanded or with data remaining) or
reset it at the end.
2. Adding a new thread variable `thread_shared_qb_used` to avoid
multiple clients requiring the reusable query buffer at the same time.
---------
Signed-off-by: Uri Yagelnik <uriy@amazon.com>
Signed-off-by: Madelyn Olson <matolson@amazon.com>
Co-authored-by: Uri Yagelnik <uriy@amazon.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: oranagra <oran@redislabs.com>
After https://github.com/redis/redis/pull/13499, If the length set by
`addReplySetLen()` does not match the actual number of elements in the
reply, it will cause protocol broken and result in the client hanging.
RM_RdbLoad() disables AOF temporarily while loading RDB.
Later, it does not enable it back as it checks AOF state (disabled by then)
rather than AOF config parameter.
Added a change to restart AOF according to config parameter.
- Avoid addReplyLongLong (which converts back to string) the value we
already have as a robj, by using addReplyProto + addReply
- Avoid doing dbFind Twice for the same dictEntry on
INCR*/DECR*/SETRANGE/APPEND commands.
- Avoid multiple sdslen calls with the same input on setrangeCommand and
appendCommand
- Introduce setKeyWithDictEntry, which is like setKey(), but accepts an
optional dictEntry input: Avoids the second dictFind in SET command
---------
Co-authored-by: debing.sun <debing.sun@redis.com>
In #13279 (found by @filipecosta90), for custom lookups, we introduce a
comparison function for `lpFind()` to compare entry, but it also
introduces some overhead.
To avoid the overhead of function pointer calls:
1. Extract the lpFindCb() method into a lpFindCbInternal() method that
is easier to inline.
2. Use unlikely to annotate the comparison method, as can only success
once.
---------
Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
All the defrag allocations API expects to get a value and replace it, leaving the old value untouchable.
In some cases a value might be shared between multiple keys, in such cases we can not simply replace
it when the defrag callback is called.
To allow support such use cases, the PR adds two new API's to the defrag API:
1. `RM_DefragAllocRaw` - allocate memory base on a given size.
2. `RM_DefragFreeRaw` - Free the given pointer.
Those API's avoid using tcache so they operate just like `RM_DefragAlloc` but allows the user to split
the allocation and the memory free operations into two stages and control when those happen.
In addition the PR adds new API to allow the module to receive notifications when defrag start and end: `RM_RegisterDefragCallbacks`
Those callbacks are the same as `RM_RegisterDefragFunc` but promised to be called and the start
and the end of the defrag process.