Vector Sets deserialization was not designed to resist corrupted data, on the assumption that a valid checksum means everything is fine. However, Redis allows the user to request extra protection via a specific configuration option. This commit makes the implementation more resistant, at the cost of some slowdown.

This also fixes an unrelated serialization bug (with no memory corruption effects): the worst neighbor index / distance was not serialized, which could lower the quality of a graph after links are replaced. I'll address the serialization issues in a new PR focused on that aspect alone (already work in progress).

The net result is that loading vector sets is, while the serialization of the worst index/distance is missing (always, for now), 100% slower, that is, 2 times the loading time we had before. Once that information is added, loading will be just 10-15% slower, i.e. only the cost of the new sanity checks.

It may be worth exporting to modules whether the advanced sanity checks are needed or not. Anyway, most of the slowdown in this patch comes from having to recompute the worst neighbor, since duplicated and non-reciprocal link detection was already heavily optimized with probabilistic algorithms.

---------

Co-authored-by: debing.sun <debing.sun@redis.com>