Add ArcticInference doc (#9492)

Graham Neubig
2025-07-01 14:15:13 -04:00
committed by GitHub
parent 6da7e051be
commit e05e627957

@@ -175,6 +175,10 @@ vllm serve mistralai/Devstral-Small-2505 \
--enable-prefix-caching
```
If you want to improve inference speed further, you can also try Snowflake's version
of vLLM, [ArcticInference](https://www.snowflake.com/en/engineering-blog/fast-speculative-decoding-vllm-arctic/),
which can achieve up to a 2x speedup in some cases.
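As a minimal sketch of what trying it might look like, assuming ArcticInference ships as a pip package that plugs into vLLM (the package name is an assumption taken from the ArcticInference project, not from this doc; check its repository for current install instructions):
```
# Assumed install command -- verify against the ArcticInference repo.
pip install "arctic-inference[vllm]"

# ArcticInference hooks into vLLM as a plugin, so the usual serve
# command should keep working unchanged:
vllm serve mistralai/Devstral-Small-2505 \
    --enable-prefix-caching
```
Because the speedup comes from speculative decoding (per the linked blog post), the actual gain depends on the model and workload, which is why the doc hedges with "in some cases".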
### Run OpenHands (Alternative Backends)
#### Using Docker