mirror of
https://github.com/tinygrad/tinygrad.git
synced 2026-01-09 15:08:02 -05:00
qwen model is working (#13690)
* qwen model is mostly working

* add Q4_K quantization support to GGUF parser, add qwen3:1.7b model

- Add Q4_K (type 12) dequantization in nn/state.py
- Add qwen3:1.7b model using Q4_K_M quantization (smaller than Q8_0)
- Make bos_token_id optional for models like Qwen3 that don't have it
- Fix line length issues and add preset parameter to SimpleTokenizer

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* smaller diff
* test dequant
* half split
* better
* simple tok
* mock token
* polish
* better
* fix
* replace

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
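
For readers unfamiliar with the Q4_K format named in the commit message, here is a minimal pure-Python sketch of dequantizing a single super-block. It assumes the block layout used by the ggml reference implementation (256 values in 144 bytes: fp16 `d`, fp16 `dmin`, 12 bytes of packed 6-bit scales and mins, 128 bytes of 4-bit quants). It is not the code added to nn/state.py, and the function names are illustrative only.

```python
import struct

QK_K = 256          # values per Q4_K super-block (assumed, per ggml)
BLOCK_BYTES = 144   # 2 (d) + 2 (dmin) + 12 (scales) + 128 (4-bit quants)

def unpack_scale_min(j, scales):
    """Return the 6-bit (scale, min) pair for sub-block j in 0..7."""
    if j < 4:
        return scales[j] & 63, scales[j + 4] & 63
    # For j >= 4 the low 4 bits live in scales[j + 4]; the high 2 bits are
    # borrowed from the top of scales[j - 4] (scale) and scales[j] (min).
    sc = (scales[j + 4] & 0x0F) | ((scales[j - 4] >> 6) << 4)
    mn = (scales[j + 4] >> 4) | ((scales[j] >> 6) << 4)
    return sc, mn

def dequantize_q4_k_block(block):
    """Dequantize one 144-byte Q4_K super-block into 256 Python floats."""
    assert len(block) == BLOCK_BYTES
    d, dmin = struct.unpack_from("<2e", block, 0)  # two little-endian fp16
    scales = block[4:16]
    qs = block[16:144]
    out = []
    for pair in range(4):
        # Each 32-byte chunk stores two 32-value sub-blocks:
        # low nibbles first, then high nibbles.
        chunk = qs[32 * pair: 32 * pair + 32]
        sc_lo, mn_lo = unpack_scale_min(2 * pair, scales)
        sc_hi, mn_hi = unpack_scale_min(2 * pair + 1, scales)
        out.extend(d * sc_lo * (b & 0x0F) - dmin * mn_lo for b in chunk)
        out.extend(d * sc_hi * (b >> 4) - dmin * mn_hi for b in chunk)
    return out
```

Each value is reconstructed as `d * scale * q - dmin * min`, with one scale and one min per 32-value sub-block. A tensor-based implementation would vectorize the same arithmetic across all blocks at once.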
@@ -10,6 +10,7 @@ class TestLLMServer(unittest.TestCase):
cls.mock_tok.role = Mock(return_value=[100, 101])
cls.mock_tok.encode = Mock(return_value=[200, 201, 202])
cls.mock_tok.decode = Mock(return_value="Hello")
cls.mock_tok.end_turn = Mock(return_value=[998])

cls.mock_model = Mock()
cls.mock_model.generate = Mock(side_effect=lambda ids, **kwargs: iter([300, 301, 999]))
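
The `generate` mock above uses `side_effect` with a lambda rather than `return_value`. A small standalone sketch of why that matters: `side_effect` is invoked on every call and so yields a fresh iterator each time, whereas `return_value=iter(...)` would return the same iterator object, already exhausted after the first use. The `temperature` keyword below is a hypothetical argument used only to show that `**kwargs` absorbs extra parameters.

```python
from unittest.mock import Mock

mock_model = Mock()
# side_effect runs on each call, so every call gets a new iterator
mock_model.generate = Mock(side_effect=lambda ids, **kwargs: iter([300, 301, 999]))

first = list(mock_model.generate([200, 201, 202]))
second = list(mock_model.generate([200, 201, 202], temperature=0.0))  # hypothetical kwarg

# Contrast: return_value hands back the same iterator object on every call
stale = Mock(return_value=iter([300, 301, 999]))
stale_first = list(stale([200, 201, 202]))
stale_second = list(stale([200, 201, 202]))  # iterator is already consumed
```

With this setup, tests that call `generate` more than once each see the full token sequence.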