* fix correct indentation in UOp pretty_print for repeated references
When a UOp was referenced multiple times, the walrus operator notation
(e.g., x0:=) was correctly used for the first occurrence, but subsequent
references had misaligned indentation due to an extra space character.
Fix indentation misalignment in pretty_print() when UOps are referenced
multiple times.
* add simple unit tests for UOp repr
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
* qwen model is mostly working
* add Q4_K quantization support to GGUF parser, add qwen3:1.7b model
- Add Q4_K (type 12) dequantization in nn/state.py
- Add qwen3:1.7b model using Q4_K_M quantization (smaller than Q8_0)
- Make bos_token_id optional for models like Qwen3 that don't have it
- Fix line length issues and add preset parameter to SimpleTokenizer
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* smaller diff
* test dequant
* half split
* better
* simple tok
* mock token
* polish
* better
* fix
* replace
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* schedule: cache unbinds for consistent cache keys
strip BIND values before computing cache key so different bound values
(e.g. KV cache positions) hit the same schedule cache entry.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* spec: allow single-src BIND for schedule cache key normalization
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: add lessons learned to CLAUDE.md
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* more claude.md
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* schedule: cache unbinds for consistent cache keys
different bound variable values (e.g. kv cache positions) now produce
the same schedule cache key by unbinding BIND(DEFINE_VAR, CONST) before
computing the cache key and rebinding after lookup.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* schedule: cache unbinds for consistent cache keys
When scheduling, BIND(DEFINE_VAR, CONST) nodes are now unbound to
tagged DEFINE_VARs before computing the cache key. This ensures that
the same computation with different bound values (e.g., different
KV cache positions in LLM) gets the same cache key and reuses the
cached schedule.
The fix:
- pm_pre_sched_cache: replaces BIND with tagged DEFINE_VAR
- pm_post_sched_cache: restores tagged DEFINE_VAR back to original BIND
- pm_remove_rangeify_tags: excludes DEFINE_VAR to preserve tags through rangeify
- var_vals extracted from BINDs before cache key computation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* schedule: fix BIND handling and add CLAUDE.md
- Handle BIND to RANGE in create_schedule (not matched by CONST pattern)
- Assert all BINDs on same variable have same value
- Add CLAUDE.md codebase guide
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* llm: add created/model fields, non-streaming support, and tests
- Add `created` timestamp and `model` fields to response (required by OpenAI spec)
- Add non-streaming mode support for /v1/chat/completions
- Add `send_data` helper to HTTPRequestHandler for responses with Content-Length
- Refactor viz/serve.py to use send_data
- Add integration tests using real OpenAI client
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* add openai to testing
* toml
* Remove 'openai' from dependencies
Removed 'openai' from the dependencies list.
* bump cache
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* start work on schedule cache
* local unique
* schedule cache works
* schedule cache cleanup
* fix tests
* preserve metadata
* oops, fix cache
* put that there
* fix spec
* always miss
* why is that broken?
* src[0].op
* fix process replay
* delete abstractions2
* reenable the actual schedule cache
* metadata is best effort
* fix JIT in examples/gradaccum_mnist.py
* full jit
* fixed and test is real
* more work parsing SQTT
* more minimal runner
* sep VIZ/PROFILE
* parse print new
* improve parser
* more filter
* that
* split them
* lil cleanup
* skip flaky test
* AQL in mmapeak