* fix correct indentation in UOp pretty_print for repeated references
When a UOp was referenced multiple times, the walrus operator notation
(e.g., x0:=) was correctly used for the first occurrence, but subsequent
references had misaligned indentation due to an extra space character.
Fix indentation misalignment in pretty_print() when UOps are referenced
multiple times.
* add simple unit tests for UOp repr
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
* write python emulator from RDNA3 psuedocode in pdf
* emu2
* more emu
* working
* more psueod
* progress
* cleanups
* delete junk
* delete stale files
* just emu
* work
* emu compare
* bemu
* cleanups and more failures
* revert bench emu
* fix emu cmp
* four tests fail
* bugfixes
* dsl
* ext
* refactor
* dsl
* div scale fix
* test_emu
* fix emu tests
* pcode
* test pcode
* top imports
* fix test_emu to use run_asm
* emu tests on real hardware
* more tests
* more emu tests
* more
* work
* work
* bug fix
* bugfixes
* fix fp16 gemm
* all ops tests pass in emulator
* fix llvm tests
* fix a few more tests
* fix mockgpu timeout
`UOp.render()` crashed with `IndexError: tuple index out of range` when
the UOp graph contained a `VECTORIZE` with empty `src=()`. This occurs
when reshaping to scalar shape `()`, e.g., `Tensor.ones(4).sum()`.
The bug was in the renderer's VECTORIZE pattern: `all_same(())` returns
`True` (vacuous truth), causing the code to access `x.src[0]` on an
empty tuple.
- Fix `IndexError` when calling `UOp.render()` on graphs containing
empty `VECTORIZE` nodes.
- Add test for empty `VECTORIZE` rendering.
* having fun with python asm dsl
* rdna3
* meh
* all in rdna3
* work
* more work
* work
* integration
* tests
* simpler
* simpler
* asm
* better
* simpler
* progress
* emu
* simpler
* emu
* tests
* types
* vopd
* cleaups
* work
* memory ranges
* add tracing
* refactors
* run_asm exit
* more readable
* compare to remu
* test gemm
* bug + stale
* more tests
* refactor
* tests fix
* more ins
* more instructions
* refactor
* faster
* match case
* match case
* simpler
* work
* tests
* run_asm
* work
* bug fixes
* more emu
* alu/emu
* refactor
* no pipeline emu yet
* alu direct
* fix
* bugfixes + new test
* fix exceptions in emulators
* update gen.py
* pylint
* no pdf
* improve bench_emu
* speedups
* cleanups
* more tests
* remove ExecItem and merge it with ScheduleItem
* less diff
* fix issues
* min diff
* don't change bufs in _lower
* min diff
* update
* revert
* fixes
* diff
* support symbolic shapes in split/chunk when split dim is concrete
Previously split() and chunk() required all dimensions to be concrete.
Now they only require the dimension being split to be concrete, allowing
them to work with tensors that have symbolic shapes in other dimensions.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* update CLAUDE.md: add pre-commit and no-amend rules
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* fix dim resolution order in split/chunk
Ensure dim_sz is retrieved after dim is resolved, not before.
The previous one-liner evaluated self.shape[dim] with the original
unresolved dim value.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* qwen model is mostly working
* add Q4_K quantization support to GGUF parser, add qwen3:1.7b model
- Add Q4_K (type 12) dequantization in nn/state.py
- Add qwen3:1.7b model using Q4_K_M quantization (smaller than Q8_0)
- Make bos_token_id optional for models like Qwen3 that don't have it
- Fix line length issues and add preset parameter to SimpleTokenizer
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* smaller diff
* test dequant
* half split
* better
* simple tok
* mock token
* polish
* better
* fix
* replace
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* schedule: cache unbinds for consistent cache keys
strip BIND values before computing cache key so different bound values
(e.g. KV cache positions) hit the same schedule cache entry.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* spec: allow single-src BIND for schedule cache key normalization
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: add lessons learned to CLAUDE.md
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* more claude.md
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>