chenyu
14d1c5fdfd
assign fusion tests on detach and contiguous_backward (#15092)
2026-03-02 15:21:51 -05:00
nimlgen
dfa180413d
tbgpu: sign nv (#15087)
2026-03-02 22:58:30 +03:00
chenyu
71f228f80f
test exact kernel count in torch_backend/test_kernel_fusion (#15091)
2026-03-02 14:26:32 -05:00
chenyu
f80b1033c5
simpler Tensor.all (#15089)
same generated kernel
2026-03-02 11:08:55 -05:00
chenyu
4008f7d4e8
move Tensor.one_hot +1 to python (#15088)
2026-03-02 10:56:41 -05:00
nimlgen
dafbe9733a
am: cleanup (#15086)
2026-03-02 17:06:21 +03:00
qazal
f7aeff6061
viz: cli.py cleanups, do not require PYTHONPATH (#15085)
* cleanup the print
* sys.exit
* equal check
* cleanup unpacker
* cli doesn't need PYTHONPATH
* no semicolons
* %s/PYTHONPATH=. //g
2026-03-02 19:24:38 +09:00
George Hotz
5ff278446c
add contiguous_view_offset (#15084)
* add contiguous_view_offset
* no int
2026-03-02 18:05:04 +08:00
Christopher Milan
977c270774
IMAGE=1 kernel count failing tests (#15083)
2026-03-02 04:35:26 -05:00
George Hotz
3539693555
Support triu variable on diagonal + SDPA symbolic (#15081)
* triu variable
* fails
* dumbbb
* no commutative in reshape
* real fix
* revert that
* sdpa symbolic tests
2026-03-02 12:19:48 +08:00
wozeparrot
a4f6365929
llama3: fstep takes grads (#15069)
2026-03-01 20:05:07 -08:00
Nick
8e8e9f6ff6
assert removal for _tri() + tests (#15073)
* assert removal for _tri() and tests
* removed import
* tests triu/tril like in prefill
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-03-02 10:34:28 +08:00
nimlgen
ccbbca05ef
beam: add dev_timeout for am (#15063)
* beam: add dev_timeout for am
* all covered
* fk
* x
* fuzz
* reset
* f
2026-03-01 16:57:29 +03:00
chenyu
8cb4368967
delete unused END NOOP rule [pr] (#15077)
2026-03-01 00:09:05 -05:00
chenyu
efce99adc9
skip isComposing key press in llm.py (#15076)
for the CJK input user
2026-02-28 20:31:53 -05:00
chenyu
103ea16ec0
add contiguous back to svd (#15074)
can cause infinite loop
2026-02-28 16:49:26 -05:00
chenyu
fe0fa8333b
Revert "improve Tensor.sort indices (#15070)" (#15072)
This reverts commit e3003631f2.
2026-02-28 14:40:30 -05:00
chenyu
e3003631f2
improve Tensor.sort indices (#15070)
* improve Tensor.sort indices
instead of N^2 match at the end, have an arange to start and go through the same N(logN)^2 path
* contiguous
2026-02-28 14:16:16 -05:00
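As a rough illustration of the idea in e3003631f2 (later reverted by #15072), the sketch below carries an arange of indices through the sort itself, so no value-matching pass is needed at the end. It assumes a bitonic-style sorting network is a fair stand-in for the N(logN)^2 path the message mentions. The function name `bitonic_sort_with_indices` is hypothetical, and this is plain Python, not tinygrad's code.

```python
def bitonic_sort_with_indices(values):
    """Sort ascending and return (sorted_values, original_indices).

    Hypothetical helper: a plain-Python bitonic network, assuming a
    power-of-two input length. Indices start as an arange and are swapped
    together with the values in every compare-exchange step.
    """
    n = len(values)
    assert n > 0 and (n & (n - 1)) == 0, "sketch assumes power-of-two length"
    vals = list(values)
    idx = list(range(n))  # the "arange to start"
    k = 2
    while k <= n:            # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:        # distance between compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    a = (vals[i], idx[i])
                    b = (vals[partner], idx[partner])
                    # pairs are never equal (indices differ), so this swaps
                    # when the pair is out of order for its direction
                    if (a > b) == ascending:
                        vals[i], vals[partner] = vals[partner], vals[i]
                        idx[i], idx[partner] = idx[partner], idx[i]
            j //= 2
        k *= 2
    return vals, idx
```

Because the pairs compare as tuples, equal values keep their original relative order in this sketch.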
wozeparrot
cfc5cf65ad
llama3: vocab padding fix + jit copies on fakedata (#15067)
2026-02-28 08:44:55 -08:00
chenyu
76170d035a
relax atol for test_xlm_roberta_large (#15066)
2026-02-28 11:22:35 -05:00
qazal
cfb8e6922d
viz: arrow keys move through time (#15064)
* work
* automatic zoom, keeping scale
* the whole shape should be out of view
2026-02-28 23:52:36 +09:00
nimlgen
9b3450c9da
test gpu crash on cdna (#15062)
2026-02-28 13:17:59 +03:00
nimlgen
6bbf813dd3
ci: switch to tinygrad/amdcomgr_dylib (#15061)
2026-02-28 13:09:39 +03:00
nimlgen
77846300b2
am: reset vm fault (#15060)
2026-02-28 12:58:56 +03:00
George Hotz
dc54441e1f
add better printing to tinygrad.apps.llm (#15059)
* add better printing to tinygrad.apps.llm
* add gc.collect
* comment
2026-02-28 16:38:50 +08:00
George Hotz
bb84e389cf
functions for llama trainer (#15045)
* functions for llama trainer
* function there
* axis match
* fix multi
* lil cleaner
* there's a bug with HK_FLASH_ATTENTION
* training functions
* for commit
2026-02-28 12:15:18 +08:00
chenyu
9b4ba3f838
remove ReduceContext.range_to_ends [pr] (#15055)
* remove ReduceContext.range_to_ends [pr]
make merge_reduce_ends pure. this state is causing issue when introducing more reduce merging rewrites
* tag
2026-02-27 22:15:44 -05:00
chenyu
151608aa90
update test_multiple_to_single_device (#15056)
follow up to #14482, add SCACHE=0 to the test
2026-02-27 21:44:33 -05:00
chenyu
5fd06f4f02
differentiable setitem (#15054)
* differentiable setitem
go through the where path for bw
* no return
2026-02-27 17:25:15 -05:00
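The "where path" mentioned in 5fd06f4f02 can be pictured with a small plain-Python sketch. It assumes the usual formulation in which a masked assignment is written as where(mask, value, original), so the backward pass routes the incoming gradient to the value at written positions and to the original tensor everywhere else. Both function names are hypothetical, and this shows only the general idea, not tinygrad's implementation.

```python
def setitem_where(x, mask, v):
    """Forward: out[i] = v[i] where mask[i] is True, else x[i]."""
    return [vi if m else xi for xi, m, vi in zip(x, mask, v)]

def setitem_where_backward(mask, grad_out):
    """Backward of the where formulation (hypothetical helper).

    Overwritten positions pass no gradient to x; untouched positions pass
    no gradient to v.
    """
    grad_x = [0.0 if m else g for m, g in zip(mask, grad_out)]
    grad_v = [g if m else 0.0 for m, g in zip(mask, grad_out)]
    return grad_x, grad_v

x = [1.0, 2.0, 3.0, 4.0]
v = [10.0, 20.0, 30.0, 40.0]
mask = [False, True, True, False]   # positions being assigned
out = setitem_where(x, mask, v)
grad_x, grad_v = setitem_where_backward(mask, [1.0, 1.0, 1.0, 1.0])
```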
chenyu
db6b3e1edc
fix mixed setitem with both basic and tensor indexing (#15050)
2026-02-27 15:35:48 -05:00
chenyu
c9f6d8751b
don't remove_bufferize for Invalid (#15053)
* don't remove_bufferize for Invalid
* replaced
2026-02-27 15:16:09 -05:00
qazal
b8a55d5f68
sqtt: new packet types, add discovery script (#14960)
2026-02-28 04:27:27 +09:00
nimlgen
4e12fc3fe6
am: mi3xx recovery (#15051)
2026-02-27 22:10:47 +03:00
chenyu
81a35cef38
rearrange Tensor.getitem code (#15049)
no-op change to prepare setitem fix
2026-02-27 12:57:16 -05:00
chenyu
1406d49eef
failed test cases for advanced setitem (#15048)
2026-02-27 10:50:18 -05:00
qazal
ef1017f7ed
viz: skip drawing offscreen tracks in profiler (#15047)
2026-02-27 22:19:08 +09:00
qazal
ad99b77f6d
assembly/amd: add gfx12_asm_vflat llvm tests, disasm fixes (#15046)
* add gfx12_asm_vflat.s
* work
2026-02-27 20:20:31 +09:00
George Hotz
010d2790ce
fix multi minimal (#15044)
2026-02-27 14:31:58 +08:00
George Hotz
3e1e12528c
hotfix: disable tinyfs load test
2026-02-27 12:04:41 +08:00
George Hotz
d23b79530e
remove disk from GGUF GEMV test (#15041)
* remove disk from GGUF GEMV test
* keep copy
2026-02-27 12:03:00 +08:00
chenyu
d345f7f5dc
remove _pending_assigns (#15040)
2026-02-26 22:38:10 -05:00
George Hotz
37e31e7da4
gguf gemv test (#15039)
* add gemv tests
* gguf big
* skip
* make realize optional
2026-02-27 10:54:43 +08:00
Nick
af94bfc401
fix retinanet shared memory race condition in parallel tests (#15030)
Append PID to shared memory names in batch_load_retinanet to prevent
FileExistsError when pytest-xdist runs multiple test workers that each
call _setup_shared_mem with the same hardcoded name.
2026-02-27 08:36:24 +08:00
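The PID-suffix approach described in af94bfc401 can be sketched with the standard library alone. The helper names `pid_scoped_name`, `create_worker_shm`, and `demo` are hypothetical and are not taken from batch_load_retinanet. The sketch only assumes that each pytest-xdist worker is a separate process, so each one derives a different segment name from the same base.

```python
import os
from multiprocessing import shared_memory

def pid_scoped_name(base_name: str) -> str:
    """Append the current process ID so parallel workers get distinct names."""
    return f"{base_name}_{os.getpid()}"

def create_worker_shm(base_name: str, size: int) -> shared_memory.SharedMemory:
    """Create a shared memory block under a per-process name (hypothetical helper).

    With one hardcoded name, a second worker calling create=True would raise
    FileExistsError; the PID suffix avoids that collision between processes.
    """
    return shared_memory.SharedMemory(name=pid_scoped_name(base_name), create=True, size=size)

def demo() -> int:
    """Create, write, read back, and clean up one block."""
    shm = create_worker_shm("demo_shm", 64)
    try:
        shm.buf[0] = 7
        return shm.buf[0]
    finally:
        shm.close()
        shm.unlink()  # remove the segment so reruns with a recycled PID do not collide
```

Calling `demo()` in two separate processes at the same time should succeed in both, since each process resolves a different name.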
George Hotz
2bbf8bbefa
improve call/param rendering (#15023)
2026-02-27 08:35:04 +08:00
chenyu
0f94a4bb73
failed test case for early fixup const copy (#15038)
* failed test case for early fixup const copy
wrong with PAD
* test no copy
2026-02-26 19:09:33 -05:00
chenyu
3a4db53b43
raise RuntimeError in schedule for conflicted var_val [pr] (#15031)
2026-02-26 15:16:01 -05:00
qazal
d65db32395
viz: only compute aggregate memory graph, defer n² per buffer graph (#15029)
2026-02-27 04:14:51 +09:00
qazal
c61fe57cfd
viz: fix n² tiny device linking in profiler (#15028)
2026-02-27 02:25:39 +09:00
qazal
88d650d606
viz: clean up call node detection check (#15025)
2026-02-26 19:57:56 +09:00
qazal
1c09890f66
sqtt: map instructions in the command line tool (#15024)
2026-02-26 12:34:24 +02:00