George Hotz
c331798201
move tests to test/backend ( #14691 )
...
* move tests to test/backend
* fix imports
* fix CI
* revert that one
* Fix formatting in README for test command
2026-02-12 11:09:44 +08:00
wozeparrot
4b5d3bda1f
llama3: data seed ( #14681 )
2026-02-11 19:04:40 -08:00
chenyu
0c63f63ee4
recursive resolve assign dependency ( #14688 )
...
remove the .realize in llm.py
2026-02-11 17:41:05 -05:00
nimlgen
869083e373
nv: pciiface pma ( #14686 )
...
* x
* w
* z
* clean
* o
* r
* x
* c
* r
* list
* deanon
* b
2026-02-11 23:29:07 +03:00
chenyu
cbbc2fdea5
update test_assign_slice_then_read ( #14687 )
...
passes locally now
2026-02-11 15:02:44 -05:00
chenyu
7465b22ba0
handle setitem target in rangeify ( #14685 )
2026-02-11 11:38:59 -05:00
chenyu
0d215b962e
few setitem test cases diff from numpy ( #14684 )
...
had claude fuzz the frontend and found some real bugs
2026-02-11 08:41:03 -05:00
nimlgen
df8b21eeb5
add real self assign test ( #14683 )
...
* self assign fix
* no
2026-02-11 12:41:53 +03:00
wozeparrot
a60220bed9
llama3: move dl to numpy & jit more ( #14677 )
...
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-02-10 18:16:40 -08:00
George Hotz
4565958792
some lil speedups ( #14679 )
2026-02-11 10:01:58 +08:00
George Hotz
2d4ad9e739
add a waitlist for graph rewrite ( #14678 )
...
* add a waitlist for graph rewrite
* cleaner
* one context on spec check
2026-02-11 09:30:13 +08:00
Christopher Milan
389e2eeda1
Revert "transcendental works with long decomp" ( #14676 )
2026-02-10 19:46:34 -05:00
Christopher Milan
0662c8037d
transcendental works with long decomp ( #14672 )
2026-02-10 19:30:24 -05:00
George Hotz
3fab43c57c
add cache to asm gemm ( #14675 )
2026-02-11 08:26:30 +08:00
chenyu
ebef63dba0
update test_self_assign_same_device_copy ( #14673 )
...
that test would have passed without the optimization because of the .to shortcut
2026-02-10 17:23:43 -05:00
nimlgen
aafa9dcb5b
eliminate same-device copy self-assigns ( #14671 )
...
* eliminate same-device copy self-assigns
* ugh
2026-02-10 22:54:51 +03:00
chenyu
494eec2694
test_setitem_const_fused ( #14668 )
...
did not realize #14640 also fixed #10690, so added a test for it
2026-02-10 08:33:02 -05:00
nimlgen
42ded7c34d
amd: bind aql ( #14666 )
...
* amd: bind to aql
* bind
* x
* f
2026-02-10 16:28:11 +03:00
George Hotz
82974929b7
use PARAM in schedule ( #14665 )
...
* use PARAM in schedule
* create_new_buffer
2026-02-10 19:18:40 +08:00
George Hotz
8dc46dde07
everything has dtype.long now ( #14661 )
...
* everything has dtype.long now
* int64/uint64 are everywhere now
* that doesn't work
2026-02-10 15:08:50 +08:00
Christopher Milan
cdb78954cb
better cl compiler name ( #14660 )
...
cl_compiler instead of compiler because overriding Compiled.compiler seems more confusing
2026-02-10 01:03:46 -05:00
George Hotz
cc9bf8ccbc
move more to null/unit tests ( #14658 )
...
* move more to null tests
* move test_gc
* no test fusion op
2026-02-10 13:35:17 +08:00
chenyu
83f6d28579
two less realize in setitem ( #14655 )
2026-02-09 23:45:24 -05:00
wozeparrot
69574542ab
fix: use correct fa implementation in eval ( #14651 )
2026-02-09 18:20:44 -08:00
chenyu
0dedf4063c
minor test_setitem cleanup ( #14654 )
2026-02-09 20:40:29 -05:00
Christopher Milan
b36b62eb59
don't push docker cache for PRs ( #14652 )
2026-02-09 19:55:55 -05:00
Christopher Milan
e6562a5061
remove CompilerPair ( #14638 )
2026-02-09 19:51:18 -05:00
Christopher Milan
396e1320fb
bump cache version for z3 ( #14650 )
2026-02-09 19:32:07 -05:00
chenyu
9e3f24db9f
assign realize fix ( #14649 )
...
fix the need for explicit assign. track pending assigns for each buffer, and run those before the main realize in order
2026-02-09 17:46:46 -05:00
chenyu
0913c068ea
clean up setitem disk path ( #14648 )
2026-02-09 15:58:04 -05:00
chenyu
205a1212b7
delegate non Tensor src setitem to assign ( #14647 )
...
cannot do this for DISK in the unified path
2026-02-09 13:53:20 -05:00
chenyu
e9f40f49d4
explicitly check advanced setitem ( #14644 )
...
advanced setitem DISK would fail in rangeify with a bad error, now it's checked directly in setitem. eventually DISK can use the regular setitem path
2026-02-09 13:36:46 -05:00
chenyu
20a132b1c4
relax atol for test_uop_scan_matmul ( #14646 )
...
flaky, also log max diff
2026-02-09 13:25:19 -05:00
qazal
50d3f6cea5
EVAL_BS=0 in llama profile ( #14643 )
2026-02-10 00:49:43 +09:00
chenyu
8a2c23d3dc
raise RuntimeError for setitem dtype mismatch ( #14642 )
2026-02-09 10:37:08 -05:00
qazal
80b0119cef
llama: add new asm gemm shape ( #14611 )
...
* llama: add new asm gemm shape
* work
* cleanup
* half dtype
* more comment
2026-02-10 00:34:29 +09:00
chenyu
a49e038c0c
dont manually broadcast in setitem ( #14641 )
...
handled by assign
2026-02-09 09:34:09 -05:00
chenyu
2c3e3559eb
remove a contiguous in basic setitem ( #14640 )
...
handled in rangeify
2026-02-09 09:19:46 -05:00
chenyu
6c0c8e2ac3
setitem push a realize to basic setitem ( #14637 )
...
advanced setitem does not need it
2026-02-09 08:54:07 -05:00
nimlgen
e087c58ae0
print tables in llama/profile.sh ( #14639 )
2026-02-09 12:32:54 +03:00
Christopher Milan
27f7ea478b
new style DSP renderer ( #14636 )
...
* new style DSP renderer
* cleanup
2026-02-09 00:39:03 -05:00
Christopher Milan
efac5b9ef6
new style NV/CUDA renderers, try 2 ( #14634 )
...
* new style NV/CUDA renderers, try 2
* fix diskcache
2026-02-08 22:58:48 -05:00
Christopher Milan
0ebb508b85
new style metal compiler ( #14632 )
2026-02-08 21:58:25 -05:00
Christopher Milan
9eef9f38ad
new style python renderer ( #14631 )
2026-02-08 21:45:07 -05:00
Christopher Milan
5f2f2cc956
Revert "new style NV/CUDA renderers ( #14627 )" ( #14633 )
...
This reverts commit 0e505951b0.
2026-02-08 21:16:03 -05:00
Christopher Milan
4ad787ece2
new style CPULLVMRenderer ( #14629 )
2026-02-08 21:05:01 -05:00
Christopher Milan
0e505951b0
new style NV/CUDA renderers ( #14627 )
...
* new style NV/CUDA renderers
* fix pickle
* oops
* fix CUDA_CC=NVCC
* mockgpu uses PTXCompiler
* oops
* ruff
* dont discard stderr
* ugh
2026-02-08 21:04:51 -05:00
Filip Brzek
1667669c46
fix: python3 -m tinygrad.device reporting on AMD/CPU ( #14622 )
...
* test: device module expects PASS in -m tinygrad.device for CPU
* fix: use device._compiler_name instead of unwrap_class_type(compiler).__name__ in enumerate_devices_str
2026-02-08 20:22:35 +03:00
nimlgen
01a4ee4d66
do not hive_reset when amdgpu ( #14624 )
2026-02-08 19:14:13 +03:00
nimlgen
a615b9d781
am: f8_mode for gfx94x only ( #14620 )
2026-02-08 17:38:48 +03:00