George Hotz
550cf2ca7f
tests from postopt ( #11964 )
* tests from postopt
* reraise is fine
2025-09-02 13:34:17 -07:00
qazal
b977ec0813
viz: axes domains cleanup ( #11962 )
2025-09-02 19:30:45 +03:00
nimlgen
897254ad6c
ci: add dev<->cpu copy speeds ( #11959 )
2025-09-02 15:22:44 +03:00
George Hotz
74040663bf
make ptrdtype a UOp property ( #11955 )
2025-09-01 16:35:43 -07:00
George Hotz
0dfca4e74b
add failing test for rangeify setitem ( #11954 )
2025-09-01 16:24:35 -07:00
wozeparrot
7c21271a5f
feat: end_lr envvar ( #11953 )
2025-09-01 14:53:07 -07:00
chenyu
6a40216724
correct bf16 fuzz input in test_dtype_alu ( #11933 )
it was using float16 inputs; now it generates uint16 and then converts to bf16
2025-09-01 10:52:26 -04:00
chenyu
965ea59b16
test_dtype_alu use AMD_LLVM from helpers ( #11950 )
2025-09-01 10:03:17 -04:00
b1tg
a9f07c31bc
fix amd llvm sqrt ( #11936 )
* fix amd llvm sqrt
* lint
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-09-01 09:31:14 -04:00
qazal
0a53e72f70
viz: fix trace duration in python test decoder ( #11949 )
2025-09-01 14:32:25 +03:00
qazal
27c9ed5a84
viz: more consistent naming of events ( #11948 )
* s/shapes/events in test_viz
* s/bufs/events in the memory packer
2025-09-01 14:16:47 +03:00
qazal
c7bb561ef9
remu: add v_rsq_f32_e32 instruction ( #11947 )
https://github.com/tinygrad/tinygrad/pull/11936 introduces a change to
the AMD LLVM renderer that outputs this instruction. Adding both 32 and
64 bit variants.
2025-09-01 11:29:31 +03:00
Sieds Lykles
d9560a631c
remove cast between ints if safe ( #11946 )
2025-09-01 05:56:49 +02:00
Sieds Lykles
a19d689481
fix vec dtype _min_max ( #11944 )
2025-09-01 03:24:07 +02:00
Sieds Lykles
f32f3464d6
Can safe cast from certain ints to floats ( #11941 )
* add rule
* add some tests
* prevent infinite loop with bfloat16
* add some ints to double and float can_safe_cast
* add tests
2025-09-01 00:51:24 +02:00
Sieds Lykles
1c6e43c203
Double cast is one cast if intermediate cast is safe ( #11939 )
* add rule
* add some tests
* prevent infinite loop with bfloat16
* prevent more infinite rewrite
2025-09-01 00:36:29 +02:00
wozeparrot
7e68045fb2
feat: small llama3 training ( #11829 )
2025-08-31 13:41:47 -07:00
nimlgen
020abe0556
hcq: finalize without synchronization when in error state ( #11872 )
* hcq: finalize without synchronization when in error state
* ooops
* fix
* fix
* fix
2025-08-31 18:39:13 +03:00
qazal
2004c9757d
tracing: add default clock ( #11935 )
2025-08-31 18:24:44 +03:00
b1tg
c1eeb3b99c
only skip AMD_LLVM ( #11934 )
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-31 18:15:47 +03:00
b1tg
75d380a77c
fix transcendentals in python renderer ( #11932 )
* fix transcendentals in python renderer
* add test
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-31 09:37:17 -04:00
Sieds Lykles
61e4dc6ad5
render special arg in cstyle if arg is UOp ( #11931 )
2025-08-31 07:01:29 +02:00
Sieds Lykles
d3252ccd85
fix special vmax when arg is UOp ( #11930 )
2025-08-31 06:54:39 +02:00
qazal
0bacd9fc9b
viz: give disassembly its own node ( #11927 )
2025-08-31 00:28:52 +03:00
chenyu
af89be317e
relax rtol for bfloat16 test_dtype_alu ( #11926 )
2025-08-30 17:16:08 -04:00
George Hotz
632c2fb119
lowerer works on rangeified + print exception ( #11925 )
2025-08-30 12:05:44 -07:00
qazal
c27b99d68f
viz: refactor to indexed rewrite traces ( #11923 )
2025-08-30 20:01:47 +03:00
qazal
9aff00a6ea
switch viz command line args to pathlib ( #11922 )
2025-08-30 18:13:47 +03:00
qazal
c86ee5bfaf
viz: canonicalize device name colors ( #11921 )
2025-08-30 18:12:30 +03:00
nimlgen
a4f05ebd1a
ci: rebuild gpuocelot with boost libs ( #11920 )
2025-08-30 17:24:19 +03:00
qazal
bf0d055b39
viz: color by name ( #11919 )
2025-08-30 16:04:58 +03:00
Sieds Lykles
0bc34c000f
simplify range mod its own upper bound ( #11917 )
* add rules
* add tests
2025-08-30 08:37:35 +02:00
chenyu
561318fea7
Tensor.cos in test_dtype_alu ( #11916 )
* Tensor.cos in test_dtype_alu
* need this fix anyway
2025-08-29 20:26:36 -04:00
NoahKusaba
0838021753
remove np from beautiful_cifar ( #10988 )
* remove np from beautiful_cifar
* remove np from cifar
* rename variable and rename tensor.arrange to just tensor.randperm
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-08-29 19:34:16 -04:00
nimlgen
cf9d8c8142
ci: pin boost for macos runners ( #11910 )
2025-08-30 01:38:06 +03:00
nimlgen
c6e342cdac
mockgpu: no hang if gpuocelot failed ( #11915 )
2025-08-30 00:44:49 +03:00
chenyu
26d03a86a1
test_symbolic_ops.py cleanup ( #11895 )
2025-08-29 17:11:59 -04:00
b1tg
b2cc06218a
python bfloat16 ( #11912 )
* python bf16
* _to_torch_storage_type
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-29 15:18:02 -04:00
George Hotz
afad7d0cd1
remove dtype from range, it will be dtypes.index soon [pr] ( #11914 )
* remove dtype from range, it will be dtypes.index soon [pr]
* a few more
2025-08-29 09:52:07 -07:00
qazal
30e72d5820
multi device and copy tracing for NULL device ( #11913 )
* add device name to NULL programs
* trace transfers
2025-08-29 15:31:00 +03:00
qazal
d8e1e4dc61
tracing: show NULL programs ( #11911 )
2025-08-29 14:09:33 +03:00
nimlgen
75678b2cbe
amd: retire pm4 xcc sync ( #11835 )
* amd: aql default when several xccs
* amd: retire pm4 xcc sync
* remove more
* more
* more
2025-08-29 09:56:27 +03:00
George Hotz
394c2d1db1
update Kernel API in tests + move optimize_local_size ( #11907 )
2025-08-28 15:12:47 -07:00
nimlgen
fa695ac1ce
ci: mac gpuocelot ( #11906 )
* gm
* fix?
* ops
* imp
* xx
* add file
2025-08-28 23:29:43 +03:00
George Hotz
b9b438c516
small updates from postopt ( #11903 )
* tests from postopt
* modernize
* skip lin tests
* that's fixed?
* skip, not failure
2025-08-28 12:34:52 -07:00
nimlgen
bb55a3001f
nv: flush reset message ( #11897 )
2025-08-28 22:17:20 +03:00
nimlgen
e8289c75b1
ci: do not reinstall existing pkgs in macos ( #11900 )
2025-08-28 21:20:15 +03:00
chenyu
134cf56904
update cache name for gpuocelot ( #11896 )
2025-08-28 13:11:10 -04:00
Ben Waldron
ea1be2e4cd
[bounty] Remove using reshape to register symbolic shape ( #11771 )
* Modify tests and start work towards removing symbolic reshape
* Refactor symbolic reshape
* fix small error
* much cleaner + fix more tests
* Can remove this now
* Update test_symbolic_ops and test_tiny
* Couple more tests
* Unused import
* More tests and add EXPAND to Tensor.empty
* Fix test beam search
* all int
* Fix rangeify by adding shrink
* Remove OOB check and so fix test_symbolic_jit
* test_symbolic_jit doesn't need OOB Context anymore either
* Should remove that test now
* Cleanups part 1
* fix linters
* Final cleanups
* Don't reassign inside for loop
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-08-28 12:30:49 -04:00
qazal
53853ae49b
viz: switch to Path2D ( #11892 )
2025-08-28 18:58:16 +03:00