Ignacio Sica
b240f12593
[TIP-9] rename Opt's amt to arg 2 (#8770)
...
* rename Opt amt to arg
* ignore_beam_cache for test_tiny
* move ignore_beam_cache to test_tiny
* move to separate pr
* revert space change
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-01-27 14:19:04 -05:00
George Hotz
3ed146a5ff
Revert "rename Opt amt to arg (#8767)" (#8769)
...
This reverts commit bf041659a5.
2025-01-27 23:46:37 +09:00
Ignacio Sica
bf041659a5
rename Opt amt to arg (#8767)
2025-01-27 23:36:47 +09:00
George Hotz
b4bf6a7dea
switch backward to use gradient [pr] (#8235)
...
* switch backward to use gradient [pr]
* set device correctly, dedup
* why does that fail?
* add noop cast
* simple backward
* fix beautiful_mnist
* touchups
* set in compute_gradient
* uop_count
* uop_count was wrong
* collections
* no note
* skip that test
* update sched kernel counts
* train mnist is 65
* fix metadata and gc
* fixes
* materialize_grads
* no pathlib stuff
* add contiguous_backward, fix bugs
* add some realize
* fix multi
2025-01-26 09:12:16 +09:00
George Hotz
46a8c5e1e5
delete forced_realize (#8615)
...
* delete forced_realize
* put that back
* expectedFailures
* cleaner create_subbuffer
* more comments
---------
Co-authored-by: qazal <qazal.software@gmail.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2025-01-20 09:40:36 -08:00
qazal
d957a4f108
add tests for div buffer collapsing in the scheduler [pr] (#8671)
...
* add tests for mul/div buffer collapsing in the scheduler [pr]
* lint
* merge with test_linearizer's version of this
* 4*3
2025-01-18 14:15:29 -05:00
ignaciosica
d2234e308a
tf32 tc for nv and ptx (#8635)
...
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-01-17 17:43:57 -08:00
qazal
ae2229d727
assert kernel buffer limit at compile time [pr] (#8595)
...
* remove the BUF_LIMIT assert
* skip the base one
2025-01-13 16:32:07 -05:00
qazal
586e730d32
use UOp.st for kernel reduce axes (#8499)
...
* use UOp.st for kernel reduce axes [pr]
* do not return dict
2025-01-13 06:24:11 -05:00
qazal
866dfa1f23
create_schedule([x.lazydata]) -> x.schedule() in tests (#8449)
2024-12-31 03:15:52 +08:00
George Hotz
29c14f1cbf
hotfix: update tests for no uop mut
2024-12-30 10:05:37 -05:00
ignaciosica
ba0c844a83
special tol when f16 and bf16 are tc input dtypes (#8183)
2024-12-21 11:32:26 -05:00
George Hotz
bd9c015b09
tests from grad uop path [pr] (#8313)
2024-12-18 09:25:05 -08:00
Ahmed Harmouche
a73e3677d0
Test linearizer on webgpu (#8159)
...
* Test linearizer on wgpu
* Skip tests due to exceeded dims
2024-12-11 17:03:26 +01:00
qazal
6be388be86
failing test for const folding breaking indexing [pr] (#8103)
2024-12-07 19:55:02 +08:00
George Hotz
0c7477b108
no bool in range [pr] (#7988)
...
* no bool in range [pr]
* fix llvm
* add arg to range spec
* fix broken test
* forgot this one
* hotfix: test_tiny jit is a real test
2024-12-02 19:05:16 +08:00
George Hotz
f17af70d17
replace all sparents with toposort (#7983)
2024-12-02 15:00:30 +08:00
George Hotz
c5c3b05b5a
block lin: only the test changes (#7933)
2024-11-28 13:19:00 +08:00
George Hotz
32dbab945c
Revert "add block uops and modify tests (#7931)" (#7932)
...
This reverts commit 6f4519ff45.
2024-11-28 13:15:41 +08:00
George Hotz
6f4519ff45
add block uops and modify tests (#7931)
2024-11-28 13:11:18 +08:00
chenyu
a58e289d77
Revert "prereqs for new block lin so PR works (#7919)" (#7921)
...
This reverts commit c53261b541.
2024-11-27 08:41:09 -05:00
George Hotz
c53261b541
prereqs for new block lin so PR works (#7919)
2024-11-27 15:07:54 +08:00
ignaciosica
fc3154a7b3
metal bf16 tc support [pr] (#7408)
...
* add bf16 tc for metal
* hotfix: spacing
* fix tolerance and skip metal bf16 in ci
* hotfix: check for dtype_out
* hotfix: add check for tc.dtype_out is bf16 back
* hotfix: add parens
2024-11-20 14:39:08 -05:00
George Hotz
bc977fec53
dname -> device [pr] (#7804)
...
* dname -> device [pr]
* a few more
* only one left
2024-11-20 17:57:14 +08:00
geohotstan
8100109c9d
Add replicate mode to Tensor.pad (#7608)
...
* base implementation
* add tests
* actually remove the assertionerror test
* actually only have reflect for this pr
* change the 4 if-else one liner
* maybe use a lambda
* fix
* maybe a lil cleaner
* fix tests
* complete
* small change
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-18 10:55:38 -05:00
ignaciosica
597a239e28
Remove UnaryOps, BinaryOps, TernaryOps, MetaOps [pr] (#7725)
...
* remove unaryops
* remove ternaryops
* remove metaops
* hotfix
* remove binaryops
* hotfix: test_pattern_matcher
---------
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2024-11-16 20:56:56 +08:00
George Hotz
0a411b4f68
replace llvm with new llvm (#7616)
...
* replace llvm with new llvm
* fix test_linearizer
* minor fixups
* fix alloca
* don't use alloca
* fix DEFINE_ACC
* lines
* comments and lines
* a little tighter
2024-11-10 11:28:52 +08:00
Ahmed Harmouche
e35226e698
Remove Ops.ALU (#7595)
2024-11-08 19:52:14 +08:00
Carl Basho
630a7f37cf
update tests (#7554)
...
Co-authored-by: John Doe <null@mail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-05 11:35:15 -05:00
George Hotz
99bd4372a5
Ops.ALU is no more, the arg is just an op (#7525)
...
* op arg alu [pr]
* more
* more passing
* fix more tests
* more tests passing
* fix single failing test
* so much cleaner
* noop to not have process replay trigger
* fix ptx
2024-11-05 00:22:22 +08:00
George Hotz
c8bf09b7d4
s/UOps/Ops (#7500)
...
* s/UOps/Ops [pr]
* fix
2024-11-03 11:26:10 +08:00
George Hotz
4cb236a495
index in cstyle (#7328)
...
* index only in cstyle
* fix prefix dtypes
* fix tests
* global indexing
* Revert "global indexing"
This reverts commit 4d507e8abb.
* fix image
* fix image
* ptx tests
* fix CUDA dtype rendering
2024-10-29 13:06:26 +08:00
George Hotz
4812801aa6
try for canonical order (#7286)
...
* try for canonical order
* cmp better
* disable bad tests
* flip const order
* fix test
* fix tests
* different fix for NOOP
* metaclass here
* fix tests
* narrower scope
2024-10-25 16:04:54 +08:00
qazal
d2b608233a
get outbufs by globals idxs [pr] (#7233)
2024-10-23 16:06:35 +03:00
George Hotz
b0a13896d7
PtrDType is dataclass [pr] (#7125)
...
* PtrDType is dataclass [pr]
* new dataset
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-10-18 09:40:33 -04:00
George Hotz
ded1b38b84
minor dtype cleanup [pr] (#7124)
...
* minor dtype cleanup [pr]
* use ptr() function
2024-10-17 17:41:23 +08:00
George Hotz
a71bb09ec3
remove symbolic file [pr] (#7012)
2024-10-12 18:44:44 +08:00
qazal
20d3c2d113
unify UOps.SHAPETRACKER and UOps.SWIZZLE with UOps.VIEW (#6955)
...
* add UOps.VIEW
* update hardcoded asts
* update sops.gz
2024-10-09 02:00:17 +08:00
qazal
391497a311
schedule independent of Device [run_process_replay] (#6829)
2024-10-01 14:46:26 +08:00
George Hotz
50dd6bd951
move cmp tuple out [run_process_replay] (#6825)
...
* move cmp tuple out [run_process_replay]
* was unneeded
2024-10-01 10:38:28 +08:00
qazal
e7fcbe1a4d
refactor test_linearizer correctness asserts (#6812)
2024-09-30 15:31:02 +08:00
qazal
e0d8685c99
test_masked_upcast_wino check device buf_max (#6723)
2024-09-25 11:26:53 +08:00
George Hotz
7c38121280
load penalty (#6681)
...
* bias/bn loads after loops
* load penalty in fix_priority
* more generic test
2024-09-23 18:12:12 +08:00
qazal
982086f54c
UOps.VALID try 2 (#6623)
...
* make UOps.VALID compile
* fixable tests
* bufs dedup
* cleanup the CONST spec
* regenerate dataset with graph_rewrite
```py
def rewrite_const(const:UOp, st_src:UOp) -> UOp:
  st: ShapeTracker = st_src.arg
  return UOp(UOps.VALID, dtypes.bool, (st.to_uop(),)).where(UOp.const(const.dtype, const.arg), UOp.const(const.dtype, 0))
pm = PatternMatcher([(UPat(UOps.CONST, name="const", src=(UPat(UOps.SHAPETRACKER, name="st_src"),)), rewrite_const)])
```
* rm arg
* remove arg
* revert arg removal
This reverts commit 2c35c75c95.
* red test_pickle_define_var
2024-09-21 14:19:25 +08:00
George Hotz
42ba887daa
remove logic to vectorize reduces (#6536)
...
* remove logic to vectorize reduces
* fix tests
2024-09-16 14:04:48 +08:00
ignaciosica
c447ec2190
Fix amx shape [run_process_replay] (#6524)
...
* fix amx shape (sz,sz,sz) -> (sz,sz,1)
* revert check
2024-09-16 09:49:55 +08:00
George Hotz
bdd0c06f29
add void type to uop (#6471)
...
* unwrap_dtype maybe
* uopgraph stuff that hardcoded None
* test_ops passes
* dtypes.py fixups
* update test_linearizer and friends
* more ast updates
* test_beam and test_schedule too
* add void type to uop [run_process_replay]
* remove dumb casts
* start making it green
* more cast cleanups
* more cls methods to fix
* regenerate dataset
* split UOp and NOp const
* maybe that too
* fix docs
* update test_uop_symbolic
* test_verify_ast
* new sops with no diff
* meh, type_ignore is alright
* remove that assert
---------
Co-authored-by: qazal <qazal.software@gmail.com>
2024-09-11 18:16:28 +08:00
chenyu
e0d35e3657
update test_padto_sum_not_ok (#6450)
...
updated the setup as `exp() < -1` could be folded to False
2024-09-09 22:46:42 -04:00
qazal
935b4ddff6
use ast_const in test_linearizer asts [run_process_replay] (#6407)
2024-09-09 08:46:58 +08:00
George Hotz
86d34daac9
UOps.PHI -> UOps.ASSIGN [run_process_replay] (#6383)
2024-09-06 12:38:35 +08:00