chenyu
f890d1cbbd
remove PUSH_PERMUTES from external_test_opt ( #7232 )
...
remove old comments and update kernel count for test_convnext
2024-10-23 00:11:34 -04:00
qazal
dae908299e
full_ast_rewrite api with ScheduleItemContext ( #7223 )
2024-10-22 23:17:05 +03:00
chenyu
ea016b55d1
don't throw in fuzz_linearizer ( #7148 )
...
already broken on master and needs fix. don't throw to not block other pr
2024-10-18 09:28:30 -04:00
nimlgen
45db7d9045
fuzz qcom vs opencl ( #7130 )
...
* fuzz qcom vs opencl
* fix nv
* bettre?
* typo
* open both devs
2024-10-17 18:49:08 +03:00
George Hotz
ded1b38b84
minor dtype cleanup [pr] ( #7124 )
...
* minor dtype cleanup [pr]
* use ptr() function
2024-10-17 17:41:23 +08:00
nimlgen
39ab67e9ef
beam capture and replay in fuzz ( #7099 )
...
* beam capture and reply in fuzz
* clean a bit
2024-10-16 20:26:58 +03:00
qazal
40f33c110b
big graph var_vals as rewrite context ( #7007 )
...
* var_vals as rewrite context
* no default arg
* add st var_vals
* delete some stuff
* add the rewrite rule again
* extra
* this whole part is preschedule
* test with a second context
* redo
* i always forget tensor variable
2024-10-16 07:31:44 +03:00
qazal
390171d686
delete SAVE_SCHEDULE=1 [pr] ( #7087 )
2024-10-16 07:13:20 +03:00
George Hotz
3169cb386d
remove graph [pr] ( #7085 )
2024-10-16 11:40:07 +08:00
nimlgen
b025495e5c
fuzz nv vs cuda ( #7066 )
...
* fuzz nv vs cuda
* fixes
* smth
* um
* cmp the same
* dnrt
* correct gpfifo scan
* fix
2024-10-15 22:22:40 +03:00
qazal
09de958855
move print_diff to test/helpers ( #7071 )
2024-10-15 22:00:39 +03:00
chenyu
fbaab30fe3
add timing to fuzz_linearizer ( #7056 )
...
and applied smaller FUZZ_MAX_SIZE. this is getting quite slow in CI
2024-10-14 11:57:41 -04:00
qazal
0ef186d4be
scheduler internal api cleanups [pr] ( #7052 )
...
* delete external_benchmark_ast.py [pr]
* cleanup 2
* random
2024-10-14 15:56:10 +03:00
chenyu
bd8ecf7fd6
remove NumNode ( #7035 )
2024-10-13 16:42:19 -04:00
chenyu
c4c806a210
generate new kernel dataset ( #7034 )
...
* generate new kernel dataset
pre req to remove NumNode
```
extra/optimization/generate_dataset.sh
gzip -k /tmp/sops
mv /tmp/sops.gz extra/datasets/
```
* fix var range in fuzz_linearizer
2024-10-13 16:19:41 -04:00
qazal
13846930cd
hotfix: extract_dataset.py ( #7029 )
2024-10-13 11:18:23 +03:00
George Hotz
38d45dfba5
hotfix: no rng in test/external/external_benchmark_schedule.py
2024-10-12 22:03:04 +08:00
George Hotz
a71bb09ec3
remove symbolic file [pr] ( #7012 )
2024-10-12 18:44:44 +08:00
George Hotz
5ae2de9845
UOp.variable ( #7010 )
...
* UOp.variable [pr]
* fix tests
* clean
* improve name rendering
* last bug
2024-10-12 18:20:44 +08:00
George Hotz
e7a0ffe46a
break out linearization [pr] ( #6994 )
2024-10-11 15:27:33 +08:00
George Hotz
e441794c4b
remove custom op support, we waste time maintaining this ( #6991 )
...
* remove custom op support, we waste time maintaining this
* customop is over
2024-10-11 14:31:09 +08:00
George Hotz
c08521e823
minor cleanups from toonygrad ( #6990 )
2024-10-11 14:19:10 +08:00
qazal
20d3c2d113
unify UOps.SHAPETRACKER and UOps.SWIZZLE with UOps.VIEW ( #6955 )
...
* add UOps.VIEW
* update hardcoded asts
* update sops.gz
2024-10-09 02:00:17 +08:00
qazal
2800520dd5
even smaller process_replay.py [pr] ( #6941 )
...
* even smaller process_replay.py [pr]
* delete those tests
* dedup asts
2024-10-08 20:43:22 +08:00
qazal
b82023c97e
process replay cleanup to generic _pmap [pr] ( #6929 )
...
* process replay cleanup to generic _pmap [pr]
* delete `COMPARE_SCHEDULE`
2024-10-07 13:57:05 +08:00
qazal
16312b4c59
rip out old scheduler process replay stuff, diff pure UOps [pr] ( #6927 )
2024-10-07 13:20:35 +08:00
George Hotz
4df5c7a4ef
move lazy to engine [pr] ( #6886 )
...
* move lazy to engine [pr]
* engine.lazy
2024-10-04 23:19:26 +08:00
George Hotz
8ca506ee37
remove the magic methods for moving between devices [pr] ( #6881 )
...
* remove the magic methods for moving between devices [pr]
* remove unneeded clang
2024-10-04 20:27:52 +08:00
George Hotz
f4ec39fe58
switch symbolic from old to uops, final PR ( #6872 )
...
* switch symbolic from old to uops, final PR
* two wrong answers
* not needed resolves
* symbolic ops passes
* symbolic ops passes
* progress
* tests pass (almost)
* fix last test
* fix some tests
* global binding and unbinding
* Revert "global binding and unbinding"
This reverts commit 9456725630 .
* that test works now
* vars on uop doesn't recurse
* fix fuzzer
* update
* fix type
* fix gpt, it's UOp now
* ssimplify symbolics
2024-10-04 16:42:27 +08:00
George Hotz
738a5794a9
last update for new symbolic [pr] ( #6877 )
2024-10-04 14:58:51 +08:00
qazal
17068410e6
give EXT schedules metadata [pr] ( #6865 )
2024-10-03 20:14:18 +08:00
qazal
c5b252cdb3
add pr alias [pr] ( #6834 )
2024-10-01 18:48:44 +08:00
qazal
a16a8c5958
color process replay stats [run_process_replay] ( #6830 )
2024-10-01 15:29:11 +08:00
George Hotz
d726eb6f48
uop resolve [run_process_replay] ( #6826 )
...
* uop bool and int and stuff [run_process_replay]
* add ne support
* can't even be None anymore
* BinaryOps.AND support
* less compare
2024-10-01 13:11:42 +08:00
George Hotz
9dd9f71011
no global kernel stuff [run_process_replay] ( #6808 )
...
* use traceback instead of global metadata crap [run_process_replay]
* save the kernel
* correct, imports clean, no device
* UNPARENTED
* speed
* proudly unparented
* Update ops.py
* update tests for unparented
---------
Co-authored-by: qazal <qazal.software@gmail.com >
2024-09-30 13:52:33 +08:00
qazal
12e4a4900a
hotfix: missing return in METAL dm benchmark ( #6749 )
2024-09-26 09:12:38 +08:00
qazal
b629a7998d
early assert buffer count limit [run_process_replay] ( #6746 )
...
* better error message for buffer count limit [run_process_replay]
* 3.9 needs that
* assert ScheduleItem
* new _test_buf_cnt
2024-09-26 08:24:26 +08:00
wozeparrot
c100f3d406
default threefry ( #6116 )
2024-09-25 17:45:13 +08:00
qazal
5ad2f95d01
process replay diff stats ( #6736 )
...
* process replay diff stats
* fix tuples
2024-09-25 15:19:56 +08:00
qazal
cefc3e9382
make all schedules immutable [run_process_replay] ( #6718 )
...
* compute inputs and outputs in LBScheduleItem [run_process_replay]
* simpler metadata, delete __hash__
* no dynamic field
* test_diff_schedule
2024-09-24 21:08:16 +08:00
qazal
29330014ab
give FUZZ_SCHEDULE views a base ( #6717 )
...
* memoryview to bytes
* give FUZZ_SCHEDULE views a base
2024-09-24 19:20:37 +08:00
George Hotz
431ffc4254
hotfix: delete float16 failing
2024-09-23 17:42:57 +08:00
qazal
d24e4b1042
viz more kernel view work ( #6659 )
2024-09-23 10:48:35 +08:00
qazal
6be1bf09f1
hotfix: bring COMPARE_SCHEDULE=0 back ( #6657 )
2024-09-23 10:39:43 +08:00
qazal
6b65d8c461
more process replay tracing work [run_process_replay] ( #6650 )
2024-09-22 16:16:58 +08:00
qazal
5bafed2f88
process replay traceback ( #6642 )
2024-09-21 16:53:34 +08:00
chenyu
acef3e67fa
add an example that idx is const and valid cannot be removed ( #6625 )
...
very weird
2024-09-20 05:46:27 -04:00
George Hotz
d4b662c318
new openpilot compile ( #6573 )
...
* new openpilot compile
* note, copyout doesn't work for images
2024-09-18 14:22:50 +08:00
chenyu
c3a70dbf0d
20 jitted steps in openpilot benchmark ( #6577 )
2024-09-18 02:15:16 -04:00
qazal
d8e5d5c663
move VIZ=1 tests to fuzzers ( #6574 )
2024-09-18 12:12:03 +08:00