qazal
390171d686
delete SAVE_SCHEDULE=1 [pr] ( #7087 )
2024-10-16 07:13:20 +03:00
George Hotz
3169cb386d
remove graph [pr] ( #7085 )
2024-10-16 11:40:07 +08:00
qazal
53586eac56
late assert post permuted assign [pr] ( #7084 )
...
* late assert post permuted assign [pr]
* a lil earlier
2024-10-16 06:26:04 +03:00
George Hotz
023b77cc6e
move MultiGraphRunner logic to GraphRunner [pr] ( #7083 )
...
* move MultiGraphRunner logic to GraphRunner [pr]
* _access_resources
2024-10-16 11:04:30 +08:00
qazal
207fbc4bc7
cleanup view on reduce [pr] ( #7081 )
2024-10-16 05:22:52 +03:00
qazal
067b35e915
add UOp.r [pr] ( #7080 )
2024-10-16 05:06:02 +03:00
George Hotz
26df50cf43
move memory_planner to memory.py [pr] ( #7079 )
2024-10-16 10:04:35 +08:00
qazal
bddba5897a
generic elementwise view rewrite rule + merge_views ( #7078 )
...
* generic elementwise view rewrite rule + merge_views [pr]
* no pr, views merge
2024-10-16 04:36:21 +03:00
qazal
fb29de6cc3
split schedule to view_left and view_right [pr] ( #7077 )
...
* split schedule to view_left and view_right [pr]
* move valid
2024-10-16 03:39:38 +03:00
chenyu
8601115976
_get_chain -> split_uop [pr] ( #7075 )
2024-10-15 17:31:25 -04:00
chenyu
e136cea027
cleanups around idx_given_valid [pr] ( #7074 )
2024-10-15 16:59:01 -04:00
qazal
545e79969f
always record matches in viz ( #7073 )
...
* always record matches in viz
* simpler
2024-10-15 23:03:12 +03:00
nimlgen
b025495e5c
fuzz nv vs cuda ( #7066 )
...
* fuzz nv vs cuda
* fixes
* smth
* um
* cmp the same
* dnrt
* correct gpfifo scan
* fix
2024-10-15 22:22:40 +03:00
qazal
8ff6514ba3
delete extra/ops.py [pr] ( #7072 )
2024-10-15 22:14:21 +03:00
qazal
09de958855
move print_diff to test/helpers ( #7071 )
2024-10-15 22:00:39 +03:00
qazal
1a45e94f5d
viz late to_json [pr] ( #7070 )
2024-10-15 21:36:45 +03:00
qazal
52d8afde2b
new viz unittests, isolate the ctx bug ( #7069 )
...
* start new test_viz
* test_rewrite_twice
* test_rewrite_with_ctx
* add back some of the old tests
* lints
2024-10-15 18:53:56 +03:00
nimlgen
9f00eacde5
nv tagged memory + resnet failed kernel ( #7061 )
...
* nv tagged memory
* linter
* metal fix?
2024-10-15 18:19:58 +03:00
hikettei
0f0c3934b1
refactor: improved the consistency of the frexp in transcendental ( #7060 )
...
* clarify the intention of bias
* Improved the consistency of m2
* int16
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-10-15 10:18:38 -04:00
chenyu
d12c87dc8e
use ubuntu-22.04 in CI ( #7068 )
...
ubuntu-latest points to 24.04 now, maybe it's this?
2024-10-15 09:44:59 -04:00
nimlgen
586ff4c910
nv record uvm mappings ( #7059 )
...
* nv record uvm mappings
* linteeer
* smth
* ooops
2024-10-15 00:12:49 +03:00
chenyu
2008bac6bf
use validhack logic to rewrite buffer idx ( #6740 )
...
* use validhack logic to rewrite buffer idx
saved a whopping one mod in the conv backward kernel...
* cleanup more
2024-10-14 16:47:31 -04:00
qazal
968a79b56c
lint viz with eslint ( #6988 )
...
* lint viz
* green
* move config
* space
* meh, laterg
2024-10-14 22:40:56 +03:00
chenyu
a99e42cf2f
clean up test_uop_symbolic.py ( #7058 )
...
enable more tests and remove dead tests
2024-10-14 15:35:58 -04:00
nimlgen
8094340221
nv print info about faults ( #7057 )
...
* nv print info about faults
* unrelated changes
* nv_gpu.GT200_DEBUGGER in mockgpu
* regen with correct version
* spacing
2024-10-14 21:49:38 +03:00
chenyu
fbaab30fe3
add timing to fuzz_linearizer ( #7056 )
...
and applied smaller FUZZ_MAX_SIZE. this is getting quite slow in CI
2024-10-14 11:57:41 -04:00
chenyu
0d2462cbdf
use more resolve in View merge add [pr] ( #7055 )
2024-10-14 11:31:13 -04:00
qazal
8428244c30
gates are always bool [pr] ( #7054 )
2024-10-14 17:55:08 +03:00
qazal
7a28d50320
small st_fixup changes [pr] ( #7053 )
2024-10-14 16:53:10 +03:00
qazal
0ef186d4be
scheduler internal api cleanups [pr] ( #7052 )
...
* delete external_benchmark_ast.py [pr]
* cleanup 2
* random
2024-10-14 15:56:10 +03:00
qazal
bc95b7e422
actually use UOps.CONTIGUOUS ( #7049 )
2024-10-14 15:11:23 +03:00
George Hotz
f85c9ba00a
rewrite max to use cmplt + where ( #7037 )
2024-10-14 20:00:51 +08:00
qazal
88ce6ec69a
ASSIGN is always (target, val) ( #7048 )
2024-10-14 14:47:52 +03:00
qazal
0f71bc10cd
small changes from the lazy_pm branch [pr] ( #7047 )
2024-10-14 12:21:21 +03:00
qazal
3e795f2e52
verify_ast changes from lazy_pm [pr] ( #7045 )
2024-10-14 12:08:18 +03:00
George Hotz
b20b22a738
hotfix: add test_tiny, because many times it's what you want
2024-10-14 16:32:33 +08:00
George Hotz
c4db927c7b
touchup lowerer [pr] ( #7043 )
2024-10-14 16:13:28 +08:00
Louis Novy
2ac5aec66b
Fix exponential complexity in _is_padding_okay [pr] ( #7008 )
...
* preliminary test
* missed Optional
* don't check for cache during recursion
* match style from st_fixup... may be marginally faster?
* pathological test case: strongly connected DAG
* move to test_schedule as this isn't really a fusion
* oops this shouldn't be edited
* Revert "oops this shouldn't be edited"
This reverts commit 487cb027dc.
* Revert "move to test_schedule as this isn't really a fusion"
This reverts commit 48d8c550ce.
* move to test_schedule as this isn't really a fusion
* ok no more merge error funny business
2024-10-14 02:34:47 +03:00
chenyu
bd8ecf7fd6
remove NumNode ( #7035 )
2024-10-13 16:42:19 -04:00
chenyu
c4c806a210
generate new kernel dataset ( #7034 )
...
* generate new kernel dataset
pre req to remove NumNode
```
extra/optimization/generate_dataset.sh
gzip -k /tmp/sops
mv /tmp/sops.gz extra/datasets/
```
* fix var range in fuzz_linearizer
2024-10-13 16:19:41 -04:00
chenyu
1a27417262
remove arbitrary multiplication case ( #7033 )
...
adds the wrongly simplified kernel in test_linearizer_failures
#7019
2024-10-13 15:06:05 -04:00
chenyu
13575f080a
remove bitcast backward in function.py ( #7031 )
...
bitcast cannot backward
2024-10-13 10:08:27 -04:00
Harsh Natuskar
ace834ef7b
docs update ( #7027 )
2024-10-13 19:39:06 +08:00
qazal
13846930cd
hotfix: extract_dataset.py ( #7029 )
2024-10-13 11:18:23 +03:00
nimlgen
942a17109a
qcom use QCOMBuffer for all allocated buffers ( #7023 )
...
* qcom use QCOMBuffer for all allocated buffers
* checks
2024-10-12 23:44:36 +03:00
chenyu
04d9b46d51
derivative of softmax is independent of max ( #7009 )
...
* derivative of softmax is independent of max
* update test
2024-10-12 15:59:23 -04:00
chenyu
cae1c41755
test case of softmax backward kernel count ( #7022 )
2024-10-12 15:46:32 -04:00
George Hotz
5ce224ceb3
handle arbitrary multiplication case ( #7019 )
...
* handle arbitrary multiplication case
* remove count restriction
2024-10-12 23:16:27 +08:00
chenyu
23faeacb23
remove outdated comments ( #7018 )
2024-10-12 10:51:07 -04:00
George Hotz
85a45164fb
remove pyint [pr] ( #7016 )
...
* remove pyint
* bump time on tp [pr]
* dont truncate in const fold
* remove dead code
* Revert "dont truncate in const fold"
This reverts commit 29c81db0f7.
* remove define_var
2024-10-12 22:36:24 +08:00