chenyu
14524eeddc
test_image_valid.py -> test_simplify_valid_idx.py ( #6724 )
...
restructure the tests, will use the same file for non-image tests
2024-09-24 23:32:27 -04:00
qazal
e0d8685c99
test_masked_upcast_wino check device buf_max ( #6723 )
2024-09-25 11:26:53 +08:00
George Hotz
f45d178a55
hotfix: support JIT_BATCH_SIZE=0, make that the default
2024-09-25 10:36:04 +08:00
George Hotz
52e7f1c108
add new model CI
2024-09-25 10:23:06 +08:00
ttomsa
76bd4c7d5f
advanced setitem ( #6262 )
...
* advanced setitem draft
* add setitem tests
* fix for tests
* small change
* handle repeated indices with test
* fix v broadcasting to mask
* clean up a bit
* open more tests
* clean up, fixes issue with scalar tensor index
* fix
* fix index_put_ and linter
* add type annotation
* done
* remove non contiguous hack
* woops linter
* name fix
* add back type notation
* more type notation
* final
* linter
* check lazydata not shared
* no numpy
* no numpy
* rename
* index benchmark
* linter
* no cloning time
* rm benchmark
* new function
* rm contiguous and cast early
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-09-24 22:14:59 -04:00
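Note on "advanced setitem" (#6262): the bullets mention repeated indices and broadcasting the assigned value. The sketch below is a toy, pure-Python model of those two behaviours on a flat list. It is not the code from the commit. The helper name `setitem_advanced` is hypothetical, and the "last write wins" rule for repeated indices is an assumption borrowed from NumPy's convention, not something the log states.

```python
# Toy model of advanced-index assignment (not tinygrad code).
# Assumption: when an index repeats, the last assigned value wins.
def setitem_advanced(data, indices, values):
    """Return a copy of `data` with data[indices[k]] = values[k] applied in order."""
    out = list(data)
    # Broadcast a scalar value to every index position.
    if not isinstance(values, (list, tuple)):
        values = [values] * len(indices)
    if len(values) != len(indices):
        raise ValueError("values must be a scalar or match the number of indices")
    for i, v in zip(indices, values):
        out[i] = v  # later writes overwrite earlier ones for repeated indices
    return out


# Index 1 appears twice, so the later value (30) replaces the earlier one (10).
print(setitem_advanced([0, 0, 0, 0], [1, 3, 1], [10, 20, 30]))  # [0, 30, 0, 20]
# A scalar value is broadcast to every selected position.
print(setitem_advanced([1, 2, 3], [0, 2], 7))  # [7, 2, 7]
```

The function returns a new list instead of mutating its input, which keeps the example easy to check by hand.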
qazal
3bf25aae78
start work on global buffer count limit [run_process_replay] ( #6722 )
...
* add a bufs_max option
* simple spec
2024-09-25 09:51:56 +08:00
George Hotz
b0ffe2452b
bump line count to 9800
2024-09-25 09:15:30 +08:00
chenyu
5c240c34aa
split validhack into simplify idx and drop valids ( #6719 )
...
* split validhack into simplify idx and drop valids
will be using the simplify idx for non-image buffer
[run_process_replay]
* shorter
2024-09-24 09:40:27 -04:00
qazal
cefc3e9382
make all schedules immutable [run_process_replay] ( #6718 )
...
* compute inputs and outputs in LBScheduleItem [run_process_replay]
* simpler metadata, delete __hash__
* no dynamic field
* test_diff_schedule
2024-09-24 21:08:16 +08:00
qazal
29330014ab
give FUZZ_SCHEDULE views a base ( #6717 )
...
* memoryview to bytes
* give FUZZ_SCHEDULE views a base
2024-09-24 19:20:37 +08:00
nimlgen
f0019ad29c
bump ci test timeout for test_speed_exec_time ( #6715 )
...
* bump ci test timeout for test_speed_exec_time
* more
2024-09-24 18:44:09 +08:00
qazal
1c03fb69c9
viz dedup assert groupby ctx [run_process_replay] ( #6714 )
2024-09-24 18:17:21 +08:00
chenyu
8d75326cb5
do not fold var with min==max ( #6713 )
...
not really used, want to keep it as a var for valid simplification
[run_process_replay]
2024-09-24 06:16:34 -04:00
chenyu
9e51879019
fix idx setup in image_valid test_openpilot_conv3 ( #6710 )
...
* fix idx setup in image_valid test_openpilot_conv3
* corrected output and sad
2024-09-24 05:49:04 -04:00
qazal
ae3f3fec38
refactor DEFINE_GLOBAL inputs to list [run_process_replay] ( #6711 )
2024-09-24 17:43:24 +08:00
wozeparrot
f932116e05
feat: small things from default_threefry ( #6708 )
2024-09-24 17:00:47 +08:00
chenyu
f2700ac58a
construct a candidate set to attempt valid idx rewrite ( #6706 )
...
preparation for the brute force attempt for some valids
2024-09-24 04:12:21 -04:00
wozeparrot
2be0b26a1f
rand only supports single device ( #6682 )
2024-09-24 16:07:44 +08:00
nimlgen
75b7627db7
qcom do not recreate memoryviews on updates ( #6701 )
2024-09-24 15:36:22 +08:00
chenyu
a6078c099f
simpler idx rewrite structure in simplify_valid_image_load ( #6704 )
...
express valid into things to check when rewriting idx. it's the same for single clause or a simplex
[run_process_replay]
2024-09-24 03:35:39 -04:00
nimlgen
d3ed50c769
fix typo in 'Too many resources requested for launch' ( #6705 )
2024-09-24 15:33:01 +08:00
wozeparrot
ef7a74bfa0
feat: use /raid/downloads on tinybox ( #6702 )
2024-09-24 15:26:31 +08:00
nimlgen
ca66b11e07
qcom fix disasm ( #6703 )
2024-09-24 15:23:43 +08:00
nimlgen
a473bf4ba9
do not always update float dims ( #6699 )
...
* do not always update float dims
* linter
* isinstance
2024-09-24 14:40:45 +08:00
qazal
048483ee0b
viz fold const nodes and UOp/float4 syntax highlight ( #6695 )
...
* fold const nodes
* show rewrite count
* hotfix: cpp
* more syntax highlight
* custom language definitions
* only cpp
* small fixups for UPat
* extend python
* cleanups
* rewrites helper
* better message
2024-09-24 14:36:59 +08:00
chenyu
4bb1694f49
more tests about bounds of UOp divs ( #6700 )
2024-09-24 00:41:43 -04:00
chenyu
79aef64d70
update tests in test_image_valid ( #6698 )
2024-09-24 00:04:21 -04:00
Anurag Lamsal
568757e087
fix model_eval.py in the mlperf folder searching for bert vocab in the wrong directory ( #6649 )
2024-09-24 11:20:44 +08:00
chenyu
4a2fa0b627
clean up apply OptOps.PADTO [run_process_replay] ( #6694 )
2024-09-23 23:13:50 -04:00
chenyu
f703180356
hotfix missed cast in cstyle code_for_workitem ( #6693 )
...
`NOLOCALS=1 python -c "from tinygrad import Tensor; Tensor.randn((5, 5)).realize()"` works on green box with this fix #6687
2024-09-23 22:18:18 -04:00
samm393
19c11792fd
Flux.1 ( #6334 )
...
* initial commit
* whitespace
* get rid of torch import
* indentation
* less hardcoding
* add flux.1-dev
* jit
* no double
* t5 tidy up
* validation image
* reuse sdxl autoencoder
* typing changes
* empty lines
* remove unneeded comments
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-09-24 10:08:04 +08:00
chenyu
31b9c74c77
tiny import cleanup and fix typo ( #6692 )
2024-09-23 21:48:23 -04:00
qazal
02c0c09fb9
VIZ syntax highlighting and new colors ( #6686 )
...
* VIZ syntax highlighting
* more work
2024-09-24 09:41:07 +08:00
ignaciosica
0ffbd75af8
Refactor TC [run_process_replay] ( #6456 )
...
* unify _apply_tc_opt
* refactor tc pt2
* hotfix: remove blank line
* refactor upcast_axes
* simplify check before using tensor_cores
* rename upcast_axes
* fix amx and remove counting hack
* AMX cleanup
* hotfix: bug
* skip hand-coded TC opts if AMX to also skip if emulating
* hotfix: AMX bug
* hotfix: AMX tests
* minor format change
* hotfix: minor var name change
* hotfix: minor refactor
* hotfix: hand-coded tc bug
* hotfix: simple change
* fix comment
* hotfix: refactor attempt to local N
* hotfix: AMD TC spacing
* refactor tensor core options in kernel.py to include opt order
* hotfix: add comments to TensorCore dataclass
* hotfix: improve comment on TC dataclass
* hotfix: refactor opt_seq loop
* hotfix: add comments in hand-coded TC opts
* hotfix: upcast_axes comment
* hotfix: remove unroll from opt_seq
* hotfix: bug + remove unroll from opt_seq
* hotfix: rename opt_seq into opts_seq
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-09-24 09:05:29 +08:00
George Hotz
b9e6d42a1f
Revert "gated native math in OpenCL ( #6683 )" ( #6691 )
...
This reverts commit 2fe3eeed17.
2024-09-24 08:48:10 +08:00
Harald Schäfer
382938ab41
Add command to show default backend in README ( #6688 )
...
* Update README.md
* Update README.md
* Update README.md
2024-09-24 08:42:18 +08:00
George Hotz
46fab1f185
hotfix: curved edges in viz
2024-09-23 19:45:35 +08:00
qazal
ee050d31d7
viz more touchups ( #6685 )
...
* dont print if we're running VIZ
* 242424
2024-09-23 19:44:28 +08:00
George Hotz
2fe3eeed17
gated native math in OpenCL ( #6683 )
...
* gated native math
* Update cstyle.py
2024-09-23 19:22:13 +08:00
George Hotz
84072166db
move mul consts like add consts ( #6684 )
2024-09-23 19:21:53 +08:00
George Hotz
de259e3f09
hotfix: add compile3 to comma CI
2024-09-23 18:25:49 +08:00
George Hotz
7c38121280
load penalty ( #6681 )
...
* bias/bn loads after loops
* load penalty in fix_priority
* more generic test
2024-09-23 18:12:12 +08:00
George Hotz
431ffc4254
hotfix: delete float16 failing
2024-09-23 17:42:57 +08:00
qazal
aad7c9c883
viz adjustable metadata ( #6679 )
...
* move from grid to flexbox
* viz adjustable metadata
* w-size
2024-09-23 17:31:51 +08:00
George Hotz
2f2f933e50
fix buffer shape regression from onnx ( #6678 )
2024-09-23 16:58:42 +08:00
qazal
b438e3cc19
viz bugfix click in middle of UOps ( #6676 )
2024-09-23 16:44:19 +08:00
chenyu
f55459c98e
failed validhack test for a 0.9.7 conv ( #6677 )
2024-09-23 04:43:47 -04:00
nimlgen
94cbb1cd32
qcom image copyout ( #6667 )
...
* qcom copyout
* copyin
* linter
* fix
* linter
* mypy
2024-09-23 16:11:43 +08:00
George Hotz
417a19a292
uop priority inversion ( #6670 )
...
* make checks simpler [run_process_replay]
* reorder uops
* fix inversion [run_process_replay]
* no need to move SPECIALs
* Update uopgraph.py
2024-09-23 15:53:53 +08:00
qazal
49bf92afa2
schedule UOps.ASSIGN ( #6661 )
2024-09-23 15:44:12 +08:00