chenyu
|
5c240c34aa
|
split validhack into simplify idx and drop valids (#6719)
* split validhack into simplify idx and drop valids
will be using the simplify idx for non-image buffer
[run_process_replay]
* shorter
|
2024-09-24 09:40:27 -04:00 |
|
qazal
|
cefc3e9382
|
make all schedules immutable [run_process_replay] (#6718)
* compute inputs and outputs in LBScheduleItem [run_process_replay]
* simpler metadata, delete __hash__
* no dynamic field
* test_diff_schedule
|
2024-09-24 21:08:16 +08:00 |
|
qazal
|
29330014ab
|
give FUZZ_SCHEDULE views a base (#6717)
* memoryview to bytes
* give FUZZ_SCHEDULE views a base
|
2024-09-24 19:20:37 +08:00 |
|
nimlgen
|
f0019ad29c
|
bump ci test timeout for test_speed_exec_time (#6715)
* bump ci test timeout for test_speed_exec_time
* more
|
2024-09-24 18:44:09 +08:00 |
|
qazal
|
1c03fb69c9
|
viz dedup assert groupby ctx [run_process_replay] (#6714)
|
2024-09-24 18:17:21 +08:00 |
|
chenyu
|
8d75326cb5
|
do not fold var with min==max (#6713)
not really used; want to keep it as a var for valid simplification
[run_process_replay]
|
2024-09-24 06:16:34 -04:00 |
|
chenyu
|
9e51879019
|
fix idx setup in image_valid test_openpilot_conv3 (#6710)
* fix idx setup in image_valid test_openpilot_conv3
* corrected output and sad
|
2024-09-24 05:49:04 -04:00 |
|
qazal
|
ae3f3fec38
|
refactor DEFINE_GLOBAL inputs to list [run_process_replay] (#6711)
|
2024-09-24 17:43:24 +08:00 |
|
wozeparrot
|
f932116e05
|
feat: small things from default_threefry (#6708)
|
2024-09-24 17:00:47 +08:00 |
|
chenyu
|
f2700ac58a
|
construct a candidate set to attempt valid idx rewrite (#6706)
preparation for the brute force attempt for some valids
|
2024-09-24 04:12:21 -04:00 |
|
wozeparrot
|
2be0b26a1f
|
rand only supports single device (#6682)
|
2024-09-24 16:07:44 +08:00 |
|
nimlgen
|
75b7627db7
|
qcom do not recreate memoryviews on updates (#6701)
|
2024-09-24 15:36:22 +08:00 |
|
chenyu
|
a6078c099f
|
simpler idx rewrite structure in simplify_valid_image_load (#6704)
express the valid as a set of things to check when rewriting idx. it's the same for a single clause or a simplex
[run_process_replay]
|
2024-09-24 03:35:39 -04:00 |
|
nimlgen
|
d3ed50c769
|
fix typo in 'Too many resources requested for launch' (#6705)
|
2024-09-24 15:33:01 +08:00 |
|
wozeparrot
|
ef7a74bfa0
|
feat: use /raid/downloads on tinybox (#6702)
|
2024-09-24 15:26:31 +08:00 |
|
nimlgen
|
ca66b11e07
|
qcom fix disasm (#6703)
|
2024-09-24 15:23:43 +08:00 |
|
nimlgen
|
a473bf4ba9
|
do not always update float dims (#6699)
* do not always update float dims
* linter
* isinstance
|
2024-09-24 14:40:45 +08:00 |
|
qazal
|
048483ee0b
|
viz fold const nodes and UOp/float4 syntax highlight (#6695)
* fold const nodes
* show rewrite count
* hotfix: cpp
* more syntax highlight
* custom language definitions
* only cpp
* small fixups for UPat
* extend python
* cleanups
* rewrites helper
* better message
|
2024-09-24 14:36:59 +08:00 |
|
chenyu
|
4bb1694f49
|
more tests about bounds of UOp divs (#6700)
|
2024-09-24 00:41:43 -04:00 |
|
chenyu
|
79aef64d70
|
update tests in test_image_valid (#6698)
|
2024-09-24 00:04:21 -04:00 |
|
Anurag Lamsal
|
568757e087
|
fix model_eval.py in the mlperf folder searching for bert vocab in the wrong directory (#6649)
|
2024-09-24 11:20:44 +08:00 |
|
chenyu
|
4a2fa0b627
|
clean up apply OptOps.PADTO [run_process_replay] (#6694)
|
2024-09-23 23:13:50 -04:00 |
|
chenyu
|
f703180356
|
hotfix missed cast in cstyle code_for_workitem (#6693)
`NOLOCALS=1 python -c "from tinygrad import Tensor; Tensor.randn((5, 5)).realize()"` works on green box with this fix #6687
|
2024-09-23 22:18:18 -04:00 |
|
samm393
|
19c11792fd
|
Flux.1 (#6334)
* initial commit
* whitespace
* get rid of torch import
* indentation
* less hardcoding
* add flux.1-dev
* jit
* no double
* t5 tidy up
* validation image
* reuse sdxl autoencoder
* typing changes
* empty lines
* remove unneeded comments
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
|
2024-09-24 10:08:04 +08:00 |
|
chenyu
|
31b9c74c77
|
tiny import cleanup and fix typo (#6692)
|
2024-09-23 21:48:23 -04:00 |
|
qazal
|
02c0c09fb9
|
VIZ syntax highlighting and new colors (#6686)
* VIZ syntax highlighting
* more work
|
2024-09-24 09:41:07 +08:00 |
|
ignaciosica
|
0ffbd75af8
|
Refactor TC [run_process_replay] (#6456)
* unify _apply_tc_opt
* refactor tc pt2
* hotfix: remove blank line
* refactor upcast_axes
* simplify check before using tensor_cores
* rename upcast_axes
* fix amx and remove counting hack
* AMX cleanup
* hotfix: bug
* skip hand-coded TC opts if AMX to also skip if emulating
* hotfix: AMX bug
* hotfix: AMX tests
* minor format change
* hotfix: minor var name change
* hotfix: minor refactor
* hotfix: hand-coded tc bug
* hotfix: simple change
* fix comment
* hotfix: refactor attempt to local N
* hotfix: AMD TC spacing
* refactor tensor core options in kernel.py to include opt order
* hotfix: add comments to TensorCore dataclass
* hotfix: improve comment on TC dataclass
* hotfix: refactor opt_seq loop
* hotfix: add comments in hand-coded TC opts
* hotfix: upcast_axes comment
* hotfix: remove unroll from opt_seq
* hotfix: bug + remove unroll from opt_seq
* hotfix: rename opt_seq into opts_seq
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
|
2024-09-24 09:05:29 +08:00 |
|
George Hotz
|
b9e6d42a1f
|
Revert "gated native math in OpenCL (#6683)" (#6691)
This reverts commit 2fe3eeed17.
|
2024-09-24 08:48:10 +08:00 |
|
Harald Schäfer
|
382938ab41
|
Add command to show default backend in README (#6688)
* Update README.md
* Update README.md
* Update README.md
|
2024-09-24 08:42:18 +08:00 |
|
George Hotz
|
46fab1f185
|
hotfix: curved edges in viz
|
2024-09-23 19:45:35 +08:00 |
|
qazal
|
ee050d31d7
|
viz more touchups (#6685)
* dont print if we're running VIZ
* 242424
|
2024-09-23 19:44:28 +08:00 |
|
George Hotz
|
2fe3eeed17
|
gated native math in OpenCL (#6683)
* gated native math
* Update cstyle.py
|
2024-09-23 19:22:13 +08:00 |
|
George Hotz
|
84072166db
|
move mul consts like add consts (#6684)
|
2024-09-23 19:21:53 +08:00 |
|
George Hotz
|
de259e3f09
|
hotfix: add compile3 to comma CI
|
2024-09-23 18:25:49 +08:00 |
|
George Hotz
|
7c38121280
|
load penalty (#6681)
* bias/bn loads after loops
* load penalty in fix_priority
* more generic test
|
2024-09-23 18:12:12 +08:00 |
|
George Hotz
|
431ffc4254
|
hotfix: delete float16 failing
|
2024-09-23 17:42:57 +08:00 |
|
qazal
|
aad7c9c883
|
viz adjustable metadata (#6679)
* move from grid to flexbox
* viz adjustable metadata
* w-size
|
2024-09-23 17:31:51 +08:00 |
|
George Hotz
|
2f2f933e50
|
fix buffer shape regression from onnx (#6678)
|
2024-09-23 16:58:42 +08:00 |
|
qazal
|
b438e3cc19
|
viz bugfix click in middle of UOps (#6676)
|
2024-09-23 16:44:19 +08:00 |
|
chenyu
|
f55459c98e
|
failed validhack test for a 0.9.7 conv (#6677)
|
2024-09-23 04:43:47 -04:00 |
|
nimlgen
|
94cbb1cd32
|
qcom image copyout (#6667)
* qcom copyout
* copyin
* linter
* fix
* linter
* mypy
|
2024-09-23 16:11:43 +08:00 |
|
George Hotz
|
417a19a292
|
uop priority inversion (#6670)
* make checks simpler [run_process_replay]
* reorder uops
* fix inversion [run_process_replay]
* no need to move SPECIALs
* Update uopgraph.py
|
2024-09-23 15:53:53 +08:00 |
|
qazal
|
49bf92afa2
|
schedule UOps.ASSIGN (#6661)
|
2024-09-23 15:44:12 +08:00 |
|
George Hotz
|
9f1f445a5f
|
reorder uops (#6672)
|
2024-09-23 15:21:59 +08:00 |
|
qazal
|
e2d6e10ddf
|
hotfix: reset benchmarks cache for process replay (#6671)
|
2024-09-23 15:13:02 +08:00 |
|
chenyu
|
0362dbbbe8
|
relax idx simplification given valid (#6669)
applies to kernels in openpilot 0.9.7.
if a valid has a complicated expr, we cannot drop valid but it's possible to simplify idx given valid
|
2024-09-23 03:04:57 -04:00 |
|
qazal
|
7ca9ffa494
|
misc UOp st cleanups (#6668)
|
2024-09-23 14:16:42 +08:00 |
|
chenyu
|
26ebb7cab4
|
don't use div_folding in lt_folding (#6666)
* don't use div_folding in lt_folding
valids 35 -> 13
* fails the same as before
|
2024-09-23 01:50:18 -04:00 |
|
qazal
|
e9248b9e27
|
viz highlight new nodes (#6665)
* p2
* ret adds and dels
* maybe that way
* add additions
* simpler test_viz
|
2024-09-23 13:46:18 +08:00 |
|
chenyu
|
da5b741656
|
removed valid in openpilot conv (#6619)
35 valids left
|
2024-09-23 00:30:18 -04:00 |
|