Commit Graph

10510 Commits

Author SHA1 Message Date
George Hotz
baa2f16bff revert 2025-10-09 18:31:06 +08:00
George Hotz
ecf1477c0e merge those pattern matchers 2025-10-09 18:23:13 +08:00
George Hotz
e4186add83 group div rules 2025-10-09 18:14:52 +08:00
qazal
e0694fdb8e remove UPat.__repr__ [pr] (#12565) 2025-10-09 12:35:34 +03:00
chenyu
678f83e41b delete ShapeTracker to_valid_uop and substitute [pr] (#12563) 2025-10-09 05:06:10 -04:00
nimlgen
a11b686c71 amd: sqtt for all gfx11 (#12546)
* amd: general sqtt for gfx11

* target

* ops

* no gfx12 here
2025-10-09 17:04:06 +08:00
chenyu
a0cbbc35ad remove LLAMA_LAYERS in ci (#12562) 2025-10-09 04:46:41 -04:00
chenyu
fe94453d52 delete CONTIGUOUS with RANGE in st [pr] (#12561) 2025-10-09 04:32:31 -04:00
chenyu
f793cdeb87 clean up shape changing logic to not use st [pr] (#12560) 2025-10-09 04:13:02 -04:00
chenyu
1bcea19846 remove ShapeTracker.reduce [pr] (#12559) 2025-10-09 03:54:11 -04:00
chenyu
c1cc277fc3 don't call src[0].shape multiple times in MULTI st [pr] (#12558) 2025-10-09 03:40:17 -04:00
qazal
2551a60d97 viz: split out shape links (#12557) 2025-10-09 10:34:55 +03:00
George Hotz
e7aa26ed29 make remove bufferize fast (#12555)
* add more uop gc test

* make remove bufferize fast

* substitute is fast too

* fix tests
2025-10-09 15:20:02 +08:00
chenyu
cf8232ec6a clean up more RANGEIFY flag (#12556) 2025-10-09 03:06:48 -04:00
nimlgen
658c566e22 vars in gated_read_image_count (#12486)
* vars in gated_read_image_count

* nc
2025-10-09 14:54:15 +08:00
George Hotz
a8a9ac0e95 add more uop gc test (#12553) 2025-10-09 14:49:32 +08:00
chenyu
250f05a776 run some hashing test only on METAL (#12554)
quite slow on CPU
2025-10-09 02:39:49 -04:00
qazal
da9425c1a7 viz: sum all buffers in zoomed out memory graph (#11898)
* viz: switch to transformation matrix

* simpler axes domains

* less domain

* split loops

* flatten

* tiny rects

* solid proxy but still too big

* cache FileNotFound

* gridlines instead of padding

* not this

* like METAL -> METAL memory -> graph

* less colors

* better

* more grid work

* glitch

* clamp

* add range index

* pixel grids

* set min width

* y cords

* pruning

* test: clip in world units

* keep linear scan

* switch to interval tree

* fps counter

* work

* visible is the easiest

* shapes api

* math

* test bitgrid

* checkout

* work

* simpler

* work

* draw

* it's just a polygon

* merge polygons

* cleanup old stuff

* switch to hashmap there too

* add tooltips

* fix that

* better color

* better
2025-10-09 09:30:37 +03:00
chenyu
ae51bdd06a remove trivial use of RANGEIFY flag (#12550)
some tests need update still
2025-10-09 02:29:38 -04:00
George Hotz
80d99d52a5 reduce_unparented only checks ranges (#12548) 2025-10-09 14:14:03 +08:00
nimlgen
375ee2c576 faster backward_slice (#12515)
* not cached backward_slice

* mypy

* just speed

* faster
2025-10-09 14:12:20 +08:00
George Hotz
1dc500426e remove restrictions on range ending in indexing (#12543)
* remove restrictions on range ending in indexing

* early simplify

* Revert "early simplify"

This reverts commit 657d9972c2.

* disable const folding tests
2025-10-09 13:53:08 +08:00
chenyu
585bd95b50 fix ruff 0.14.0 [pr] (#12547) 2025-10-09 01:52:30 -04:00
qazal
6af29b913b viz: format rewrite time as a comment (#12545)
* viz: format rewrite time as a comment

* put above
2025-10-09 07:14:27 +03:00
qazal
baab7e334d put match times in viz (#12544)
* put match times in viz

* float
2025-10-09 06:56:10 +03:00
George Hotz
51420d1f99 rangeify profiling (#12540)
* clean up stable diffusion weight loading

* add profiling to run_rangeify

* fix tests
2025-10-09 11:32:34 +08:00
chenyu
43bce1f39f delete View minify [pr] (#12538) 2025-10-08 23:25:53 -04:00
qazal
9f9a8b0b5b viz: fix tiny device linking (#12541) 2025-10-09 06:25:33 +03:00
George Hotz
6e6059dde0 clean up stable diffusion weight loading (#12452) 2025-10-09 11:13:11 +08:00
chenyu
20d98b19c3 delete more unused ShapeTracker stuff (#12536) 2025-10-08 23:09:44 -04:00
qazal
bb5671a837 some more ops.py cleanups (#12525)
* remove GroupOp.Meta and st_arg

* inline axis_arg

* only allow .buffer on reshapes (or the buffer)

* gate is the other way

* still want can_pad?

* use op_in_backward_slice_with_self

* .buffer is recursive

* lint

* pathlib there
2025-10-09 06:06:44 +03:00
chenyu
be05028419 move ASSERT_MIN_STEP_TIME to compile3 (#12535)
threshold is current time +20%
2025-10-08 22:16:59 -04:00
George Hotz
615ec6acf0 refactor to apply_movement_op (#12533)
* refactor to apply_movement_op

* new pm_mops is fine

* make mypy happy

* cleanup apply_movement_op function
2025-10-09 10:16:09 +08:00
chenyu
c4732a18bd update tests that depend on SPLIT_REDUCEOP (#12534) 2025-10-08 21:53:30 -04:00
chenyu
5986d656a2 tighter ASSERT_MIN_STEP_TIME (#12531)
set to about 1.2x of actual time now
2025-10-08 21:22:54 -04:00
George Hotz
fc2bd53700 chatgpt nits (#12529)
* tsink_base wasn't needed

* nits from chatgpt
2025-10-09 07:34:44 +08:00
nimlgen
89ec2b3a74 memory: move bump allocator (#12505) 2025-10-08 23:12:04 +08:00
George Hotz
84fc34b274 tsink_base wasn't needed (#12528) 2025-10-08 22:46:06 +08:00
chenyu
28edea5d67 delete FUSE_CONV_BW (#12527) 2025-10-08 10:41:38 -04:00
George Hotz
2653147cb7 delete the lowerer (#12526) 2025-10-08 21:58:18 +08:00
George Hotz
0774575442 delete the old rangeify path and all the children stuff (#12524)
* delete the old rangeify path and all the children stuff

* remove the on_stack stuff and any retries

* don't use the p word

* Revert "remove the on_stack stuff and any retries"

This reverts commit 49a2b328b9.
2025-10-08 21:24:04 +08:00
Rudeus
a65ec5c693 fix fromarray deprecation (#12512) 2025-10-08 09:13:26 -04:00
qazal
b6835f4134 remove Ops.VIEW and related UOp methods (#12522)
* remove Ops.VIEW and related UOp methods

* update abstractions2.py

* no ShapeTrackers in abstractions2.py

* it's a size 1
2025-10-08 14:47:02 +03:00
George Hotz
3b0b3a2e64 fast RANGEIFY (#12504)
* rtoposort is fast, can replace rangeify with this

* fast rangeify

* work

* fast rangeify works for mnist

* should work

* progress

* pad fix

* FAST

* tests passing

* don't delete those shape ops

* put in rangeify map

* ending ranges fix

* tests

* mstack/mselect no hacks

* move to indexing.py

* touch up tests + add comments

* disable failing test

* actually make the file readable

* failing

* error
2025-10-08 19:38:06 +08:00
qazal
9448924d9e update gpt2 kernel count tests in CI=0 (#12523) 2025-10-08 14:29:11 +03:00
qazal
c5a1f9f5f9 no ShapeTrackers in multi.py (#12521)
* switch multi to all movement ops

* inline dvars
2025-10-08 14:04:05 +03:00
chenyu
ee0382ad99 remove ShapeTracker.invert (#12520) 2025-10-08 18:37:34 +08:00
chenyu
d5058427ea remove ShapeTracker.real_size (#12519) 2025-10-08 06:15:29 -04:00
qazal
6f26603f06 delete swizzler.py (#12518)
* delete swizzler

* remove merge_views tests

* don't need rewrites_for_views

* apply_rewrites
2025-10-08 13:02:34 +03:00
qazal
7e0b14243e delete grouper and kernelize (#12517)
* delete grouper and kernelize

* +sys.setrecursionlimit
2025-10-08 12:27:26 +03:00