George Hotz
baa2f16bff
revert
2025-10-09 18:31:06 +08:00
George Hotz
ecf1477c0e
merge those pattern matchers
2025-10-09 18:23:13 +08:00
George Hotz
e4186add83
group div rules
2025-10-09 18:14:52 +08:00
qazal
e0694fdb8e
remove UPat.__repr__ [pr] ( #12565 )
2025-10-09 12:35:34 +03:00
chenyu
678f83e41b
delete ShapeTracker to_valid_uop and substitute [pr] ( #12563 )
2025-10-09 05:06:10 -04:00
nimlgen
a11b686c71
amd: sqtt for all gfx11 ( #12546 )
* amd: general sqtt for gfx11
* target
* ops
* no gfx12 here
2025-10-09 17:04:06 +08:00
chenyu
a0cbbc35ad
remove LLAMA_LAYERS in ci ( #12562 )
2025-10-09 04:46:41 -04:00
chenyu
fe94453d52
delete CONTIGUOUS with RANGE in st [pr] ( #12561 )
2025-10-09 04:32:31 -04:00
chenyu
f793cdeb87
clean up shape changing logic to not use st [pr] ( #12560 )
2025-10-09 04:13:02 -04:00
chenyu
1bcea19846
remove ShapeTracker.reduce [pr] ( #12559 )
2025-10-09 03:54:11 -04:00
chenyu
c1cc277fc3
don't call src[0].shape multiple times in MULTI st [pr] ( #12558 )
2025-10-09 03:40:17 -04:00
qazal
2551a60d97
viz: split out shape links ( #12557 )
2025-10-09 10:34:55 +03:00
George Hotz
e7aa26ed29
make remove bufferize fast ( #12555 )
* add more uop gc test
* make remove bufferize fast
* substitute is fast too
* fix tests
2025-10-09 15:20:02 +08:00
chenyu
cf8232ec6a
clean up more RANGEIFY flag ( #12556 )
2025-10-09 03:06:48 -04:00
nimlgen
658c566e22
vars in gated_read_image_count ( #12486 )
* vars in gated_read_image_count
* nc
2025-10-09 14:54:15 +08:00
George Hotz
a8a9ac0e95
add more uop gc test ( #12553 )
2025-10-09 14:49:32 +08:00
chenyu
250f05a776
run some hashing test only on METAL ( #12554 )
quite slow on CPU
2025-10-09 02:39:49 -04:00
qazal
da9425c1a7
viz: sum all buffers in zoomed out memory graph ( #11898 )
* viz: switch to transformation matrix
* simpler axes domains
* less domain
* split loops
* flatten
* tiny rects
* solid proxy but still too big
* cache FileNotFound
* gridlines instead of padding
* not this
* like METAL -> METAL memory -> graph
* less colors
* better
* more grid work
* glitch
* clamp
* add range index
* pixel grids
* set min width
* y coords
* pruning
* test: clip in world units
* keep linear scan
* switch to interval tree
* fps counter
* work
* visible is the easiest
* shapes api
* math
* test bitgrid
* checkout
* work
* simpler
* work
* draw
* it's just a polygon
* merge polygons
* cleanup old stuff
* switch to hashmap there too
* add tooltips
* fix that
* better color
* better
2025-10-09 09:30:37 +03:00
chenyu
ae51bdd06a
remove trivial use of RANGEIFY flag ( #12550 )
some tests need update still
2025-10-09 02:29:38 -04:00
George Hotz
80d99d52a5
reduce_unparented only checks ranges ( #12548 )
2025-10-09 14:14:03 +08:00
nimlgen
375ee2c576
faster backward_slice ( #12515 )
* not cached backward_slice
* mypy
* just speed
* faster
2025-10-09 14:12:20 +08:00
George Hotz
1dc500426e
remove restrictions on range ending in indexing ( #12543 )
* remove restrictions on range ending in indexing
* early simplify
* Revert "early simplify"
This reverts commit 657d9972c2.
* disable const folding tests
2025-10-09 13:53:08 +08:00
chenyu
585bd95b50
fix ruff 0.14.0 [pr] ( #12547 )
2025-10-09 01:52:30 -04:00
qazal
6af29b913b
viz: format rewrite time as a comment ( #12545 )
* viz: format rewrite time as a comment
* put above
2025-10-09 07:14:27 +03:00
qazal
baab7e334d
put match times in viz ( #12544 )
* put match times in viz
* float
2025-10-09 06:56:10 +03:00
George Hotz
51420d1f99
rangeify profiling ( #12540 )
* clean up stable diffusion weight loading
* add profiling to run_rangeify
* fix tests
2025-10-09 11:32:34 +08:00
chenyu
43bce1f39f
delete View minify [pr] ( #12538 )
2025-10-08 23:25:53 -04:00
qazal
9f9a8b0b5b
viz: fix tiny device linking ( #12541 )
2025-10-09 06:25:33 +03:00
George Hotz
6e6059dde0
clean up stable diffusion weight loading ( #12452 )
2025-10-09 11:13:11 +08:00
chenyu
20d98b19c3
delete more unused ShapeTracker stuff ( #12536 )
2025-10-08 23:09:44 -04:00
qazal
bb5671a837
some more ops.py cleanups ( #12525 )
* remove GroupOp.Meta and st_arg
* inline axis_arg
* only allow .buffer on reshapes (or the buffer)
* gate is the other way
* still want can_pad?
* use op_in_backward_slice_with_self
* .buffer is recursive
* lint
* pathlib there
2025-10-09 06:06:44 +03:00
chenyu
be05028419
move ASSERT_MIN_STEP_TIME to compile3 ( #12535 )
threshold is current time +20%
2025-10-08 22:16:59 -04:00
George Hotz
615ec6acf0
refactor to apply_movement_op ( #12533 )
* refactor to apply_movement_op
* new pm_mops is fine
* make mypy happy
* cleanup apply_movement_op function
2025-10-09 10:16:09 +08:00
chenyu
c4732a18bd
update tests that depend on SPLIT_REDUCEOP ( #12534 )
2025-10-08 21:53:30 -04:00
chenyu
5986d656a2
tighter ASSERT_MIN_STEP_TIME ( #12531 )
set to about 1.2x of actual time now
2025-10-08 21:22:54 -04:00
George Hotz
fc2bd53700
chatgpt nits ( #12529 )
* tsink_base wasn't needed
* nits from chatgpt
2025-10-09 07:34:44 +08:00
nimlgen
89ec2b3a74
memory: move bump allocator ( #12505 )
2025-10-08 23:12:04 +08:00
George Hotz
84fc34b274
tsink_base wasn't needed ( #12528 )
2025-10-08 22:46:06 +08:00
chenyu
28edea5d67
delete FUSE_CONV_BW ( #12527 )
2025-10-08 10:41:38 -04:00
George Hotz
2653147cb7
delete the lowerer ( #12526 )
2025-10-08 21:58:18 +08:00
George Hotz
0774575442
delete the old rangeify path and all the children stuff ( #12524 )
* delete the old rangeify path and all the children stuff
* remove the on_stack stuff and any retries
* don't use the p word
* Revert "remove the on_stack stuff and any retries"
This reverts commit 49a2b328b9.
2025-10-08 21:24:04 +08:00
Rudeus
a65ec5c693
fix fromarray deprecation ( #12512 )
2025-10-08 09:13:26 -04:00
qazal
b6835f4134
remove Ops.VIEW and related UOp methods ( #12522 )
* remove Ops.VIEW and related UOp methods
* update abstractions2.py
* no ShapeTrackers in abstractions2.py
* it's a size 1
2025-10-08 14:47:02 +03:00
George Hotz
3b0b3a2e64
fast RANGEIFY ( #12504 )
* rtoposort is fast, can replace rangeify with this
* fast rangeify
* work
* fast rangeify works for mnist
* should work
* progress
* pad fix
* FAST
* tests passing
* don't delete those shape ops
* put in rangeify map
* ending ranges fix
* tests
* mstack/mselect no hacks
* move to indexing.py
* touch up tests + add comments
* disable failing test
* actually make the file readable
* failing
* error
2025-10-08 19:38:06 +08:00
qazal
9448924d9e
update gpt2 kernel count tests in CI=0 ( #12523 )
2025-10-08 14:29:11 +03:00
qazal
c5a1f9f5f9
no ShapeTrackers in multi.py ( #12521 )
* switch multi to all movement ops
* inline dvars
2025-10-08 14:04:05 +03:00
chenyu
ee0382ad99
remove ShapeTracker.invert ( #12520 )
2025-10-08 18:37:34 +08:00
chenyu
d5058427ea
remove ShapeTracker.real_size ( #12519 )
2025-10-08 06:15:29 -04:00
qazal
6f26603f06
delete swizzler.py ( #12518 )
* delete swizzler
* remove merge_views tests
* don't need rewrites_for_views
* apply_rewrites
2025-10-08 13:02:34 +03:00
qazal
7e0b14243e
delete grouper and kernelize ( #12517 )
* delete grouper and kernelize
* +sys.setrecursionlimit
2025-10-08 12:27:26 +03:00