Commit Graph

871 Commits

Author SHA1 Message Date
George Hotz
b9eb5b5d49 clean up the LLM tokenizer (#12653)
* clean up the LLM tokenizer

* simple tokenizer is actually simple

* ugh write good code
2025-10-14 14:22:01 +08:00
wozeparrot
47e0c43976 feat: Tensor.{load, store} (#12629) 2025-10-13 08:04:41 -07:00
Sieds Lykles
e0139fafc1 UOp symbolic tests use eval to check against string (#12643) 2025-10-13 14:19:42 +02:00
qazal
b5afa3848e viz: fix memory graph total nbytes (#12622)
* viz: fix memory graph total nbytes

* post increment

* simple regression test

* loop with markers + slightly off text baseline

* cpu events clear
2025-10-12 14:32:46 +03:00
Sieds Lykles
772a8dfe31 reshape uses valid when simplifying (#12597)
* reshape uses valid when simplifying

* try with IGNORE_OOB=0

* is it this test?

* skipif gpuocelot
2025-10-11 17:02:54 +02:00
Sieds Lykles
a2ae56674a uop_given_valid try multiple clauses (#12615)
* uop_given_valid uses less simplify

* enable test

* try all expressions together

* enable test
2025-10-11 11:53:42 +02:00
chenyu
03ef5197fc move get_contraction to helpers [pr] (#12594) 2025-10-10 04:28:57 -04:00
chenyu
af90dc00de remove some View add logic [pr] (#12584)
no longer simplify the case of v0+v1 where v0 has a mask
2025-10-10 03:47:56 -04:00
chenyu
c8dfd10257 ShapeTracker.real_strides -> is_expanded [pr] (#12579)
only keep the used part
2025-10-09 22:52:45 -04:00
chenyu
678f83e41b delete ShapeTracker to_valid_uop and substitute [pr] (#12563) 2025-10-09 05:06:10 -04:00
chenyu
cf8232ec6a clean up more RANGEIFY flag (#12556) 2025-10-09 03:06:48 -04:00
chenyu
250f05a776 run some hashing test only on METAL (#12554)
quite slow on CPU
2025-10-09 02:39:49 -04:00
chenyu
ae51bdd06a remove trivial use of RANGEIFY flag (#12550)
some tests need update still
2025-10-09 02:29:38 -04:00
chenyu
43bce1f39f delete View minify [pr] (#12538) 2025-10-08 23:25:53 -04:00
chenyu
20d98b19c3 delete more unused ShapeTracker stuff (#12536) 2025-10-08 23:09:44 -04:00
George Hotz
0774575442 delete the old rangeify path and all the children stuff (#12524)
* delete the old rangeify path and all the children stuff

* remove the on_stack stuff and any retries

* don't use the p word

* Revert "remove the on_stack stuff and any retries"

This reverts commit 49a2b328b9.
2025-10-08 21:24:04 +08:00
qazal
b6835f4134 remove Ops.VIEW and related UOp methods (#12522)
* remove Ops.VIEW and related UOp methods

* update abstractions2.py

* no ShapeTrackers in abstractions2.py

* it's a size 1
2025-10-08 14:47:02 +03:00
George Hotz
3b0b3a2e64 fast RANGEIFY (#12504)
* rtoposort is fast, can replace rangeify with this

* fast rangeify

* work

* fast rangeify works for mnist

* should work

* progress

* pad fix

* FAST

* tests passing

* don't delete those shape ops

* put in rangeify map

* ending ranges fix

* tests

* mstack/mselect no hacks

* move to indexing.py

* touch up tests + add comments

* disable failing test

* actually make the file readable

* failing

* error
2025-10-08 19:38:06 +08:00
chenyu
ee0382ad99 remove ShapeTracker.invert (#12520) 2025-10-08 18:37:34 +08:00
chenyu
d5058427ea remove ShapeTracker.real_size (#12519) 2025-10-08 06:15:29 -04:00
qazal
2e19354c1c viz: reorder timeline graphs (#12498)
* viz: reorder timeline graphs

* update test_viz with the new order
2025-10-08 07:10:23 +03:00
Sieds Lykles
b465c17b56 Revert "UOp.factor and add chain sorting (#12413)" (#12492)
This reverts commit e74be4a140.
2025-10-08 03:20:23 +02:00
George Hotz
945cc46475 delete children tracking from uop (#12491)
* delete children tracking from uop

* uop children no longer exists

* no tracked children

* that test is flaky too
2025-10-08 09:04:14 +08:00
chenyu
a5484b767e remove skipping cast in simplify_valid [pr] (#12472)
* remove skipping cast in simplify_valid [pr]

unsupported statements are handled in uop_given_valid already. the test failed because (100%x) somehow got simplified

* better test
2025-10-07 00:10:04 -04:00
qazal
f664bcc8bd use recursive_property in UOp tracing (#12469)
* test

* simple passing
2025-10-06 21:10:52 +03:00
Sieds Lykles
e74be4a140 UOp.factor and add chain sorting (#12413)
* add ordering

* fix some tests

* fix more tests

* shorten comment

* update test

* add rule and test

* add rule and test

* remove check

* use fold_divmod_congruence instead of simplify

* adjust tests

* shorten line

* new algo

* add test

* add function to un-nest the div

* add UOp.factor

* test UOp.factor

* uop_given_valid tries to factor simplex expression

* shorten line

* symbolic_flat is back

* change that back

* fix those new tests

* new rule for ordering

* factor multiple factors

* no symbolic_flat

* symbolic_flat to there

* move that back

* fix imports

* merge correctly

* linter happy

* add rule

* add a test

* cleanup

* revert that for now

* UOp.factor returns self instead of None

* try all_candidates

* remove or_else

* post index symbolic

* add test

* maket this closer to the original

* increase mac hlb_cifar min step time

* add some ordering tests

* cleanup

* increase pytest timeout time

* check dtype
2025-10-04 06:05:38 +02:00
George Hotz
c7849ac593 fix test lil model (#12437)
* fix test lil model

* 4 not 3
2025-10-03 02:28:37 -04:00
chenyu
bf99de7b1e update a few more tests for RANGEIFY (#12434) 2025-10-03 00:16:58 -04:00
Sieds Lykles
16a65b4fd0 fix test_symbolic_gcd_div hang (#12427) 2025-10-03 04:21:16 +02:00
chenyu
7b3912d8e4 relax atol for some tests (#12422) 2025-10-02 05:04:44 -04:00
George Hotz
583553f467 split ranges (#12411)
* split ranges

* simpler

* split ranges

* range str

* fix test

* oops

* faster

* no group 2

* tests

* dont_sub_ranges_for_image

* revert that
2025-10-02 12:57:22 +08:00
nimlgen
3e0e0290ce increase timeout in test_module_runs (#12408) 2025-10-01 22:01:44 +03:00
b1tg
57ad46c6e4 rangeify: increase atol for test_two_binops_no_rerun passing on real windows machine (#12389)
CPU_LLVM=1
2025-10-01 00:56:45 -04:00
chenyu
0662946fac atol in test_two_binops_no_rerun (#12387)
for RANGEIFY LLVM
2025-10-01 00:05:47 -04:00
wozeparrot
4204edc60b feat: skip test_long (#12383) 2025-09-30 20:07:39 -07:00
George Hotz
4c9a930de2 rangeify attn tests (#12377) 2025-10-01 09:59:19 +08:00
qazal
109c63b904 update Tensor unit tests for RANGEIFY (#12359)
* update test_kernelize for RANGEIFY

* also kernelizes user contiguous

* skip that test

* tensor uop repr

* 4 kernels, still realizes a float
2025-09-30 11:17:21 +03:00
George Hotz
7129419500 fix cifar training in RANGEIFY (#12355)
* fix cifar training in RANGEIFY

* even more wino fuse

* bugfix

* test to show issue
2025-09-30 15:59:19 +08:00
chenyu
86c5c969ea linalg cosmetic change (#12356) 2025-09-30 03:00:59 -04:00
George Hotz
f522e83a02 fix rangeify elu fusion for openpilot (#12341)
* fix rangeify elu fusion for openpilot

* flip the metadata

* copy over permuted contiguous support

* this is correct

* update that
2025-09-30 11:41:52 +08:00
George Hotz
cdfa0f29fd add rendering to index (#12338) 2025-09-30 09:18:05 +08:00
chenyu
9d2f2b8e34 skip test_mean_half_precision_overflow (#12331)
it only works with SPLIT_REDUCEOP=1
2025-09-29 05:15:04 -04:00
chenyu
76c87d81b3 delete test_backward_sum_acc_dtype (#12330)
this test tests the wrong thing, it was only working because expand realize rule
2025-09-29 04:46:17 -04:00
chenyu
17cec8d645 RANGEIFY winograd test (#12297)
speed seems fine
2025-09-24 23:42:32 -04:00
Sieds Lykles
45c7252aed Better div nesting 2 (#11812)
* remove check

* use fold_divmod_congruence instead of simplify

* adjust tests

* shorten line

* new algo

* add test

* cleanup

* update tests

* ALLOWED_GATED_READ_IMAGE from 16 -> 12

* only remove the call to simplify

* add option to simplify with factor_remainder

* Allowed readimage gates back to 16
2025-09-24 04:50:26 +02:00
Sieds Lykles
6146c64d81 lower the invalid gate last (#12164)
* lowering invalid gate is part of lower_index_dtype

* update test

* remove import

* put that back

* reduce_collapse uses invalid

* fix that pattern to use invalid_pat

* valid creates the right dtype count

* seperate rule for lowering invalid gate

* dont unvectorize Invalid gate

* image_fixup uses Invalid

* update tests

* cleanup

* update split_load_store

* add .scalar() there
2025-09-24 04:27:35 +02:00
chenyu
b54cb272d0 move test_qcom to test/device (#12272) 2025-09-22 21:07:10 -04:00
qazal
4756971c88 skip test_bf16_disk_write_read on CL=1 (#12256) 2025-09-20 17:11:06 +03:00
Sieds Lykles
cc038b31b6 Shrink instead of reshape to unregister symbolic (#12241)
* Slice to unbind symbolic

* use vmax for now

* assert shape in reshape is valid

* update test_symbolic_ops to use shrink instead of reshape

* remove infer_with_bound_values for npw

* symbolic output doesnt have symbolic strides

* symbolic jit tests use shrink to unregister symbolic

* update test

* update more tests

* wrap vmax in int()

* only create a new st if the store is not an assigne

* unwrap st

* comments
2025-09-19 06:04:35 +02:00
Sieds Lykles
8d703a6369 z3 xor doesnt use bitcast (#12243) 2025-09-19 00:31:44 +02:00