George Hotz
aefabaf774
add AxisType to range ( #11798 )
...
* add AxisType to range
* missed them
* fix that test
* fix that test
2025-08-23 11:15:00 -07:00
qazal
b975830424
add profile loader helper in test_viz ( #11797 )
2025-08-23 19:20:29 +03:00
qazal
9ff03680ba
viz: store relative timestamps ( #11787 )
...
* viz: store relative timestamps
* err
* update test
2025-08-22 19:30:21 +03:00
qazal
2e0eb88549
viz: add metadata to UOp tracing ( #11772 )
...
* viz: add metadata to UOp tracing
* place after tag
* optional field
* err, refcount of root must be 0
2025-08-22 00:18:45 +03:00
chenyu
be7b0b6970
TRANSCENDENTAL_SUPPORTED_DTYPES->TRANSCENDENTAL_DTYPES ( #11752 )
2025-08-20 10:29:36 -04:00
ttomsa
70c3f1fb29
x.where(False, True) -> !x ( #11738 )
...
* add pat
* add test
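The rewrite in the title can be checked exhaustively over the two boolean values. A minimal sketch in plain Python, where `where` is a hypothetical stand-in for an elementwise select and is not tinygrad's API:

```python
def where(cond: bool, if_true: bool, if_false: bool) -> bool:
    # Hypothetical scalar model of x.where(a, b): pick a when x is true, else b.
    return if_true if cond else if_false

# x.where(False, True) agrees with logical NOT for every boolean input.
for x in (False, True):
    assert where(x, False, True) == (not x)
```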
2025-08-19 19:08:16 -04:00
George Hotz
1d307f568c
move device tests to test/device + test cleanups ( #11735 )
...
* move device tests to test/device
* test speedups
* test device
* linalg to unit
* upd
* so pytest just works
* more divide and skip
* speed
* test devectorize
* add pillow
2025-08-19 16:02:20 -07:00
George Hotz
4b3fcb4064
Revert "REDUCE_AXIS keepdim=False ( #11311 )" ( #11718 )
...
This reverts commit b518a7378a.
2025-08-18 13:28:53 -07:00
b1tg
b518a7378a
REDUCE_AXIS keepdim=False ( #11311 )
...
* progress
* fix tests
* fix tests
* remove hack for test_symfold
* fix test_conv.py on llvm
* hack test_cache_speed
* lint
* remove hack for helper_linearizer_opt
* tests
* fix DSP
* clean up
* remove hack for kernelize.py
* hack for test/test_multitensor.py TestMultiTensor.test_matmul_shard_none
* clean
* uop.r need reshape?
* lower_store cause fail
* fix lower?
* avoid contiguous hack
* 2134
* conv2d count
* remove unused
* hack lower
* reduced and clean up
* fix TestMultiTensor.test_matmul_shard_none
* src sync + fix TestMultiTensor.test_matmul_shard_none
* remove excluded in mop
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
2025-08-18 10:09:17 -07:00
chenyu
c30a113b2a
support bf16 and fp8 in Tensor.tolist ( #11704 )
...
memoryview does not support these dtypes, but casting works fine, so a cast is fine
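As an illustration of why a cast sidesteps the problem, here is a sketch in plain Python that reads raw bf16 bit patterns as 16-bit integers and widens them to float32. This is not tinygrad's implementation; `bf16_bits_to_float` is a hypothetical helper, and the only fact relied on is that bfloat16 is the upper 16 bits of an IEEE-754 float32.

```python
import struct

def bf16_bits_to_float(bits: int) -> float:
    # Hypothetical helper: place the 16 bf16 bits in the top half of a
    # 32-bit word and reinterpret that word as a float32.
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]

# Three bf16 values stored as native-order unsigned 16-bit integers.
raw = struct.pack("=3H", 0x3F80, 0x4000, 0xC040)

# memoryview can view the buffer as uint16 ("H"); each element is then widened.
values = [bf16_bits_to_float(b) for b in memoryview(raw).cast("H")]
```

With these inputs, `values` is `[1.0, 2.0, -3.0]`.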
2025-08-17 15:11:13 -04:00
qazal
d762edd694
viz: define tracks in python ( #11701 )
...
* viz: defines tracks in python
* update unittests
* figuring it out
* works
* diff cleanup
* math
* y axis is back
2025-08-17 18:19:13 +03:00
qazal
c8ba48b223
show rewrite errors in viz ( #11684 )
2025-08-15 19:09:47 +03:00
George Hotz
560984fd8d
small changes from rangeify ( #11682 )
...
* small changes from rangeify
* const like thing
* ksym
2025-08-15 08:45:52 -07:00
Sieds Lykles
06beeb6e13
Nest div even if factor is negative ( #11666 )
2025-08-14 13:58:59 +02:00
Sieds Lykles
661e9a2d5d
div_and_mod_folding refactor ( #11585 )
...
* divmod const folding is its own function
* split nested mod optimization out of div and mod folding
* make `fold_binary_numerator` its own function
* factor out `fold_divmod_congruence`
* check sign of numerator
* add tests
* assert int on vmin and vmax
* add type: ignore
* factor out more rules
* remove div_and_mod_folding
* cached_property to property
* remove import
* add returns
* restore old order
* check sign of x.vmin and newx.vmin
* check more signs
* add some test that would have caught bugs
* better test if the div simplified
* shorten line
* replace terms_factors_const with pop_const
* move that back
* minor cleanup
* remove comments
* some cleanup
2025-08-14 11:52:42 +02:00
chenyu
4fe19eec72
Ops.TRUNC ( #11659 )
2025-08-13 18:40:48 -04:00
George Hotz
22bdf48cdd
render ranges in viz, name gbufs with sizes. changes from rangeify ( #11656 )
...
* render ranges in viz, name gbufs with sizes. changes from rangeify
* fix unit test dtypes
2025-08-13 12:46:16 -07:00
George Hotz
d2521d828a
transcendental+idiv+threefry are uop decompositions ( #11636 )
...
* transcendental+idiv+threefry are uop decompositions [pr]
* threefry decomp
* fix randomness tests
* fix webgpu
* unneeded now
* fix
* move prematcher
* all cast should probably be cast_vec
2025-08-13 09:37:12 -07:00
Sieds Lykles
4c3982c44e
Take sign out of mod ( #11631 )
...
* Add rule and test
* fix tests
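The commit does not spell out the rule, so the following is only a sketch of one identity it could plausibly rely on, assuming C-style (truncating) division and remainder. `c_div` and `c_mod` are hypothetical helpers written for this check and are not tinygrad functions.

```python
def c_div(a: int, b: int) -> int:
    # Truncating integer division (rounds toward zero), unlike Python's floor //.
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def c_mod(a: int, b: int) -> int:
    # Remainder paired with truncating division: the result takes the sign of a.
    return a - b * c_div(a, b)

for x in range(-20, 21):
    for c in range(1, 8):
        # A sign on the numerator can be moved outside the mod.
        assert c_mod(-x, c) == -c_mod(x, c)
        # A sign on the divisor does not change the result.
        assert c_mod(x, -c) == c_mod(x, c)
```

Neither identity holds for Python's built-in floor-based `%`, which is why the helpers are defined explicitly.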
2025-08-12 18:44:36 +02:00
George Hotz
ca41b5e38b
skip_0 in graph rewrite [pr] ( #11627 )
...
* skip_0 in graph rewrite [pr]
* no track_rewrites on test
* use dict instead of set
2025-08-11 18:29:04 -07:00
George Hotz
996c907c0b
rewrite not ready + children machinery ( #11607 )
...
* rewrite not ready + children machinery
* it doesn't like track rewrites
2025-08-10 15:28:30 -07:00
qazal
960cc6533a
pass through name function args in track_rewrites ( #11572 )
2025-08-08 02:28:52 +03:00
George Hotz
82be8abfd2
move opt under codegen ( #11569 )
2025-08-07 14:19:17 -07:00
George Hotz
6ed2dfd187
delete the arange dim mismatch restriction ( #11568 )
...
* delete the arange dim mismatch restriction
* skip that test race
2025-08-07 13:46:17 -07:00
George Hotz
9764c6cdee
fix mismatch reduce, try 2 ( #11560 )
...
* fix mismatch reduce, try 2
* fix heuristic
* delete that test
* don't start allowing ones
2025-08-07 07:57:58 -07:00
George Hotz
a1aa5670aa
Revert "fix mismatch reduce ( #11547 )" ( #11549 )
...
This reverts commit 49d21a9055.
2025-08-06 22:43:15 -07:00
George Hotz
49d21a9055
fix mismatch reduce ( #11547 )
...
* fix mismatch reduce
* cleanups
* fix shape
* fix mypy
* resolve
2025-08-06 21:12:51 -07:00
George Hotz
21570545d3
move view pushing to codegen, try 2 ( #11534 )
...
* move view pushing to codegen, try 2
* fix up some linearizer tests
* fix test search
* fix test schedule
* delete that test
* fix test arange
* fix a few tests
* update tests
* push views
* ebs cleanup
* fix local/reg
* test and lint
* fix more tests
* test cleanups
* skipped that one
2025-08-06 15:58:38 -07:00
George Hotz
80d9cced07
more test cleanups ( #11544 )
...
* more test cleanups
* revert that
2025-08-06 15:05:21 -07:00
qazal
846a2826ab
viz: remove TracingKey.fmt ( #11482 )
...
* viz: remove TracingKey.fmt
* remove from test too
2025-08-05 00:00:03 +03:00
leopf
4f0ee4e982
BPE tokenizer ( #11415 )
...
* BPE works
* refactor tok
* oops
* basic tests
* fix eval
* smaller diff
* fix error
* proper vocab decoding
* use regex for splitting
* escape ucatrange
* full compat
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-08-04 09:52:38 -07:00
chenyu
e0106b6b25
1/(x*c) -> (1/c)*(1/x) ( #11491 )
...
example: 2*(2*a).reciprocal() -> a.reciprocal()
# TODO: bounds for reciprocal
# TODO: should z3 work?
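The identity and the example above can be checked exactly with rational arithmetic. A small sketch in plain Python using `fractions.Fraction`; `reciprocal` here is a local helper rather than the tensor method, and the check says nothing about floating-point rounding:

```python
from fractions import Fraction

def reciprocal(v: Fraction) -> Fraction:
    # Exact rational reciprocal; v must be nonzero.
    return 1 / v

c = Fraction(2)
for a in (Fraction(1), Fraction(3), Fraction(-5, 7)):
    # 1/(x*c) == (1/c)*(1/x)
    assert reciprocal(a * c) == reciprocal(c) * reciprocal(a)
    # 2*(2*a).reciprocal() == a.reciprocal()
    assert 2 * reciprocal(2 * a) == reciprocal(a)
```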
2025-08-03 23:35:46 -04:00
chenyu
66be747908
few more dtype cast convenience methods ( #11480 )
2025-08-02 15:47:09 -04:00
chenyu
e22e5da9a5
move some test_dtype tests to unit ( #11479 )
2025-08-02 15:25:00 -04:00
qazal
fa66d9772d
viz: show const node when it's root ( #11456 )
2025-08-01 01:01:58 +03:00
chenyu
d5fc6af4a2
remove unused ShapeTracker.consecutive [pr] ( #11426 )
2025-07-29 18:36:19 -04:00
chenyu
88c338bfcc
add kernelize to keccak for each data block ( #11370 )
...
* add kernelize to keccak for each data block
test_long works now. this prevents internal uops from growing proportionally to data length and eventually becoming too deep
* this?
* hash stuff
* gate test
* mv
2025-07-25 16:07:20 -04:00
chenyu
82e6de7fc6
more keccak reference tests ( #11329 )
2025-07-23 22:06:39 -04:00
George Hotz
e14b4fefa5
ranges on store ( #11334 )
...
* ranges on store
* fix store spec
* fix that
* fix gates
* fix tests
* fix ptx
2025-07-22 21:00:50 -07:00
chenyu
4535908679
update keccak test_long ( #11331 )
...
it should compare with arg "shake_128"
2025-07-22 16:08:01 -04:00
qazal
6668d6d241
fix word_wrap with newlines in input string [pr] ( #11319 )
2025-07-22 12:03:13 +03:00
George Hotz
842184a1ab
rename kernelize to schedule, try 2 ( #11305 )
2025-07-21 11:18:36 -07:00
wozeparrot
30ce16a424
feat: failing test for long keccak ( #11292 )
2025-07-21 12:49:23 -04:00
nimlgen
188ed38315
replace from_mv with lightweight mv_address ( #11280 )
2025-07-19 13:50:51 +03:00
quortus
924bc7c9ae
Fix test_uop_spec ( #11259 )
2025-07-16 11:02:31 +03:00
Alisher Zhubanyshev
4ef6b46b34
hcq: reduce launch overhead ( #11193 )
...
* nv: improve mmio creation speed
* add memoryview test
* fix indents
* move mv bench to `test_helpers`, remove comparison
2025-07-13 19:25:50 +03:00
chenyu
73caa5dd1b
remove Kernel.membufs [pr] ( #11200 )
2025-07-12 14:48:47 -04:00
qazal
d3ec63a5c3
viz: add base class for unittests ( #11178 )
2025-07-11 13:58:03 +03:00
chenyu
7db07e5f2c
don't narrow range of CAST on bool/unsigned ( #11156 )
2025-07-09 22:20:09 -04:00
George Hotz
53ae153404
tc should be in opt ( #11148 )
...
* tc should be in opt [pr]
* fix import
2025-07-09 14:12:21 -07:00