George Hotz
d1223922b1
fixed and test is real
2025-12-04 16:52:11 -08:00
George Hotz
05c4b18f91
Merge branch 'master' into sched_cache
2025-12-04 16:46:23 -08:00
ayanhan
edf929ec9d
fix: add __delitem__ to Tensor with proper TypeError (#13561)
2025-12-04 00:53:08 -08:00
Christopher Milan
0a54434b15
mitigate ctypes c_bool bitfield bug (#13558)
* mitigate ctypes c_bool bitfield bug
* don't delete old test
2025-12-03 20:46:04 -05:00
George Hotz
bf5de6ba5f
delete abstractions2
2025-12-03 15:02:20 -08:00
George Hotz
723179dfd6
Merge branch 'master' into sched_cache
2025-12-03 13:43:58 -08:00
chenyu
22777a89ea
minor test_uop_symbolic updates (#13551)
2025-12-03 13:17:44 -05:00
chenyu
a205f98ef4
tighter bound for MOD (#13550)
2025-12-03 11:24:29 -05:00
nimlgen
549f3287a8
fix caching for fetch (#13544)
2025-12-03 14:34:14 +03:00
George Hotz
81bafb1af3
Merge branch 'master' into sched_cache
2025-12-02 19:59:48 -08:00
George Hotz
6bd355fa26
add needs_second_gpu decorator (#13543)
* add needs_second_gpu decorator
* more skips
* two more fixes
2025-12-02 19:08:23 -08:00
Roelof van Dijk
c158e3c988
add cifar gated uop_given_valid regression test (#13536)
2025-12-02 16:02:47 -05:00
George Hotz
7f7aa0a7f8
start work on schedule cache
2025-12-02 07:44:10 -08:00
nimlgen
77a76d1b13
device: respect compiler ContextVars (#13523)
* device: envvars for cc
* fix
* fix
* x
* um
* fix
* remote
* em
* cleanup
* typing
* fix
* debug
* lvp?
* ugh
* singl
* rm
* lol
* fix
* ?
* this?
* why?
* rev
* mod test
* l
2025-12-02 14:42:04 +03:00
George Hotz
c38b7684dc
improve microbenchmarks (#13492)
* improve microbenchmarks
* bugfix + ubench
* lil
* no src in const method
2025-11-29 10:15:22 -08:00
qazal
72ef533d9c
tracing: use u32 for buffer args encoding (#13472)
2025-11-28 00:19:51 +08:00
George Hotz
e4cd649ff0
remove kernelize to prepare for refactors (#13463)
* remove kernelize to prepare for refactors
* less kernelize
* last test
2025-11-26 14:18:50 -08:00
qazal
7238df7a94
viz: cleanup sort_fn (#13454)
2025-11-26 04:10:10 +08:00
wozeparrot
249553a119
tinyfs tweaks (#13444)
2025-11-24 18:07:32 -08:00
chenyu
cb29265f23
add test that shows the validhack regression with bad rewrite order (#13411)
2025-11-21 13:48:30 -05:00
chenyu
0251a8e628
parse_valid minor cleanup [pr] (#13385)
* stricter parse_valid [pr]
* not stricter
* no VCONST
* Revert "no VCONST"
This reverts commit 330dbdf4060562596febcbf970bda6051a35012f.
2025-11-20 13:15:06 -05:00
George Hotz
986d113024
symbolic fuzz failure (#13367)
* symbolic fuzz failure
* skip flaky test
2025-11-19 14:21:08 -08:00
George Hotz
05ccc69248
Revert "merge to fold_divmod_general [p] (#13359)"
This reverts commit 7711bbac7f.
2025-11-19 14:18:09 -08:00
George Hotz
7711bbac7f
merge to fold_divmod_general [p] (#13359)
* merge to fold_divmod_general [p]
* merge more
* merge more
* merge more
2025-11-19 11:37:45 -08:00
George Hotz
957cf717e7
Python speed (#13355)
* skip process replay by default
* work on python speed
* fix names of rewrite rules
* fix that test
2025-11-19 09:03:00 -08:00
Christopher Milan
a438c277de
autogen tests for 3.14 (#13343)
2025-11-18 22:16:59 -05:00
George Hotz
cabd4add48
more work parsing SQTT, separate VIZ/PROFILE (#13308)
* more work parsing SQTT
* more minimal runner
* sep VIZ/PROFILE
* parse print new
* improve parser
* more filter
* that
* split them
* lil cleanup
* skip flaky test
* AQL in mmapeak
2025-11-16 10:40:39 -08:00
nimlgen
c80d459d99
autogen: fix packed args structs (#13274)
* autogen: fix packed args structs
* and test this
2025-11-14 20:24:06 +08:00
George Hotz
bcdfc109b5
hotfix: disable flaky test
2025-11-13 06:19:28 -08:00
George Hotz
ab9fa964d8
DISABLE_COMPILER_CACHE -> CCACHE (#13234)
* DISABLE_COMPILER_CACHE -> CCACHE
* Fix cachekey assignment in Compiler constructor
2025-11-12 15:07:09 -08:00
qazal
7a6853fa40
viz: show python callstack in the first graph (#13218)
2025-11-12 20:52:28 +08:00
Christopher Milan
41a098a82d
In-tree autogen: libc.py (#13217)
* checkout changes from autogen branch
* parents
* pylint happy
* move sys to system in helpers.py
* typo
* typo
2025-11-11 19:13:48 -08:00
qazal
bc55bc4849
cleanup test_viz profiler tests (#13221)
2025-11-12 03:46:48 +08:00
nimlgen
b8e48effcb
device: no compilers message with reasons (#13146)
* device: no compilers message with reasons
* typings
* mypy
2025-11-07 23:01:45 +08:00
chenyu
bb8cf948f2
variation of (x%c)+(x//c)*c = x (#13135)
when x is in the form of y//b, the idiv term might have combined
2025-11-06 18:53:28 -05:00
George Hotz
bcfe42937f
move permute/flip/shrink to mixins (#13113)
* move permute to mixins
* move more stuff
* two more
* fix local mypy
* fix tests
* fix shrink
2025-11-05 14:14:15 -08:00
Sieds Lykles
3dc593c536
add strip_params to pyrender (#13021)
* add strip_params to pyrender
* update that one too
* strip_parens fix
* cleaner
* add test
* add some more tests
* cleaner strip_parens
2025-10-31 14:15:56 +01:00
Sieds Lykles
4c8362128b
New symbolic renderer + strip parens (#13017)
* new uop renderer
* better tester
* strip parens
* update tests
* split method check_uop_against_string
* use ctx.update instead of add_rendered method
* strip parens based on precedence
* update test
* new symbolic renderer
* add comment
2025-10-30 16:41:32 +01:00
George Hotz
2da02f1ae1
add loads at the end (#12988)
* add loads at the end
* simpler
* late load
* tests passing
* fix matvec
* spec test passes
* fix where on load
* fix abs2
* fix more tests
2025-10-30 10:42:19 +08:00
Sieds Lykles
70bce62c67
dont collapse possibly empty symbolic range (#12994)
* dont collapse a symbolic range based on min/max
* refactor z3 renderer
* include sink explicitely instead of dtypes.void
* use dtype.scalar()
2025-10-29 12:17:09 +01:00
Sieds Lykles
9f39f6391c
shared_codegen_spec and fix index spec (#12967)
* split shared_codegen_spec and fix index
* add VCONST to program_spec and move index to shared_codegen_spec
* working ignore_oob=0
* cleanup
* fix spec
* undo that
* move barrier and special earlier
* fix more spec issues
* more updates
* remove special from program_spec
* cleanup and fixes
* move more to shared
* special is not in shared_spec
* some comments
* dont do bounds check there
2025-10-29 09:14:11 +01:00
chenyu
e18922f111
limit AND const min max to ints [pr] (#12918)
2025-10-25 16:07:52 -04:00
George Hotz
b4f6a2c7a3
add kernel spec (#12911)
* add kernel spec
* fix kernel spec
2025-10-25 11:49:20 +08:00
George Hotz
8a941d95a4
SPEC=2 is full spec, SPEC=1 is default (#12910)
* SPEC=1 passes all tests
* just use SPEC, not __debug__
2025-10-25 11:10:43 +08:00
wozeparrot
9dac505565
variable bs keccak (#10731)
2025-10-23 14:10:21 -07:00
George Hotz
7762b3558b
clean up the spec (#12868)
* tighten up the spec
* move validate into a different file
* that moved to validate
* after(barr)
2025-10-22 19:50:42 +08:00
George Hotz
d711a4b933
delete old linearizer (#12834)
* new linearizer with early endrange
* cleanups
* second stage removal
* not store
* do that later
* end cleanup
* fix globals
* end
* multi end
* fix ends earlier
* work
* do_merge_ends
* mini change
* range_gate
* fix cpu
* test fixups
* ranges on index
* not for ptx
* delete linearizer
* remove more junk
* delete that test
* we insert endif
* all ends
2025-10-21 17:52:18 +08:00
George Hotz
c780cd9abb
new linearizer with early endrange (#12823)
* new linearizer with early endrange
* cleanups
* second stage removal
* not store
* do that later
* end cleanup
* fix globals
* end
* multi end
* fix ends earlier
* work
* do_merge_ends
* mini change
* range_gate
* fix cpu
* test fixups
* ranges on index
* not for ptx
2025-10-21 17:37:48 +08:00
qazal
32af1ff84b
viz graph drawing small cleanups (#12830)
* viz graph drawing small cleanups
* str literal
2025-10-21 15:51:32 +08:00
George Hotz
2e9082e0bc
after op (#12801)
* after op
* fix tests
2025-10-20 12:27:56 +08:00