qazal
bf31585444
check assign buffers in group [pr] ( #7527 )
2024-11-04 20:27:22 +08:00
qazal
9fe596ce6e
early assert assign [pr] ( #7526 )
...
* early assert assign [pr]
* self
* don't need base
2024-11-04 20:04:39 +08:00
George Hotz
e2204378d9
more GroupOp [pr] ( #7524 )
2024-11-04 18:40:06 +08:00
George Hotz
c1585bcc9e
flatten ops ( #7523 )
...
* flatten ops
* fix mypy
2024-11-04 18:07:23 +08:00
George Hotz
9c3ee64a3e
hotfix: QoL assert if op is a str
2024-11-04 17:11:38 +08:00
George Hotz
0c19b6298b
rename ops to have unique names ( #7522 )
2024-11-04 17:09:45 +08:00
George Hotz
9a7cc04843
fix viz [pr] ( #7519 )
...
* fix viz [pr]
* Update serve.py
2024-11-04 15:02:41 +08:00
George Hotz
6bb230287b
pass the src into Metal [pr] ( #7518 )
...
* pass the src into Metal [pr]
* put that comment back
* keep old functionality
* move all to disassembler
* metal supports parallel beam
* touchups
* comment in correct place
2024-11-04 12:35:30 +08:00
George Hotz
bac251d2c1
idx_load_store in lowerer [pr] ( #7477 )
...
* idx_load_store in lowerer [pr]
* fix tests (#7513)
Co-authored-by: John Doe <null@mail.com>
* work
---------
Co-authored-by: Carl Basho <76494676+oldpondplop@users.noreply.github.com>
Co-authored-by: John Doe <null@mail.com>
2024-11-04 10:18:40 +08:00
chenyu
7758f7211b
Revert "s/UPat/Pat ( #7506 )" [pr] ( #7517 )
...
* Revert "s/UPat/Pat (#7506)"
This reverts commit 400011a8c1.
* fix
2024-11-03 16:33:02 -05:00
chenyu
e641bbc859
safe softmax trick in MCTS ucb_explored_children ( #7515 )
...
* safe softmax trick in MCTS ucb_explored_children
fixed
```
File "numpy/random/mtrand.pyx", line 971, in numpy.random.mtrand.RandomState.choice
ValueError: probabilities contain NaN
```
this happened when all ucb_explored_children were large negative numbers, which resulted in all-NaN probabilities
* better type
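The failure mode described above can be reproduced without the MCTS code. The following is a minimal sketch of the max-subtraction ("safe softmax") trick in plain Python; `safe_softmax` and the sample `scores` are illustrative names, not the code from this PR.

```python
import math

def safe_softmax(xs):
  # subtract the max so the largest exponent is exp(0) = 1
  m = max(xs)
  exps = [math.exp(x - m) for x in xs]
  total = sum(exps)  # at least 1.0, so the division below is always well defined
  return [e / total for e in exps]

scores = [-1000.0, -1001.0, -1002.0]  # hypothetical "big negative" values

# without the shift every exp underflows to 0.0, so normalizing would be 0/0
naive_total = sum(math.exp(s) for s in scores)

probs = safe_softmax(scores)
```

With the shift, the probabilities stay finite and sum to 1, so they should be acceptable as weights for a random choice.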
2024-11-03 15:59:31 -05:00
chenyu
3ef3b5b5f8
simpler ops_python CAST ( #7514 )
2024-11-03 14:40:43 -05:00
chenyu
df49439b9a
remove reassoc from LLVM flags ( #7512 )
...
reassoc reorders compute and breaks transcendental
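Why reordering matters: floating-point addition is not associative, so a flag that permits reassociation can change results. A minimal sketch in plain Python (the values are arbitrary and unrelated to the transcendental code itself):

```python
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # evaluation order as written
right = a + (b + c)  # an order a reassociating optimizer is allowed to pick

differs = left != right
```

The two results differ in the last bit. Approximations that depend on a specific sequence of operations to control rounding error are presumably sensitive to exactly this kind of change.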
2024-11-03 13:11:56 -05:00
chenyu
2f70fb893e
move transcendental fuzzer test to test_transcendental ( #7511 )
2024-11-03 12:36:50 -05:00
chenyu
84592225d8
tweak tqdm ( #7510 )
...
reduce parentheses and fuzz more tests now there's no sleep
2024-11-03 12:07:11 -05:00
chenyu
c25a69b97e
fix tqdm tests ( #7509 )
...
time.sleep masked two issues:
(1) iters_per_sec might have unitscale in it, and calling `float` on it fails
(2) default rate is too low to ensure the output matches, it might skip updating
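Issue (1) can be sketched as follows. The string `"1.50k"` and the helper `parse_rate` are hypothetical, chosen only to show why `float` rejects a unit-scaled value; they are not taken from the tqdm tests.

```python
def parse_rate(text):
  # hypothetical helper: accept an optional SI suffix before converting
  suffixes = {"k": 1e3, "M": 1e6, "G": 1e9}
  if text and text[-1] in suffixes:
    return float(text[:-1]) * suffixes[text[-1]]
  return float(text)

rate_text = "1.50k"  # assumed example of a unit-scaled rate

try:
  direct = float(rate_text)  # raises ValueError because of the "k" suffix
except ValueError:
  direct = None

parsed = parse_rate(rate_text)
```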
2024-11-03 10:53:22 -05:00
chenyu
4617c9a565
move COMMUTATIVE flipping to symbolic ( #7507 )
...
* move COMMUTATIVE flipping to symbolic
it cannot go with TRANSCENDENTAL
* skip LLVM
2024-11-03 09:03:45 -05:00
qazal
50ea2105e5
clean schedule ctx after fusing arange/conv_bw [pr] ( #7508 )
2024-11-03 21:55:14 +08:00
chenyu
400011a8c1
s/UPat/Pat ( #7506 )
2024-11-03 08:26:19 -05:00
qazal
37f8578953
s/BUFFER_UOPS/BUFOPS ( #7501 )
2024-11-03 10:17:33 +02:00
George Hotz
c8bf09b7d4
s/UOps/Ops ( #7500 )
...
* s/UOps/Ops [pr]
* fix
2024-11-03 11:26:10 +08:00
George Hotz
d078dcd0c8
TrackedPatternMatcher needs to loop [pr] ( #7499 )
2024-11-03 11:18:58 +08:00
George Hotz
6f93e91deb
hotfix: lower mnist threshold for non determinism
2024-11-03 11:05:12 +08:00
George Hotz
06f476b371
late transcendental ( #7498 )
2024-11-03 10:53:58 +08:00
chenyu
91a3b27fa9
disable test_setitem_inplace_operator again ( #7495 )
...
it was flaky, not broken broken
2024-11-02 19:01:23 -04:00
chenyu
ba0c246cfd
update test_setitem_overlapping_inplace1 ( #7494 )
...
failed on LLVM and remu, not real AMD
2024-11-02 18:40:53 -04:00
chenyu
f887de0fd6
update test_setitem ( #7493 )
...
some tests passed now
2024-11-02 17:53:04 -04:00
chenyu
49ae2df036
ConstLike type [pr] ( #7492 )
...
* ConstLike type [pr]
`ConstLike = ConstType|Variable|Tuple[ConstType, ...]`. fixed the wrong `Tuple[ConstType]`
* old pylint
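The difference between the two tuple annotations can be checked at runtime. This is a sketch only: `ConstType` and `Variable` below are stand-ins assumed for illustration, not the project's definitions, and `Union` is used in place of `|` so the snippet also runs on older Python versions.

```python
from typing import Tuple, Union, get_args

ConstType = Union[float, int, bool]  # assumed stand-in for the real ConstType

class Variable:  # placeholder for the real Variable class
  pass

OneTuple = Tuple[ConstType]       # a tuple of exactly one element
AnyTuple = Tuple[ConstType, ...]  # a tuple of any length

ConstLike = Union[ConstType, Variable, Tuple[ConstType, ...]]

one_args = get_args(OneTuple)
any_args = get_args(AnyTuple)
```

A type checker would reject `(1, 2, 3)` against `OneTuple` but accept it against `AnyTuple`, which is the mistake the commit message describes.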
2024-11-02 17:13:17 -04:00
chenyu
dc9ffb41a8
cleanup multi [pr] ( #7491 )
2024-11-02 16:38:34 -04:00
chenyu
f8376b3766
all_reduce cosmetic change [pr] ( #7490 )
2024-11-02 15:40:55 -04:00
chenyu
baaec39ffc
update get_transcendental_patterns [pr] ( #7489 )
...
I think this is better than `(p[0], cast(Callable, p[1]))`
2024-11-02 14:25:31 -04:00
chenyu
55bd136746
clean up reshape_and_permute ( #7488 )
...
probably will rewrite it later as reshape and permute function on Kernel, but for now it's shorter with better types
2024-11-02 13:44:14 -04:00
chenyu
74c7b9d84a
clean up Kernel.name ( #7486 )
...
* clean up Kernel.name
* narrow that str
2024-11-02 12:48:37 -04:00
geohotstan
b1866cbfd9
failure test case for pool ops ( #7483 )
...
* add failure test case
* minimum case
2024-11-02 12:13:38 -04:00
geohotstan
585f3a0f24
Add isinf and isnan ops to Tensor ( #7484 )
...
* move isinf and isnan to new branch
* sneak a roll documentation fix in
* add to docs
* update test coverage for detect_positive and detect_negative
* add types to isinf args
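For reference, the semantics of the two checks can be sketched on scalars in plain Python. This is not the Tensor implementation from the PR; it assumes `detect_positive` and `detect_negative` are boolean flags on `isinf`, as the test-coverage item above suggests.

```python
import math

def isnan(x: float) -> bool:
  # NaN is the only float that compares unequal to itself
  return x != x

def isinf(x: float, detect_positive: bool = True, detect_negative: bool = True) -> bool:
  # each flag enables matching against one signed infinity
  return (detect_positive and x == math.inf) or (detect_negative and x == -math.inf)
```

For example, `isinf(-math.inf, detect_positive=False)` is true, while `isinf(math.inf, detect_positive=False)` is false.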
2024-11-02 12:12:52 -04:00
George Hotz
72a9ac27e9
support image dtype in cloud [pr] ( #7482 )
...
* support image dtype in cloud [pr]
* remove outdated osx hack
* unused imports
2024-11-02 23:54:27 +08:00
qazal
24d7fde63d
early skip const [pr] ( #7480 )
2024-11-02 13:18:45 +08:00
qazal
c56364fad0
realize before copy rule [pr] ( #7476 )
...
* realize before COPY and BUFFER_VIEW rule [pr]
* only upat.view
* move the assert to lazy
2024-11-02 13:07:27 +08:00
qazal
3819f5cf4d
realize meta ops from graph_rewrite [pr] ( #7474 )
2024-11-02 01:48:57 +02:00
qazal
e149777b52
start realizing from big graph [pr] ( #7473 )
2024-11-02 00:27:38 +02:00
qazal
2a1aa55882
add realizes to context [pr] ( #7470 )
...
* add realizes set
* add from fuse
2024-11-02 00:00:30 +02:00
qazal
e3ea7cc4b4
prep refactor to UPatLoadStore [pr] ( #7472 )
...
* prep refactor to UPatLoadStore [pr]
* [pr]
2024-11-01 23:50:20 +02:00
qazal
6febd20fcf
set forced_realize for outputs [pr] ( #7469 )
2024-11-01 20:03:12 +02:00
Tobias Fischer
7c9a1d69f9
sdxl gen fix ( #7459 )
2024-11-01 13:57:01 -04:00
ignaciosica
9c832483f2
update shifts spec ( #7468 )
...
* update shifts spec
* hotfix: old style
2024-11-01 12:40:41 -04:00
ignaciosica
18bd98c203
Add shl and shr to llvmir ( #7449 )
...
* add shl and shr to llvmir
* hotfix: enforce type alignment for shr and shl in all backends
* hotfix: change shl and shr spec
* hotfix: typo
* hotfix: refactor shl and shr rules and add casting to ptx shl
* hotfix: bug
* hotfix: ptx shl and shr require buint32
* hotfix: cleanups
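As a reminder of what the two ops compute, here is a minimal sketch on Python integers. The names `shl` and `shr` mirror the commit title, but the functions are illustrative only and ignore the fixed-width dtype and type-alignment rules the bullets above refer to.

```python
def shl(x: int, n: int) -> int:
  # shift left: same as multiplying by 2**n
  return x << n

def shr(x: int, n: int) -> int:
  # shift right: same as floor division by 2**n
  return x >> n

x, n = 37, 3
```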
2024-11-01 23:49:34 +08:00
chenyu
18e159c9ac
comment about multi real and more tests [pr] ( #7467 )
2024-11-01 11:49:11 -04:00
chenyu
1f343aa40e
replace x.alu(BinaryOps.ADD, y) with add in multi [pr] ( #7466 )
2024-11-01 10:50:57 -04:00
geohotstan
6513690223
Add Tensor.hardsigmoid ( #7433 )
...
* move hardsigmoid to new branch
* add to test
* add NOTE to mention differing values for alpha and beta that match torch
* shift from relu6
* correct shift implementation
* or we just use relu? no more 666
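The relu-based formulation mentioned in the last bullet can be sketched on scalars. This assumes hardsigmoid is `clamp(alpha*x + beta, 0, 1)` with `alpha = 1/6` and `beta = 0.5`, which matches torch's `relu6(x + 3) / 6`; it is not the Tensor code from the PR.

```python
def relu(x: float) -> float:
  return max(0.0, x)

def hardsigmoid(x: float, alpha: float = 1 / 6, beta: float = 0.5) -> float:
  # relu(y) - relu(y - 1) clamps y to [0, 1] using only relu
  y = alpha * x + beta
  return relu(y) - relu(y - 1.0)
```

With these defaults the function is 0 for `x <= -3`, 1 for `x >= 3`, and linear in between.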
2024-11-01 08:36:52 -04:00
George Hotz
fe78ed8cb7
improve match speed [pr] ( #7465 )
...
* improve match speed [pr]
* no sym in expand
* remove useless rule, sym back
* don't track that
2024-11-01 17:33:53 +08:00