Commit Graph

10633 Commits

Author SHA1 Message Date
qazal
bf31585444 check assign buffers in group [pr] (#7527) 2024-11-04 20:27:22 +08:00
qazal
9fe596ce6e early assert assign [pr] (#7526)
* early assert assign [pr]

* self

* don't need base
2024-11-04 20:04:39 +08:00
George Hotz
e2204378d9 more GroupOp [pr] (#7524) 2024-11-04 18:40:06 +08:00
George Hotz
c1585bcc9e flatten ops (#7523)
* flatten ops

* fix mypy
2024-11-04 18:07:23 +08:00
George Hotz
9c3ee64a3e hotfix: QoL assert if op is a str 2024-11-04 17:11:38 +08:00
George Hotz
0c19b6298b rename ops to have unique names (#7522) 2024-11-04 17:09:45 +08:00
George Hotz
9a7cc04843 fix viz [pr] (#7519)
* fix viz [pr]

* Update serve.py
2024-11-04 15:02:41 +08:00
George Hotz
6bb230287b pass the src into Metal [pr] (#7518)
* pass the src into Metal [pr]

* put that comment back

* keep old functionality

* move all to disassembler

* metal supports parallel beam

* touchups

* comment in correct place
2024-11-04 12:35:30 +08:00
George Hotz
bac251d2c1 idx_load_store in lowerer [pr] (#7477)
* idx_load_store in lowerer [pr]

* fix tests (#7513)

Co-authored-by: John Doe <null@mail.com>

* work

---------

Co-authored-by: Carl Basho <76494676+oldpondplop@users.noreply.github.com>
Co-authored-by: John Doe <null@mail.com>
2024-11-04 10:18:40 +08:00
chenyu
7758f7211b Revert "s/UPat/Pat (#7506)" [pr] (#7517)
* Revert "s/UPat/Pat (#7506)"

This reverts commit 400011a8c1.

* fix
2024-11-03 16:33:02 -05:00
chenyu
e641bbc859 safe softmax trick in MCTS ucb_explored_children (#7515)
* safe softmax trick in MCTS ucb_explored_children

fixed
```
  File "numpy/random/mtrand.pyx", line 971, in numpy.random.mtrand.RandomState.choice
ValueError: probabilities contain NaN
```
when all ucb_explored_children are large negative numbers, resulting in all-NaN probabilities

* better type
2024-11-03 15:59:31 -05:00
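A minimal sketch of the safe-softmax trick this commit refers to (illustrative only, not the actual MCTS code; the `ucb` values are made up): subtracting the max before exponentiating keeps the weights finite, so uniformly large negative scores no longer produce the all-NaN probability vector that `np.random.choice` rejects.

```python
import numpy as np

def safe_softmax(x: np.ndarray) -> np.ndarray:
  # naive softmax: np.exp(x) underflows to all zeros for large negative x,
  # so dividing by the sum gives 0/0 = NaN for every entry
  x = x - x.max()   # shift so the largest entry is 0; mathematically a no-op
  e = np.exp(x)
  return e / e.sum()

ucb = np.array([-1000.0, -1001.0, -1002.0])      # illustrative "big negative" scores
np.random.choice(len(ucb), p=safe_softmax(ucb))  # no "probabilities contain NaN"
```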
chenyu
3ef3b5b5f8 simpler ops_python CAST (#7514) 2024-11-03 14:40:43 -05:00
chenyu
df49439b9a remove reassoc from LLVM flags (#7512)
reassoc reorders compute and breaks transcendental
2024-11-03 13:11:56 -05:00
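A toy illustration (plain Python, not tinygrad internals) of why `reassoc` is unsafe: floating-point addition is not associative, and transcendental kernels depend on a specific summation order.

```python
# reassociation changes the result because FP addition is not associative
a, b, c = 1e16, -1e16, 1.0
print((a + b) + c)  # 1.0 -- the written order cancels the big terms first
print(a + (b + c))  # 0.0 -- the reordered sum loses the small term to rounding
```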
chenyu
2f70fb893e move transcendental fuzzer test to test_transcendental (#7511) 2024-11-03 12:36:50 -05:00
chenyu
84592225d8 tweak tqdm (#7510)
reduce parentheses and fuzz more tests now that there's no sleep
2024-11-03 12:07:11 -05:00
chenyu
c25a69b97e fix tqdm tests (#7509)
time.sleep masked two issues:
(1) iters_per_sec might have unitscale in it, and calling `float` on it fails
(2) the default rate is too low to ensure the output matches; it might skip updating
2024-11-03 10:53:22 -05:00
chenyu
4617c9a565 move COMMUTATIVE flipping to symbolic (#7507)
* move COMMUTATIVE flipping to symbolic

it cannot go with TRANSCENDENTAL

* skip LLVM
2024-11-03 09:03:45 -05:00
qazal
50ea2105e5 clean schedule ctx after fusing arange/conv_bw [pr] (#7508) 2024-11-03 21:55:14 +08:00
chenyu
400011a8c1 s/UPat/Pat (#7506) 2024-11-03 08:26:19 -05:00
qazal
37f8578953 s/BUFFER_UOPS/BUFOPS (#7501) 2024-11-03 10:17:33 +02:00
George Hotz
c8bf09b7d4 s/UOps/Ops (#7500)
* s/UOps/Ops [pr]

* fix
2024-11-03 11:26:10 +08:00
George Hotz
d078dcd0c8 TrackedPatternMatcher needs to loop [pr] (#7499) 2024-11-03 11:18:58 +08:00
George Hotz
6f93e91deb hotfix: lower mnist threshold for non determinism 2024-11-03 11:05:12 +08:00
George Hotz
06f476b371 late transcendental (#7498) 2024-11-03 10:53:58 +08:00
chenyu
91a3b27fa9 disable test_setitem_inplace_operator again (#7495)
it was flaky, not actually broken
2024-11-02 19:01:23 -04:00
chenyu
ba0c246cfd update test_setitem_overlapping_inplace1 (#7494)
failed on LLVM and remu, not real AMD
2024-11-02 18:40:53 -04:00
chenyu
f887de0fd6 update test_setitem (#7493)
some tests pass now
2024-11-02 17:53:04 -04:00
chenyu
49ae2df036 ConstLike type [pr] (#7492)
* ConstLike type [pr]

`ConstLike = ConstType|Variable|Tuple[ConstType, ...]`. Fixed the wrong `Tuple[ConstType]`.

* old pylint
2024-11-02 17:13:17 -04:00
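For context on the typing fix (a sketch with a simplified `ConstType`, not the real tinygrad definitions): `Tuple[ConstType]` is a 1-tuple, while `Tuple[ConstType, ...]` is a tuple of any length.

```python
from typing import Tuple, Union

ConstType = Union[int, float, bool]   # simplified stand-in

OneConst  = Tuple[ConstType]          # exactly one element, e.g. (1,)
AnyConsts = Tuple[ConstType, ...]     # any number of elements, e.g. (1, 2.0, True)

vec: AnyConsts = (1, 2.0, True)       # fine under the corrected annotation
# vec2: OneConst = (1, 2.0, True)     # mypy error: expected a tuple of length 1
```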
chenyu
dc9ffb41a8 cleanup multi [pr] (#7491) 2024-11-02 16:38:34 -04:00
chenyu
f8376b3766 all_reduce cosmetic change [pr] (#7490) 2024-11-02 15:40:55 -04:00
chenyu
baaec39ffc update get_transcendental_patterns [pr] (#7489)
I think this is better than `(p[0], cast(Callable, p[1]))`
2024-11-02 14:25:31 -04:00
chenyu
55bd136746 clean up reshape_and_permute (#7488)
will probably rewrite it later as reshape and permute functions on Kernel, but for now it's shorter with better types
2024-11-02 13:44:14 -04:00
chenyu
74c7b9d84a clean up Kernel.name (#7486)
* clean up Kernel.name

* narrow that str
2024-11-02 12:48:37 -04:00
geohotstan
b1866cbfd9 failure test case for pool ops (#7483)
* add failure test case

* minimum case
2024-11-02 12:13:38 -04:00
geohotstan
585f3a0f24 Add isinf and isnan ops to Tensor (#7484)
* move isinf and isnan to new branch

* sneak a roll documentation fix in

* add to docs

* update test coverage for detect_positive and detect_negative

* add types to isinf args
2024-11-02 12:12:52 -04:00
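A hedged sketch of how isinf/isnan are commonly expressed with elementwise comparisons (scalar Python for clarity; the `detect_positive`/`detect_negative` argument names mirror the commit message, and the real Tensor methods may differ):

```python
import math

def isnan(x: float) -> bool:
  return x != x  # NaN is the only value not equal to itself

def isinf(x: float, detect_positive: bool = True, detect_negative: bool = True) -> bool:
  return (detect_positive and x == math.inf) or (detect_negative and x == -math.inf)

assert isnan(float("nan")) and not isnan(1.0)
assert isinf(float("inf")) and isinf(float("-inf")) and not isinf(0.0)
assert not isinf(float("-inf"), detect_negative=False)
```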
George Hotz
72a9ac27e9 support image dtype in cloud [pr] (#7482)
* support image dtype in cloud [pr]

* remove outdated osx hack

* unused imports
2024-11-02 23:54:27 +08:00
qazal
24d7fde63d early skip const [pr] (#7480) 2024-11-02 13:18:45 +08:00
qazal
c56364fad0 realize before copy rule [pr] (#7476)
* realize before COPY and BUFFER_VIEW rule [pr]

* only upat.view

* move the assert to lazy
2024-11-02 13:07:27 +08:00
qazal
3819f5cf4d realize meta ops from graph_rewrite [pr] (#7474) 2024-11-02 01:48:57 +02:00
qazal
e149777b52 start realizing from big graph [pr] (#7473) 2024-11-02 00:27:38 +02:00
qazal
2a1aa55882 add realizes to context [pr] (#7470)
* add realizes set

* add from fuse
2024-11-02 00:00:30 +02:00
qazal
e3ea7cc4b4 prep refactor to UPatLoadStore [pr] (#7472)
* prep refactor to UPatLoadStore [pr]

* [pr]
2024-11-01 23:50:20 +02:00
qazal
6febd20fcf set forced_realize for outputs [pr] (#7469) 2024-11-01 20:03:12 +02:00
Tobias Fischer
7c9a1d69f9 sdxl gen fix (#7459) 2024-11-01 13:57:01 -04:00
ignaciosica
9c832483f2 update shifts spec (#7468)
* update shifts spec

* hotfix: old style
2024-11-01 12:40:41 -04:00
ignaciosica
18bd98c203 Add shl and shr to llvmir (#7449)
* add shl and shr to llvmir

* hotfix: enforce type alignment for shr and shl in all backends

* hotfix: change shl and shr spec

* hotfix: typo

* hotfix: refactor shl and shr rules and add casting to ptx shl

* hotfix: bug

* hotfix: ptx shl and shr require buint32

* hotfix: cleanups
2024-11-01 23:49:34 +08:00
chenyu
18e159c9ac comment about multi real and more tests [pr] (#7467) 2024-11-01 11:49:11 -04:00
chenyu
1f343aa40e replace x.alu(BinaryOps.ADD, y) with add in multi [pr] (#7466) 2024-11-01 10:50:57 -04:00
geohotstan
6513690223 Add Tensor.hardsigmoid (#7433)
* move hardsigmoid to new branch

* add to test

* add NOTE to mention differing values for alpha and beta that match torch

* shift from relu6

* correct shift implementation

* or we just use relu? no more 666
2024-11-01 08:36:52 -04:00
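A hedged scalar sketch of the formulation this commit discusses (illustrative, not the actual Tensor method): with the torch-matching defaults alpha = 1/6 and beta = 0.5, hardsigmoid(x) = clamp(alpha*x + beta, 0, 1), which equals relu6(x + 3) / 6; the relu-only form below is the "no more 666" variant.

```python
def relu(x: float) -> float:
  return max(x, 0.0)

def hardsigmoid(x: float, alpha: float = 1/6, beta: float = 0.5) -> float:
  # clamp(alpha*x + beta, 0, 1), written with relu only
  y = alpha * x + beta
  return relu(y) - relu(y - 1.0)

assert hardsigmoid(-4.0) == 0.0  # saturates low
assert hardsigmoid(0.0) == 0.5   # midpoint of the linear region
assert hardsigmoid(4.0) == 1.0   # saturates high
```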
George Hotz
fe78ed8cb7 improve match speed [pr] (#7465)
* improve match speed [pr]

* no sym in expand

* remove useless rule, sym back

* don't track that
2024-11-01 17:33:53 +08:00