George Hotz
be1806df47
fast sym infer [pr] ( #7177 )
...
* fast sym infer [pr]
* fix pylint
2024-10-21 17:31:32 +08:00
George Hotz
4af228e9fc
hotfix: pin mypy
2024-10-21 16:22:24 +08:00
leopf
b6d9b276bb
GGUF support ( #7046 )
...
* basic loader, untested
* testing
* remove utils import in test
* q8_0
* q4_1
* end to end testing
* minor cleanup
* fix casting
* moved to state
* move tests
* move dequant to fn
* fix lint elif
* remove gguf from extra
* fix dict union
* q6_k simpler
* naming and spacing
* gpt2-gguf example
* cleanup
* move gguf example
* minor cleanup
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-10-21 16:15:34 +08:00
George Hotz
17e7d8f10e
hotfix: fix sz on windows
2024-10-21 16:02:23 +08:00
ignaciosica
87a1e76745
Refactor hip_bfloat16 cast into uop ( #7143 )
...
* refactor hip_bfloat16 cast into uops
* hotfix: linter issue
* hotfix: comment decorator in test
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-10-21 15:17:14 +08:00
qazal
8074c0ec8f
skip test_bfloat16_unary on AMD ( #7169 )
2024-10-21 01:00:47 +03:00
qazal
713461129b
scheduler ast rewrite reorders from big graph [pr] ( #7168 )
...
* scheduler ast rewrite reorders from big graph [pr]
* update test_uops.py
2024-10-21 00:47:58 +03:00
nimlgen
81349213c0
nv min regs count is 16 ( #7166 )
2024-10-20 20:03:55 +03:00
qazal
1383df95af
track_rewrites by function call [pr] ( #7165 )
...
* named track_rewrites [pr]
* group all of create_schedule_with_vars
2024-10-20 17:45:25 +03:00
chenyu
a9ab7db054
don't raise ValueError in uop_given_valid [pr] ( #7163 )
2024-10-19 20:05:04 -04:00
chenyu
98de58260b
simplify valid itself ( #7112 )
2024-10-19 19:39:25 -04:00
chenyu
f511ad9103
No pyint again ( #7156 )
...
* Revert "bring back pyint (#7150)"
This reverts commit 37e83ca6fc.
* remove truncate in const folding
* truncate_output=False
2024-10-19 13:48:59 -04:00
qazal
30989fb459
changes from the big graph branch [pr] ( #7160 )
...
* metaops srcs
* delete multioutput ctx var
* always has metadata
* shorter path for realized
* this still needs inputs
This reverts commit a59cbb2886.
2024-10-19 16:22:37 +03:00
chenyu
11beb67400
fix import of truncate ( #7157 )
...
truncate was moved to dtype
2024-10-18 18:41:41 -04:00
nimlgen
54c6a317f8
test_failure_54 ( #7155 )
...
* test_failure_54
* metal
2024-10-18 23:31:18 +03:00
nimlgen
99fb115791
cuda correct pointer type ( #7153 )
2024-10-18 22:39:59 +03:00
chenyu
37e83ca6fc
bring back pyint ( #7150 )
...
fixed test_failure_52 and resnet. need to understand this better
2024-10-18 14:54:37 -04:00
Jacky Lee
c8b59416d0
fix: find_library can be None ( #7145 )
2024-10-18 20:50:52 +03:00
George Hotz
b0a13896d7
PtrDType is dataclass [pr] ( #7125 )
...
* PtrDType is dataclass [pr]
* new dataset
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-10-18 09:40:33 -04:00
chenyu
ea016b55d1
don't throw in fuzz_linearizer ( #7148 )
...
already broken on master and needs fix. don't throw to not block other pr
2024-10-18 09:28:30 -04:00
chenyu
ea2efbf508
Add Opt(op=OptOps.LOCAL, axis=6, amt=2) to actions ( #7147 )
...
* Add Opt(op=OptOps.LOCAL, axis=6, amt=2) to actions
it's missing if we rebuild all kernels, not just the first 2k.
```
PYTHONPATH="." GPU=1 python3 extra/optimization/get_action_space.py
29%|█████████████████████████████████████▋ | 3682/12701 [01:42<04:11, 35.83it/s]Traceback (most recent call last):
  File "/Users/chenyu/code/tinygrad/extra/optimization/get_action_space.py", line 27, in <module>
    test_rebuild(lin)
  File "/Users/chenyu/code/tinygrad/extra/optimization/get_action_space.py", line 11, in test_rebuild
    assert o in actions, f"{o} is not in actions"
           ^^^^^^^^^^^^
AssertionError: Opt(op=OptOps.LOCAL, axis=6, amt=2) is not in actions
```
* break
2024-10-18 09:03:24 -04:00
qazal
4cf7cca91a
delete fuzz_schedule [pr] ( #7144 )
2024-10-18 15:09:39 +03:00
Bhavya Gada
b7b2017cb9
only ignore warnings not errors ( #7146 )
2024-10-18 07:41:11 -04:00
ignaciosica
8bcdd7c97d
Refactor AMD pm rules to remove handwritten bf16 bool alus ( #7136 )
...
* refactor pm rules
- remove unused handwritten methods
- refactor amd pm rules to fix bug with bool alu
* add bf16 bool alu tests
* add bf16 tests
* hotfix: make atol consistent
2024-10-18 09:00:46 +08:00
Bhavya Gada
534597e753
fix all test warnings ( #7024 )
...
* fix pytorch warning in nn.conv2d for same padding
* fix future warning in torch load
* fix overflow warning in tensor list test: https://github.com/numpy/numpy/issues/23606#issuecomment-1512752172
* fix floating point warnings in dtype tests using docs https://numpy.org/doc/stable/reference/generated/numpy.errstate.html and a neat solution https://stackoverflow.com/questions/53634965/change-np-seterr-behavior-inside-a-function-only
* put err state in one place; comment taken care of by function hover
* enter np errstate context manager on test setup
* put decorator on class
2024-10-18 08:56:40 +08:00
chenyu
0cd4b93441
remove CStyleLanguage from test_uop_symbolic ( #7142 )
2024-10-17 19:39:34 -04:00
chenyu
72ed66205d
enable test_resnet_half ( #7141 )
...
already worked so just fixed the test
2024-10-17 19:02:20 -04:00
nimlgen
211d9753f8
nv more lc checks ( #7139 )
...
* nv more lc checks
* revert
* linter
2024-10-18 00:21:53 +03:00
chenyu
12ff52b88b
test_failure_52 fails on real METAL ( #7138 )
2024-10-17 15:37:28 -04:00
chenyu
84e98900e8
test linearizer failure 53 ( #7137 )
...
variable scope issue caused compile error
2024-10-17 15:23:43 -04:00
qazal
a64e5d0430
graph rewrite all metaops ( #7134 )
2024-10-17 18:49:20 +03:00
nimlgen
45db7d9045
fuzz qcom vs opencl ( #7130 )
...
* fuzz qcom vs opencl
* fix nv
* better?
* typo
* open both devs
2024-10-17 18:49:08 +03:00
qazal
188eef959d
early rewrite UOps.CONTIGUOUS ( #7132 )
...
* early rewrite UOps.CONTIGUOUS
* add metaops too
* just the contig diff
2024-10-17 18:35:19 +03:00
chenyu
287a198c4f
increase test_strongly_connected_DAG threshold ( #7131 )
...
flaky
2024-10-17 11:08:50 -04:00
George Hotz
c23ef7e2f8
real_remove_const ( #7128 )
2024-10-17 21:58:41 +08:00
qazal
2087abc999
get membufs with dedup [pr] ( #7127 )
2024-10-17 16:06:06 +03:00
George Hotz
be9a433a60
fix a bug in flops counting + touchups [pr] ( #7126 )
2024-10-17 21:02:11 +08:00
qazal
a2eefa6f97
move assign st override to upat ( #7122 )
...
* move assign st override to upat
* merge view
2024-10-17 13:33:37 +03:00
George Hotz
ded1b38b84
minor dtype cleanup [pr] ( #7124 )
...
* minor dtype cleanup [pr]
* use ptr() function
2024-10-17 17:41:23 +08:00
George Hotz
0b2621f63f
improve render_dtype [pr] ( #7117 )
...
* improve render_dtype [pr]
* don't deref in index
2024-10-17 14:50:40 +08:00
George Hotz
ca0dca35f7
move ptx renderer [pr] ( #7118 )
2024-10-17 14:50:32 +08:00
George Hotz
d990a16326
fix tests to use render ( #7116 )
2024-10-17 14:35:22 +08:00
George Hotz
9f4ca88218
hotfix: relax target pct for beautiful_mnist
2024-10-17 12:36:07 +08:00
chenyu
51cd0e7c0d
idx_given_valid -> uop_given_valid [pr] ( #7110 )
...
will reuse this to simplify valid independent of idx
2024-10-16 18:16:36 -04:00
chenyu
842fe444df
test case for valid only simplification ( #7108 )
2024-10-16 16:40:46 -04:00
chenyu
9d109c5382
remove outdated symbolic comments ( #7105 )
2024-10-16 14:51:59 -04:00
Francis Lata
90eff347e2
tinytqdm write support ( #6359 )
...
* add write support
* add test
* update test case to compare write outputs
* assert final write output
* flush when using write
* update write logic
* Revert "update write logic"
This reverts commit 5e0e611b46.
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-10-16 14:51:41 -04:00
nimlgen
d1094fce5e
amd reports on hang ( #7101 )
2024-10-16 21:32:44 +03:00
nimlgen
39ab67e9ef
beam capture and replay in fuzz ( #7099 )
...
* beam capture and replay in fuzz
* clean a bit
2024-10-16 20:26:58 +03:00
George Hotz
eac58eaaba
no SIGALRM on windows [pr] ( #7104 )
2024-10-17 00:21:04 +08:00