qazal
65bbafe3e2
bfs refactors from the big graph branch [pr] ( #7235 )
2024-10-23 23:24:31 +03:00
nimlgen
ea11382087
nv fix shared_memory_size ( #7239 )
2024-10-23 21:59:47 +03:00
qazal
ca7b2658b9
start with a fresh ScheduleItemContext in process_replay [pr] ( #7236 )
2024-10-23 18:01:50 +03:00
qazal
ca6c58527b
dfs append_bufs ( #7224 )
...
* dfs append_bufs
* fix test_linearizer
2024-10-23 17:14:51 +03:00
qazal
aeeb917b6e
mask out writable bufs in runtime access_resources ( #7234 )
2024-10-23 16:13:50 +03:00
qazal
d2b608233a
get outbufs by globals idxs [pr] ( #7233 )
2024-10-23 16:06:35 +03:00
qazal
9a2718b30b
proposal: add UOps.PRELOAD ( #7220 )
2024-10-23 10:23:52 +03:00
qazal
3ce1c69c9c
split to get_realizes [pr] ( #7225 )
2024-10-23 10:22:36 +03:00
chenyu
f890d1cbbd
remove PUSH_PERMUTES from external_test_opt ( #7232 )
...
remove old comments and update kernel count for test_convnext
2024-10-23 00:11:34 -04:00
chenyu
24e2442a89
minor tweak to real_strides [pr] ( #7230 )
...
only graph_rewrite once on idx (should be idempotent), and always rewrite valid. will co-rewrite idx and valid next
2024-10-22 22:05:57 -04:00
chenyu
169cc348fe
move valid related functions to ops.py [pr] ( #7229 )
2024-10-22 21:10:12 -04:00
chenyu
e90bbe6bbc
failed test cases for 3+ views shapetracker strides ( #7226 )
2024-10-22 18:49:13 -04:00
qazal
dae908299e
full_ast_rewrite api with ScheduleItemContext ( #7223 )
2024-10-22 23:17:05 +03:00
qazal
7e36e1d2bb
LAZYCACHE to context var [pr] ( #7222 )
2024-10-22 20:36:06 +03:00
qazal
2083ac0b4c
generic small graph sink -> ScheduleItem pattern matcher [pr] ( #7221 )
2024-10-22 20:20:26 +03:00
qazal
4916095124
compute ScheduleItem writable bufs [pr] ( #7214 )
...
* compute ScheduleItem writable bufs [pr]
* don't cache Buffer
2024-10-22 19:02:29 +03:00
qazal
24ed2ed6c8
refactor to ScheduleItemContext [pr] ( #7217 )
2024-10-22 17:58:06 +03:00
chenyu
7ce12a4b06
fix typing in simplify_valid [pr] ( #7216 )
2024-10-22 10:01:33 -04:00
nimlgen
cef7078c14
nv limit mappings debug ( #7215 )
2024-10-22 16:41:43 +03:00
George Hotz
4013c9848c
don't use tons of memory for tests non CI [pr] ( #7209 )
...
* don't use tons of memory for tests
* fix import and clean up pre-commit
* use pathlib
* no shm on windows
* Revert "use pathlib"
This reverts commit 7c38489820.
* run pre-commit hooks in test
* ugh, fix later
2024-10-22 15:04:51 +08:00
George Hotz
4438d6a467
Tensor.from_url API [pr] ( #7210 )
...
* Tensor.fetch API [pr]
* update docs
* from_url
2024-10-22 14:54:17 +08:00
George Hotz
be64ac417e
move GGUF test to its own file [pr] ( #7208 )
...
* move GGUF test to its own file [pr]
* skip tests if modules aren't installed
2024-10-22 13:24:55 +08:00
George Hotz
ccf4843945
use substitute instead of replace_uop [pr] ( #7207 )
2024-10-22 13:24:38 +08:00
George Hotz
3b4587fbf9
no need to DEFINE_VAR arg sort [pr] ( #7206 )
2024-10-22 12:17:50 +08:00
nimlgen
21acfc39d4
qcom cleanup allocs ( #7200 )
...
* qcom cleanup allocs
* oops
2024-10-21 23:20:15 +03:00
chenyu
f37e6b453b
load_gguf -> gguf_load in doc and test ( #7199 )
2024-10-21 14:03:33 -04:00
chenyu
f93bd9e2b9
ggml_data_to_tensor touchups ( #7196 )
...
* ggml_data_to_tensor touchups
tiny reordering and variable name changes
* return type
* pylint
2024-10-21 13:29:59 -04:00
leopf
815e1a340c
GGUF Cleanup - raise if type is not supported ( #7194 )
...
* raise if ggml type is unsupported
* test raise
2024-10-21 11:32:11 -04:00
qazal
bc9eb324dc
group stores by buffer uops [pr] ( #7190 )
...
* group stores by buffer uops [pr]
* dedup
2024-10-21 18:04:44 +03:00
leopf
87877d7a91
GGUF cleanup ( #7192 )
...
* cleanup
* remove vocab size hard code
2024-10-21 10:44:54 -04:00
chenyu
08a3b97ddc
more generic lt_folding ( #7171 )
...
* more generic lt_folding
instead of checking gcd for all uop, check the gcd of the ones that have const_factor() > 1 and still can simplify if others are smallish
* fixed that stride too
2024-10-21 09:41:02 -04:00
chenyu
abd99bb744
unwrap2 is not used ( #7187 )
2024-10-21 09:40:15 -04:00
qazal
37b829ef0d
track metadata with uops [pr] ( #7188 )
2024-10-21 16:35:46 +03:00
ignaciosica
5551cf6689
add rlshift and rrshift special methods ( #7185 )
2024-10-21 08:37:02 -04:00
qazal
8f375b71c5
post-schedule lazybuf from Buffer [pr] ( #7170 )
2024-10-21 15:11:32 +03:00
qazal
7a9f3dea54
assert a schedule double realize ( #7178 )
...
* assert this
* maybe use lazycache
* Revert "maybe use lazycache"
This reverts commit 7368102906.
* set enable_cache=True
* assert 1 schedule
2024-10-21 14:16:21 +03:00
George Hotz
31fcccc779
hotfix: flip if order
2024-10-21 17:34:23 +08:00
qazal
6c0c3aff14
keep srcs in all ops ( #7175 )
2024-10-21 12:34:02 +03:00
George Hotz
be1806df47
fast sym infer [pr] ( #7177 )
...
* fast sym infer [pr]
* fix pylint
2024-10-21 17:31:32 +08:00
George Hotz
4af228e9fc
hotfix: pin mypy
2024-10-21 16:22:24 +08:00
leopf
b6d9b276bb
GGUF support ( #7046 )
...
* basic loader, untested
* testing
* remove utils import in test
* q8_0
* q4_1
* end to end testing
* minor cleanup
* fix casting
* moved to state
* move tests
* move dequant to fn
* fix lint elif
* remove gguf from extra
* fix dict union
* q6_k simpler
* naming and spacing
* gpt2-gguf example
* cleanup
* move gguf example
* minor cleanup
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-10-21 16:15:34 +08:00
George Hotz
17e7d8f10e
hotfix: fix sz on windows
2024-10-21 16:02:23 +08:00
ignaciosica
87a1e76745
Refactor hip_bfloat16 cast into uop ( #7143 )
...
* refactor hip_bfloat16 cast into uops
* hotfix: linter issue
* hotfix: comment decorator in test
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-10-21 15:17:14 +08:00
qazal
8074c0ec8f
skip test_bfloat16_unary on AMD ( #7169 )
2024-10-21 01:00:47 +03:00
qazal
713461129b
scheduler ast rewrite reorders from big graph [pr] ( #7168 )
...
* scheduler ast rewrite reorders from big graph [pr]
* update test_uops.py
2024-10-21 00:47:58 +03:00
nimlgen
81349213c0
nv min regs count is 16 ( #7166 )
2024-10-20 20:03:55 +03:00
qazal
1383df95af
track_rewrites by function call [pr] ( #7165 )
...
* named track_rewrites [pr]
* group all of create_schedule_with_vars
2024-10-20 17:45:25 +03:00
chenyu
a9ab7db054
don't raise ValueError in uop_given_valid [pr] ( #7163 )
2024-10-19 20:05:04 -04:00
chenyu
98de58260b
simplify valid itself ( #7112 )
2024-10-19 19:39:25 -04:00
chenyu
f511ad9103
No pyint again ( #7156 )
...
* Revert "bring back pyint (#7150)"
This reverts commit 37e83ca6fc.
* remove truncate in const folding
* truncate_output=False
2024-10-19 13:48:59 -04:00