chenyu
f5a62a1b42
fix some tests related to JitItem (#2279)
2023-11-11 23:00:35 -05:00
George Hotz
2f7aab3d13
move optimize_local_size (#2221)
* move optimize_local_size
* interpret_ast
2023-11-05 21:00:52 -08:00
nimlgen
1c0449e190
add cache collector (#1595)
* init cache collector
* add test_cache_collector.py
* switch GlobalCounters.cache to CacheCollector
* init jit models test
* jitted SD
* add debug msg to print loaded bufs count
* moved cache collector to jit
* clearer SD
* no double device import
2023-08-28 19:59:55 -07:00
George Hotz
a6d842af7a
move device to ops (#1646)
* move device to ops
* mlops types
* 2 lines
2023-08-23 08:30:17 -07:00
chenyu
ae39cf84ab
Symbolic Shape JIT main PR (#1353)
* Symbolic Shape JIT
update tests
2 variables symbolic ops, adding more tests
test passing
cleanup
* more test cases
* single flag
* review update
* jit attention one piece
* realize
* symbolic_jit test for cuda
* old artifact
* works with cuda gpu but failed ci
* CUDACPU
2023-08-18 14:39:55 -07:00
George Hotz
20894991ed
good changes from the M1 Tensor Core project (#730)
* good changes
* working except llvm
* llvm types
* nice acc
* archprobe
* lang.float4
* use self.acc for late acc
* fix store bug
2023-03-29 05:11:02 +04:00