George Hotz
74ee9febec
remove iter from uopgraph ( #6110 )
* remove iter from uopgraph
* linearize returns uops
* fix tests
* linearize in linearize
* tests fix
* touchup
* test failures
2024-08-16 15:58:29 -07:00
qazal
28c75bf2a6
merge uops with ops ( #6111 )
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-08-16 18:17:57 -04:00
qazal
c23d44c779
AST is UOp ( #6030 )
* most of the work from the uops2 branch
* schedule
* realize
* kernel
* lowerer
* search
* green
* merge uops with ops
* Revert "merge uops with ops"
This reverts commit 1408a59f12.
* fix benchmark
* remove extra dedup
2024-08-16 22:09:00 +03:00
kormann
2c4add6844
pretty print lazy op per default ( #5505 )
* pretty lop
* min diff
* walrus
* fix
* min diff
* simplify
* pretty helper function
* ws
* pretty uop upat
* tests
* stricter tests
* test passes
* ws
* stronger upat test
* delete print_tree
* min diff
* stricter exp test
* fix merge
* stronger uops eval test
* +readable and deep upat test
* +readable and deep upat test
* sort inv fix
* fix
* revert allowed_len
2024-07-18 09:34:08 -07:00
Francis Lam
2d53abb04a
test/external/fuzz_linearizer: fix for new AST changes ( #5519 )
* test/external/fuzz_linearizer: fix for new AST changes
also add beautiful_mnist failures
* add CLANG and LLVM to test_failure_35 failed_platforms
* fix test_linearizer_failure names
2024-07-17 00:08:07 -04:00
chenyu
28972418c4
s/get_linearizer/get_kernel [run_process_replay] ( #5467 )
2024-07-13 20:32:22 -04:00
George Hotz
03c2dc8bd7
lowerer is kernel [run_process_replay] ( #5437 )
2024-07-12 18:50:55 -07:00
George Hotz
870dc8c350
s/Linearizer/Lowerer [run_process_replay] ( #5428 )
2024-07-12 15:54:07 -07:00
George Hotz
94599c0637
fixup ast in kernel to be MetaOps.SINK [run_process_replay] ( #5424 )
* fixup ast in kernel to be MetaOps.SINK [run_process_replay]
* fix tests
* fix more tests
2024-07-12 14:01:03 -07:00
George Hotz
6f6b3b10c9
import from uops, not linearizer ( #5064 )
2024-06-20 08:08:44 -07:00
kormann
7c3b877216
rename uop [run_process_replay] ( #5031 )
* rename
* fix unittests
* rename vin
* fix test
* fix type [run_process_replay]
* rm pre commit hook change
2024-06-18 21:34:05 +03:00
chenyu
67e8df4969
remove numpy from dtype ( #4969 )
replaced all dtype.np with _to_np_dtype defined in tensor.py.
after this, the only numpy usages are (1) Tensor(np.ndarray), (2) construct .numpy() output, (3) numpy random buffer
2024-06-14 15:38:45 -04:00
chenyu
fdbb4305cb
skip unsupported dtype in fuzz_linearizer ( #4917 )
resolve issues in #4887. dataset generated from ubuntu but metal does not support double
2024-06-11 18:18:21 -04:00
George Hotz
ff64bcab69
move graph/search to engine ( #4596 )
2024-05-14 23:12:59 -07:00
George Hotz
2f970a4fc2
all realize 2 ( #4527 )
* all realize 2
* tests fixup
* fix more tests
* fix openpilot
* fix tests
* unneeded
2024-05-10 22:43:09 -07:00
George Hotz
1e843d495e
cleaning up search with Program ( #4500 )
* cleaning up search
* fix tests
* test fix
* minor compiler cleanup
2024-05-09 19:01:53 -07:00
Francis Lam
7da1b41f38
fuzz_linearizer: add FUZZ_REQUIRE_TC option to require TC in opts ( #4468 )
useful for checking late opts after TC such as GROUP, etc.
2024-05-07 17:14:21 -04:00
Francis Lam
18c61ce077
test/fuzz_linearizer: add --atol/rtol and change half distribution ( #4352 )
2024-04-29 15:53:59 -04:00
George Hotz
b9570d6100
clean up update stats ( #4226 )
* WIP: clean up update stats
* line savings now
* fix graphs
* fix tests
* tighter prints
* remove extra jit=false
* debug=2 means wait
* that won't update stats
* still wait
2024-04-19 15:41:30 +04:00
chenyu
d9ff636cf5
use is to compare with enum ( #3993 )
* use is to compare with enum
currently it's mixed between `==` and `is`, moved all to `is`
* more
2024-03-29 13:02:56 -04:00
George Hotz
42b9d999ea
Buffer isn't always allocated ( #3974 )
* buffer alloc
* allocate
* missing allocates
* last one
2024-03-28 13:33:47 -07:00
Francis Lam
5530b0cbed
fuzz_linearizer: reduce debug verbosity and make easier for CI usage ( #3942 )
* fuzz_linearizer: reduce debug verbosity and make easier for CI usage
* rename FUZZ_BEAM to FUZZ_ALL_ACTIONS (not choosing a subset)
* skip simple ASTs (easier to use with LOGOPS output)
* don't fuzz a previously seen AST
* add options to allow non-zero --expected-failures
* clean up naming and use set
2024-03-26 16:25:24 -04:00
chenyu
a2b2597fc2
replace dtype.name str with render_dtype ( #3903 )
fixed some bf16 cast issue since it does not have `.name`.
also more robust if there are lang specific type override
2024-03-23 19:25:48 -04:00
Francis Lam
5587594a00
fuzz_linearizer: add --ast and --file params to read kernels ( #3877 )
also fix up ast_str_to_str to support the new tuple of LazyOps
2024-03-22 14:27:40 -04:00
Francis Lam
3c0478bfab
fuzz_linearizer: add additional DEBUG info for comparison errors ( #3866 )
2024-03-21 18:58:10 -04:00
chenyu
e50b7abe4f
diversed buf inputs based on dtype in fuzz_linearizer ( #3863 )
2024-03-21 16:23:11 -04:00
chenyu
30fa03243e
reuse fuzz_linearizer.compare_linearizer in test_linearizer_failures ( #3861 )
2024-03-21 14:12:27 -04:00
chenyu
6bf0b82267
alloc new output in fuzz_linearizer between baseline and real one ( #3859 )
if the kernel is an assign `a += 1`, the rawbufs[0] is updated twice and gives false compare_error
2024-03-21 11:36:05 -04:00
Francis Lam
6d5dec2fef
log optimized kernels and a script to compare with non-optimized ones ( #3829 )
* search: add BEAM_VERIFY option to validate search results
refactor fuzz_linearizer comparison to allow it to be used for
BEAM_VERIFY in device.py
* search: fix to verify the beam_search result and not the fastest
* search: fix typing and clean up
* device: remove imports from test and add LOGKERN options
LOGKERN output can be used with test/external/verify_kernel.py
to validate correctness
* fix example in verify_kernel.py
* cleanup fixes
* fix to use f-strings
2024-03-20 19:22:08 -04:00
qazal
e3e89c244b
multioutput uoping infra ( #3706 )
* linearize multioutput
* add vars to copy
2024-03-15 21:56:59 -07:00
nimlgen
08064a0e29
add SEED env to fuzz_linearizer ( #3713 )
* add SEED env to test/external/fuzz_linearizer.py
* found some
* more platforms
2024-03-13 18:08:42 +03:00
chenyu
e25879d50e
don't get new var_val for the same ast in fuzz_linearizer ( #3657 )
fixed result comparison for kernels with variables
2024-03-08 09:49:24 -05:00
chenyu
1130c73844
add FUZZ_NTH to fuzz_linearizer ( #3656 )
* add FUZZ_NTH to fuzz_linearizer
also update tests in test_linearizer_failures to not just run on METAL
* update failures for HIP/HSA
* test_failure_21 LLVM PADTO
2024-03-08 09:16:49 -05:00
chenyu
57df8e8d82
update fuzz_linearizer ( #3648 )
included non-reduce kernel and kernel with variables. green msg when everything passed
it's possible that creating rawbufs failed due to memory error, included that in failure cases
2024-03-07 18:41:22 -05:00
Francis Lam
162dfb07d9
fuzz_linearizer: fix uops and add to test.yml ( #3588 )
2024-03-02 15:03:42 -08:00
Francis Lam
11da65bccd
test/external/fuzz_linearizer: add a FUZZ_MAX_SIZE option ( #3455 )
* test/external/fuzz_linearizer: add a FUZZ_MAX_SIZE option
this allows us to limit the size of the kernel and reduce running
times by avoiding ones that take a long time
* fix spacing and re-order to put parameters together
2024-02-27 07:34:59 -05:00
xarkes
28a8b72024
Remove Interpreted device & remaining CPU/TORCH ref ( #3423 )
* Remove Interpreted device & remaining CPU/TORCH ref
* Oops
* supports_device was useful
* Fix doc wording
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-02-16 00:30:21 -05:00
George Hotz
41efaa848c
move graph.py and jit.py into features ( #3376 )
* move graph.py into features
* move jit into features
* fix quickstart
2024-02-12 17:34:34 +01:00
Francis Lam
2266152b28
linearizer: added FUZZ_BEAM to fuzz_linearizer and additional tests ( #3340 )
Fixed test_tensor_core_opts to test all the TCs.
Added commented out failing tests in test_color_shapes_with_local.
2024-02-08 16:12:58 +01:00
nimlgen
f87ecbb0f3
fuzzer validates outputs + (partially) oob accesses ( #3178 )
* fuzzer validates outputs + (partially) oob accesses
* +random
* oob check only for compiled
* type cmp fixes
* fix zeroing
* no prints
* add seed
2024-01-19 13:34:51 -05:00
chenyu
1b508e0f71
fix fuzz_linearizer toCPU to as_buffer ( #3158 )
2024-01-17 13:18:46 -05:00
chenyu
58d3d5030b
vars_from_ast -> LazyOp.vars ( #2965 )
2024-01-01 18:12:38 -05:00
George Hotz
56f44bd10e
move the compiler cache to be global ( #2957 )
* move the compiler cache to be global
* remove non robust test
* remove dead code
2024-01-01 10:59:56 -08:00
George Hotz
00d9eda961
FROM -> COPY, move vars_from_ast ( #2675 )
2023-12-07 16:32:30 -08:00
chenyu
51af99367f
fix fuzz_linearizer using new device Buffer ( #2674 )
2023-12-07 19:21:47 -05:00
Christopher Mauri Milan
7f01dd04f0
Apply ruff linting rules to tests ( #2473 )
* everything except F821
* enable F821 with noqa
* dumb fix
* fix remaining imports and (former) lambdas
* replace _ with noqa to avoid gc
2023-11-27 21:24:06 -08:00
George Hotz
9e07824542
move device to device.py ( #2466 )
* move device to device.py
* pylint test --disable R,C,W,E --enable E0611
* fix tests
2023-11-27 11:34:37 -08:00
George Hotz
8e9cdef61f
clean up the buffers ( #2447 )
* clean up the buffers
* remove allocate_output
* functools.lru_cache is methodcache
* add TestShapeTrackerSize
* cache_clear
* no 0 sz buffer, add _ on functions that shouldn't be imported
* fix size
* if -> while
2023-11-26 11:02:29 -08:00
George Hotz
0505c5ea50
remove force_wait, refactor to graph ( #2405 )
* remove force_wait
* refactor
* get rid of stupid ASTRunner
* fix del in diskbuffer
* BufferOps.FROM_UNDERLYING
* put offset in the rawbuffer
* fix bugs
* use exec
2023-11-23 12:46:07 -08:00
chenyu
a98511561c
fuzz_linearizer same api for interpreted and compiled ( #2320 )
2023-11-15 17:40:22 -05:00