Commit Graph

3082 Commits

Author SHA1 Message Date
geohotstan
0398288b79 Getitem round3 .... (#2760)
* refactor round 3

* comment

* oops

* oops

* oops2

* factored out multiple condition

* add a comment for type

* wooaah roundup is cool, thanks chenyu lol

* add another walrus for symmetry and some spaces

* lol wtf useless listcompre
2023-12-14 12:22:37 -05:00
chenyu
0ae22b0f81 restore Tensor.default_type in test_hip_rdna3 (#2763)
might cause flaky tests
2023-12-14 11:35:38 -05:00
qazal
746cb5de21 Test coverage for matvec (#2762)
* add test coverage for matvec

* skip devices that don't support locals
2023-12-14 11:34:56 -05:00
chenyu
64fea9ff4a Revert "minor onnx_op cleanups to prep dtype changes (#2758)" (#2759)
This reverts commit 38da001b64.
2023-12-14 03:12:14 -05:00
chenyu
38da001b64 minor onnx_op cleanups to prep dtype changes (#2758)
read through it and clean some minor stuff
2023-12-14 03:05:59 -05:00
jaredeh
d8952fc575 updating to work with new internal apis (#2755) 2023-12-13 21:54:47 -08:00
chenyu
2c6814ba28 insert_before is None means insert at the end (#2757) 2023-12-13 21:05:10 -05:00
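A minimal sketch of that convention, with hypothetical names (not tinygrad's actual helper):

```python
from typing import Optional

def insert(items: list, x, insert_before: Optional[int] = None) -> None:
  # insert_before=None means "append at the end" rather than being an error
  items.insert(len(items) if insert_before is None else insert_before, x)

ops = ["load", "alu"]
insert(ops, "store")                   # ["load", "alu", "store"]
insert(ops, "const", insert_before=0)  # ["const", "load", "alu", "store"]
```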
chenyu
aad005e220 set default str for CStyleLanguage.arg_int_prefix (#2756)
it's the same `const int` for clang, opencl, cuda and hip
metal overrides with `constant int&` and webgl has its own thing
2023-12-13 20:23:27 -05:00
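A reduced sketch of the renderer-language default described above; the field name comes from the commit, everything else is illustrative:

```python
from dataclasses import dataclass

@dataclass
class CStyleLanguage:  # heavily reduced; tinygrad's real class has many more fields
  arg_int_prefix: str = "const int"  # shared default for clang, OpenCL, CUDA, HIP

# Metal overrides the default; WebGL renders integer args its own way entirely
metal_style = CStyleLanguage(arg_int_prefix="constant int&")
```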
chenyu
107dd8f3d7 fix a typo in test_dtype_alu (#2754) 2023-12-13 19:23:21 -05:00
chenyu
fc6bca7ba8 update type annotation of _broadcasted (#2753)
input can be Tensor, float, or int.
also updated scaled_dot_product_attention, which might add a None to a Tensor
2023-12-13 19:03:14 -05:00
Maksym Sobolyev
bf4165ccac Fix double exception in __del__() when __init__() raises exception. (#2738) 2023-12-13 15:46:11 -08:00
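The classic shape of this fix, with illustrative names: if __init__ raises partway through, Python still calls __del__ on the half-built object, so __del__ must guard its attribute access or it raises a second exception during teardown.

```python
class RawBufferLike:  # illustrative, not tinygrad's actual class
  def __init__(self, size: int):
    if size <= 0: raise ValueError("size must be positive")  # __del__ still runs
    self._buf = bytearray(size)

  def __del__(self):
    # getattr guard: if __init__ raised above, self._buf was never assigned,
    # and a bare self._buf here would raise AttributeError inside __del__
    if getattr(self, "_buf", None) is not None:
      self._buf = None  # stand-in for the real free/release call
```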
chenyu
81a747fc63 more test cases in test_slice_fancy_indexing_with_idx (#2751) 2023-12-13 17:52:26 -05:00
chenyu
22feb7330e simplify fancy index with negative Tensor entries (#2749) 2023-12-13 14:45:50 -05:00
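A plausible reading of that simplification, assuming tinygrad's Tensor.where: fold negative index entries into the positive range once, so the gather logic only ever sees indices in [0, dim).

```python
from tinygrad.tensor import Tensor

def normalize_idx(idx: Tensor, dim: int) -> Tensor:
  # maps e.g. -1 -> dim - 1; downstream indexing then handles one case, not two
  return (idx < 0).where(idx + dim, idx)

print(normalize_idx(Tensor([0, -1, 2, -3]), 4).numpy())  # [0 3 2 1]
```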
chenyu
b229879613 refactor _broadcasted (#2747)
also moved the expand noop check to .expand.
2023-12-13 13:36:25 -05:00
George Hotz
7e5b3e53fe changes to prep for new lazy (#2748)
* changes to prep for new lazy

* put those back
2023-12-13 10:28:22 -08:00
Umut Zengin
8ad7cfeeb1 More simplification in to_image_idx and symbolic (#2679)
* less valid

* add test

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2023-12-13 12:30:44 -05:00
Ahmed Harmouche
e7248b677c Remove wgsl custom render_for (#2729)
* Generic for

* remove custom render_if

* Simplify for loop

* 150 line-length constraint

* Put custom render_if back
2023-12-13 09:04:17 -08:00
tomtom-95
6b0f07e94a add decorator to preserve info about original function (#2743) 2023-12-13 09:03:50 -08:00
chenyu
aa4a0de287 simpler Tensor.pow to integer (#2746) 2023-12-13 11:39:20 -05:00
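One standard way to take integer powers, as a generic sketch (not necessarily the commit's exact approach): repeated squaring, which stays exact for negative bases where an exp(n*log(x)) path would not.

```python
def ipow(x, n: int):
  # computes x**n for n >= 0 in O(log n) multiplies; works for numbers
  # and for anything else that overloads * (e.g. a Tensor)
  assert n >= 0
  result, base = 1, x
  while n:
    if n & 1: result = result * base
    base, n = base * base, n >> 1
  return result

assert ipow(-2, 3) == -8 and ipow(3, 0) == 1
```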
chenyu
26f49869f4 minor tensor type annotation and cleanup (#2742) 2023-12-13 01:53:59 -05:00
chenyu
2ef33abd20 some unary functions cast int input into float (#2740)
* some unary functions cast int input into float

* precision

* image dtype
2023-12-13 00:10:29 -05:00
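A usage sketch of that behavior (import locations have moved between tinygrad versions, so treat the paths as approximate):

```python
from tinygrad import Tensor, dtypes  # older versions: from tinygrad.helpers import dtypes

x = Tensor([1, 2, 3], dtype=dtypes.int32)
print(x.exp().dtype)   # expected: a float dtype (e.g. dtypes.float32), not int32
print(x.sqrt().dtype)  # likewise promoted to float before the transcendental op
```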
George Hotz
3e778fcc52 hotfix: *** 2023-12-12 19:44:31 -08:00
Shawn Hagler
51afe938f1 update onnx model links (#2737) 2023-12-12 19:11:11 -08:00
George Hotz
431fae5ed3 hotfix: update_stats cleanup, yellow is nicer than red 2023-12-12 17:50:22 -08:00
chenyu
0869e7a301 update onnx benchmark urls (#2735)
onnx is remapping the models, old ones are in archive/
2023-12-12 20:46:01 -05:00
George Hotz
6d6eb9302d ruff checks the max line length is 150 (#2734)
* ruff checks the max line length is 150

* fix tensor.py

* a lot more

* done
2023-12-12 17:34:47 -08:00
George Hotz
3635540ddb shorter line (#2733) 2023-12-12 15:34:17 -08:00
nimlgen
ede7971ada save some lines (#2731)
* remove unused mem_cached var

* one more
2023-12-12 15:26:27 -08:00
chenyu
00b611c156 simplify type promotion - remove weak types (#2730) 2023-12-12 16:12:57 -05:00
Nguyen Nguyen Phuong
07cf45e133 fix cuda matmul (#2725) 2023-12-12 07:59:31 -08:00
chenyu
ef6e942a23 dtype promotion helpers (#2724)
* dtype promotion helpers

* better tests

* space
2023-12-11 23:14:23 -05:00
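The idea behind helpers like these, in a deliberately simplified sketch: promotion picks a least upper bound over a dtype ordering. The real lattice also has to resolve signed/unsigned pairs, which a single chain like this cannot:

```python
# toy rank chain, not tinygrad's actual promotion table
RANK = {"bool": 0, "int32": 1, "int64": 2, "float16": 3, "float32": 4, "float64": 5}

def promote(a: str, b: str) -> str:
  return a if RANK[a] >= RANK[b] else b

assert promote("int32", "float16") == "float16"
assert promote("bool", "int64") == "int64"
```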
Christopher Mauri Milan
0232db294d fix tolist issue (#2723) 2023-12-11 19:14:00 -08:00
chenyu
4075208127 some dtype creation spec test cases (#2722) 2023-12-11 19:33:49 -05:00
Guy Leroy
ee9e1d3662 Extend available types for safe_save (#2720)
* Extend available types to save with

* Linter fix
2023-12-11 14:50:35 -08:00
George Hotz
b5fd160b39 hotfix: increase rtol on simple_matmul 2023-12-11 10:10:29 -08:00
Gregor Kikelj
4feaaa27aa ensure shrink is valid (#2717) 2023-12-11 09:58:43 -08:00
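An illustrative version of that validation: shrink takes one (start, end) pair per dimension, and each pair must lie inside its dimension.

```python
def check_shrink_arg(shape: tuple, arg: tuple) -> None:
  # one (start, end) pair per dim, each satisfying 0 <= start <= end <= dim
  assert len(arg) == len(shape), "one (start, end) pair per dimension"
  assert all(0 <= b <= e <= s for (b, e), s in zip(arg, shape)), \
    f"invalid shrink {arg} for shape {shape}"

check_shrink_arg((4, 4), ((0, 2), (1, 4)))    # ok
# check_shrink_arg((4, 4), ((0, 5), (1, 4)))  # would raise: end > dim size
```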
qazal
a43bc78804 fix dtypes helpers for integers (#2716)
* scalar

* maybe do this instead

* Revert "scalar"

everything is a scalar

* add tests in test_dtype

* fuzz testing + fix unsigned ints

* fuzz everything
2023-12-11 09:28:19 -08:00
nimlgen
bc3c4ce50b cuda set context before sync (#2715)
* cuda set context before sync

* no helper
2023-12-11 09:26:53 -08:00
Ivan Vnučec
8d206f6bfd fix help message (#2705)
llama -> mixtral
2023-12-10 22:04:35 -08:00
George Hotz
59ab3675a3 faster mixtral + green for new kernels (#2701)
* green for new kernels

* track ram
2023-12-10 19:04:58 -08:00
chenyu
2ee6f689c5 simpler einsum (#2700) 2023-12-10 21:24:44 -05:00
George Hotz
b01e3907a1 mixtral touch up: two lines 2023-12-10 17:21:49 -08:00
George Hotz
b3982187d1 Mixtral Example (#2691)
* mixtral

* simpler

* global counters

* simpler

* weights arg
2023-12-10 17:18:31 -08:00
George Hotz
0fd44259cd bf16 fix + cleanups from mixtral (#2698)
* bf16 fix + cleanups from mixtral

* generic bf16 cast
2023-12-10 16:31:52 -08:00
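Background for the generic cast: bfloat16 stores the top 16 bits of an IEEE-754 float32, so a backend without native bf16 support can widen by shifting the stored bits into the high half of a 32-bit word. A self-contained sketch:

```python
import struct

def bf16_bits_to_f32(bits: int) -> float:
  # place the 16 stored bits in the high half of a float32 bit pattern
  return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]

assert bf16_bits_to_f32(0x3F80) == 1.0  # 0x3F80 << 16 == 0x3F800000 == 1.0f
```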
Davi Silva
7fbebb3df6 Implement einsum (#2686)
* hopeful impl for Tensor.einsum

* satisfy mypy by having less typing. :(

* a few simple tests

* even more tests

* permute tests

* xfails for improper usage

* fix LLVM test fail

* use argfix

* more helpful error message on shape mismatch
2023-12-10 15:56:01 -08:00
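A usage sketch, assuming Tensor.einsum follows the familiar np.einsum subscript convention (the PR's tests exercise cases like these):

```python
from tinygrad.tensor import Tensor

a, b = Tensor.rand(2, 3), Tensor.rand(3, 4)
mm = Tensor.einsum("ij,jk->ik", a, b)  # matrix multiply -> shape (2, 4)
tr = Tensor.einsum("ij->ji", a)        # transpose via permutation -> shape (3, 2)
```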
chenyu
181b0970b5 slightly better extra/to_movement_ops dedups (#2695) 2023-12-10 11:05:44 -05:00
chenyu
ef18d79faa remove noop from to_movement_ops (#2693) 2023-12-10 00:50:24 -05:00
chenyu
2d0e38e201 fix jit input_rawbuffers check wrt consts (#2689)
* fix jit input_rawbuffers check wrt consts

* .numpy()
2023-12-09 15:59:03 -05:00
geohotstan
67ff2b2b18 Formatted test_indexing (#2688)
* added tensor.clone() for more correct cloning behavior

* some work and randint issue

* formatted

* final cleanups

* oops, bug fix
2023-12-09 11:38:36 -05:00
chenyu
1e7823e1f5 combine GROUP and GROUPTOP to a single block (#2687) 2023-12-09 01:19:32 -05:00