chenyu
e33efd6a3d
test cases for multitensor adds const ( #4892 )
...
Tested const remained const in ast. Removed the TODO in _to_const_val too
2024-06-08 22:57:48 -04:00
chenyu
a3ec4234df
expand broadcast functions a bit ( #4891 )
...
taking some good stuff from #4886. I think `from_, to` is more readable than `sh, s` too
[run_process_replay]
2024-06-08 20:16:54 -04:00
wozeparrot
2849d0a2a1
fix copying to clipboard on a non secure context ( #4890 )
2024-06-08 16:51:47 -07:00
nimlgen
6327b50e51
amd in benchmarks ( #4861 )
...
* amd in benchmarks
* remove all hsa
2024-06-08 23:24:46 +03:00
nimlgen
d24e57c615
amd support kernel with bf16 ( #4863 )
...
* amd support kernels with dispatch_ptr
* fixes
* line savings
* one line
* try
* Revert "try"
This reverts commit 5f340dfdd4.
* not used will be back when hsa is gone
* gone will be back
* add this as well
2024-06-08 22:52:32 +03:00
wozeparrot
6c24eda522
feat: tinychat ( #4869 )
2024-06-08 12:05:45 -07:00
Brennan Kinney
9445946cae
docs: Update referenced yaml in yolov8.py ( #4871 )
...
YAML files have since been relocated.
2024-06-08 15:05:00 -04:00
Roelof van Dijk
794fecf8e3
perf: faster element deletion during matching ( #4882 )
...
* perf: faster deletion
* fix: leave the tuple init
2024-06-08 15:16:35 +02:00
Roelof van Dijk
0eebb8e998
fix: _free should not return ( #4880 )
2024-06-08 14:45:06 +02:00
Roelof van Dijk
1785a70e77
fix: else-return on runtime ( #4881 )
...
* fix: add init file
* fix: no else-return
* fix: remove file again
2024-06-08 14:44:24 +02:00
qazal
1e3325f369
raise assert [run_process_replay] ( #4879 )
2024-06-08 08:31:44 -04:00
qazal
d19f39d4dd
unbind Variable pre LazyOp ( #4873 )
...
* early unbind
* assert ConstType is correct
2024-06-08 08:16:38 -04:00
George Hotz
9c30889ce9
[run_process_replay] faster and simpler match function ( #4876 )
2024-06-08 14:08:30 +02:00
Roelof van Dijk
aadab3e3da
fix: pylint will not lint folders without __init__.py ( #4875 )
...
* fix: add __init__.py
* fix: no-else-return
* fix: redefined-builtin
* fix: unused-variable
* fix: possibly-used-before-assignment
2024-06-08 14:00:24 +02:00
Szymon Ożóg
1680a4bcb8
Remove unused and internal variables ( #4862 )
2024-06-07 23:05:38 +02:00
Roelof van Dijk
15e5a4fb26
fix: variable defined in assert breaks -O ( #4866 )
2024-06-07 21:36:24 +03:00
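A minimal sketch of the failure mode named in the commit title above, not the code the PR touched: Python strips `assert` statements under `-O`, so any name bound only inside an assert is never assigned. The `load` function and its variable are hypothetical; `compile(..., optimize=1)` is used here to simulate `-O` in-process.

```python
# Hypothetical function: `n` is bound only inside the assert expression.
SRC = """
def load(values):
    assert (n := len(values)) > 0, "empty input"
    return n
"""

def run(optimize):
    """Compile SRC at the given optimization level and call load([1, 2, 3])."""
    ns = {}
    # optimize=0 keeps asserts; optimize=1 removes them, like `python -O`.
    exec(compile(SRC, "<demo>", "exec", optimize=optimize), ns)
    try:
        return ns["load"]([1, 2, 3])
    except NameError:
        # UnboundLocalError is a subclass of NameError.
        return "unbound"

normal = run(0)     # assert executes, so `n` is assigned
optimized = run(1)  # assert is stripped, so `n` is never assigned
```

The usual fix is to assign the value on its own line and assert on it afterwards, so the binding survives when asserts are removed.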
chenyu
3a20cff7c2
expand ShapeTracker.invert a bit ( #4864 )
...
removed a type cast and it can early return now
[run_process_replay]
2024-06-07 14:26:02 -04:00
nimlgen
688b14c933
do not sleep immediately in amd's wait_signal ( #4859 )
...
* that was slow python in hlb
* wait actively for 5s
* just this
* revert this back
* fix
2024-06-07 16:33:46 +03:00
qazal
66dfd5e7bf
faster codegen process replay ( #4858 )
...
* faster codegen process replay
* use self.copy
* regenerate
* delete copy
* test a real error [run_process_replay]
* revert the error change
2024-06-07 16:20:57 +03:00
chenyu
dd5378378b
cleanup kernel simplify_merge_adjacent ( #4852 )
...
cleanup kernel simplify_merge_adjacent
2024-06-06 12:04:54 -04:00
nimlgen
47bfd7c2b7
fix sync of offset buffers in graphs ( #4850 )
...
* correctly sync offset buffers
* test
* style
* run less
* just use base
2024-06-06 16:09:45 +03:00
qazal
eeb5a7af39
refactor linearize to render_block, P1 ( #4839 )
...
* refactor to render_block
* move rendering the reduce to its own thing
* add todo and cleanups [run_process_replay]
* inplace update of idxs [run_process_replay]
2024-06-06 15:31:43 +03:00
George Hotz
b932ce0f1d
[run_process_replay] style: clean up UPat
2024-06-06 08:54:24 +02:00
chenyu
b42f49b506
minor cleanup of view _merge_dims ( #4849 )
2024-06-05 23:20:26 -04:00
nimlgen
1649c21ead
nv fix round of allocation sizes ( #4828 )
...
* fix round of allocation sizes
* comment on prefetch
* use huge pages
2024-06-06 00:21:56 +03:00
nimlgen
09bfb8c10a
nv sync program copies to other execution ( #4845 )
2024-06-05 23:34:33 +03:00
chenyu
99e7a1d5e9
support symbolic reshape with non-contiguous ( #4844 )
...
* support symbolic reshape with non-contiguous
pre-requisite for symbolic arange (make symbolic ones that can be folded).
* test cases
* typo
* shorter
2024-06-05 16:01:19 -04:00
chenyu
a352b6d9ce
symbolic Tensor.var ( #4843 )
...
taken from #4446 and add more tests
2024-06-05 12:55:54 -04:00
Nik
085c0bbf6b
add mlperf train subset of openimages ( #4841 )
2024-06-05 10:10:11 -04:00
Timmy
887643cf34
Multireduce atomic local load/store test ( #4786 )
...
* atomic load/store test
* tests for nested & unrolled
* check barriers
* linters
* cleaning up diff
* fix assert in _temp_create_multireduce_ast changes
* cleaning up the check for redundant barriers
* minor cleanups for the assert
* always seed randn, helps with debuggability
---------
Co-authored-by: qazal <qazal.software@gmail.com>
2024-06-05 14:41:19 +03:00
George Hotz
3954f102aa
style: make __init__ first in Tensor class
2024-06-05 12:51:41 +02:00
Szymon Ożóg
273945df67
Regression tests for bitshift ( #4829 )
...
* Regression tests for bitshift
* Add test for bitshift not triggered
* Enable tests
2024-06-05 11:42:34 +02:00
Alec Chen
5ac30c29d8
Construct UOps patterns using UPat ( #4821 )
...
* Allow UPat pattern definitions
* Convert pattern matcher tests to UPat constructions
* Convert constant_folder patterns to upat constructions
* Convert assembly patterns to upat constructions
* [run_process_replay] Drop UPat.from_dict
2024-06-05 10:29:37 +02:00
Szymon Ożóg
e47277d18a
Disable for PTX as well ( #4838 )
...
Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
2024-06-05 10:37:59 +03:00
Francis Lam
890e7c12bb
test/external/verify_kernel: add support for single pickled kernel ( #4836 )
2024-06-04 18:59:21 -04:00
Elias Wahl
e576aca044
Disable dropout ( #4837 )
2024-06-04 18:57:26 -04:00
Elias Wahl
bb248a0dd1
Optional half matmul ( #4835 )
...
* half linear
* move weight cast back
* oops
* matmul dtype var
* todo comment
2024-06-04 17:53:41 -04:00
Elias Wahl
04e237328b
Refactor to class style ( #4804 )
2024-06-04 14:08:31 -07:00
nimlgen
1b8bed4a26
nv check cmdq overrun ( #4824 )
...
* nv check cmdq overrun
* fix assert
2024-06-04 23:22:58 +03:00
David Hou
cddce0e168
don't cast before view on shape changing bitcast ( #4833 )
...
* don't cast before view on shape changing bitcast
* make sure cast before view triggers
2024-06-04 16:04:52 -04:00
Alec Chen
0c3a996e64
Nest ifs for dtype and uop in pattern matcher ( #4834 )
2024-06-04 15:51:28 -04:00
Alec Chen
4909a0d16f
Fix arg set in pattern matcher ( #4830 )
2024-06-04 15:10:09 -04:00
Alec Chen
c96026ac65
Add arg set regression test for pattern matcher ( #4827 )
...
* Add arg set regression test for pattern matcher
* real regression
---------
Co-authored-by: qazalin <qazal.software@gmail.com>
2024-06-04 13:35:09 -04:00
chenyu
a70e8a80d7
test_ops test cmp with special floats ( #4826 )
...
prepare to fix nan, it did not work with ge and le before either
2024-06-04 12:10:21 -04:00
Szymon Ożóg
b6895dabaa
Remove ssa label ( #4823 )
...
* remove ssa label
* linting
2024-06-04 16:51:05 +02:00
George Hotz
052c928d06
hotfix: touchups from presentation
2024-06-04 16:31:03 +02:00
chenyu
1e02b4cae1
default skip all exceptions in beam ( #4822 )
...
added a flag `BEAM_STRICT_MODE` to catch compile errors or other exceptions on demand
2024-06-03 18:21:36 -04:00
chenyu
3afc914617
CMPEQ -> CMPNE and make it safe to pad ( #4818 )
...
* CMPNE
* new dataset
2024-06-03 18:02:15 -04:00
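One plausible reading of "safe to pad" in the commit above, stated as an assumption rather than taken from the PR: an elementwise op is pad-safe when applying it to zero padding yields zero, so padding before or after the op gives the same result. The helpers below are hypothetical, pure-Python stand-ins, not tinygrad functions.

```python
def pad(xs, n, fill=0):
    """Append n fill values, standing in for zero padding."""
    return list(xs) + [fill] * n

def cmpeq(a, b):
    """Elementwise equality as 0/1 integers."""
    return [int(x == y) for x, y in zip(a, b)]

def cmpne(a, b):
    """Elementwise inequality as 0/1 integers."""
    return [int(x != y) for x, y in zip(a, b)]

a, b = [1, 2, 3], [1, 5, 3]

# CMPNE: 0 != 0 is 0, so the padded region stays zero either way.
ne_pad_first = cmpne(pad(a, 2), pad(b, 2))
ne_pad_last = pad(cmpne(a, b), 2)

# CMPEQ: 0 == 0 is 1, so padding first leaks ones into the padded region.
eq_pad_first = cmpeq(pad(a, 2), pad(b, 2))
eq_pad_last = pad(cmpeq(a, b), 2)
```

Under this reading, equality can still be expressed as the negation of CMPNE, with the negation applied where padding is not involved.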
qazal
79c7d402ee
improve augmented assign error message ( #4813 )
2024-06-03 16:57:22 -04:00
Szymon Ożóg
bb7b031c5c
Bitshift ( #4728 )
...
* WIP
* Cleanup
* Cleanup
* Fix variable, refactor to use set
* right shift should be signed/unsigned
* Test for bitshifts
* Allow a neg
2024-06-03 21:16:01 +02:00
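To illustrate the "right shift should be signed/unsigned" bullet above, here is a sketch of why signedness matters for a right shift. It is not code from the PR: the helper names and the 32-bit width are assumptions chosen for the example.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit width for the unsigned view

def arithmetic_shr(x, k):
    """Signed right shift: Python's >> on ints preserves the sign."""
    return x >> k

def logical_shr32(x, k):
    """Unsigned right shift: reinterpret x as a 32-bit unsigned value first."""
    return (x & MASK32) >> k

def trunc_div(x, d):
    """C-style integer division, rounding toward zero (d > 0)."""
    q = abs(x) // d
    return -q if x < 0 else q

signed_result = arithmetic_shr(-8, 1)    # sign bit is replicated
unsigned_result = logical_shr32(-8, 1)   # zero is shifted in at the top
shifted = arithmetic_shr(-7, 1)          # rounds toward negative infinity
divided = trunc_div(-7, 2)               # rounds toward zero
```

The last two values differ, which is why replacing a signed division by a power of two with a plain shift is only valid for non-negative operands.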