Commit Graph

10417 Commits

George Hotz
64dded27f0 pad ops broke coder (#2881)
* pad ops broke coder

* that contiguous fixes it

* Update lazy.py
2023-12-20 17:03:41 -08:00
George Hotz
e1861ab65e remove realize from optimizer (#2880)
* remove realize from optimizer

* one still needed

* opt realize
2023-12-20 16:42:41 -08:00
George Hotz
1765849937 new lazy, benchmark (#2878)
* lazy rewrite, try 2

* min fix tests

* pass contig test

* put broken pads back

* move that to realize

* no contig child fixes array packing

* so wrong

* now that's correct

* base children

* fix bind issues

* disable to_image_idx

* fix tests

* that failure shouldn't break other tests

* more fixes

* fix torch

* skip failing tests in CI

* 1e-7

* half is broken

* 1e-6 margin of error
2023-12-20 14:33:21 -08:00
Peter Cawley
dae8976889 Fix reshape merging with masks (#2877) 2023-12-20 14:00:58 -08:00
George Hotz
8fe24038d8 Revert "mulacc fusion cleanup (#2871)" (#2876)
This reverts commit 863c5b26ed.
2023-12-20 13:26:25 -08:00
qazal
863c5b26ed mulacc fusion cleanup (#2871)
* add mulacc fusion tests

* cleanup the implementation

* fix indent in the test utility

* less verbose
2023-12-20 15:39:54 -05:00
chenyu
e13b4964d7 remove the all_int(shape) check in Tensor._loadop (#2874)
* remove the all_int(shape) check in Tensor._loadop

we can support jittable symbolic shape random with custom rand now, and we can formalize it in the test after threefry is ready

* MOCKHIP false positive
2023-12-20 15:04:50 -05:00
qazal
5f07ef455e update dtypes (#2872) 2023-12-20 15:04:02 -05:00
chenyu
857c35d256 make gpt2 decode output just once at the end (#2869)
also renamed greedy_until to generate, as it is neither greedy nor "until"
2023-12-20 12:14:55 -05:00
chenyu
e92069fb1c remove unused symbolic.is_sym_int (#2868) 2023-12-20 11:37:54 -05:00
George Hotz
ca59054463 fix shapetracker math (#2861)
* proper test

* all st math good now

* fix real_strides bug
2023-12-19 22:17:34 -08:00
chenyu
5a739e8c20 update one skipped pad_reshape test that was fine (#2860)
* update one skipped pad_reshape test that was fine

had a typo

* this one passed
2023-12-19 23:25:52 -05:00
chenyu
39af93ed7c minor tensor.py function cleanup (#2859)
* minor tensor.py function cleanup

* where outputs not aligned yet
2023-12-19 22:39:39 -05:00
George Hotz
94f71fe238 random and empty shouldn't reshape 2023-12-19 18:09:03 -08:00
George Hotz
637879af78 add direct install to readme 2023-12-19 18:04:00 -08:00
chenyu
ad233d557f disable reshape merging with masks (#2858)
the fuzzer found a bug, and the merging logic isn't complete
2023-12-19 19:06:16 -05:00
chenyu
1231ec5a02 run the sz.py line count at the end of linter ci (#2857) 2023-12-19 16:33:12 -05:00
George Hotz
ac6ec936cd update contributing 2023-12-19 12:19:14 -08:00
George Hotz
e477cc2f45 hotfix: README is ~25 ops to stop getting PRs about it 2023-12-19 11:53:35 -08:00
Oleg Rybalko
42a038c83f More readable torch_load ext check (#2853)
* more readable extension check

* enable tarfile test

* detach tensor if requires grad in torch
2023-12-19 14:53:15 -05:00
chenyu
172a88e719 skip slow test_indexing on METAL (#2852)
LLVM still runs it and is a lot faster; would be curious to know why.
also reworded some error messages and removed the regex check
2023-12-19 12:00:54 -05:00
chenyu
6d7e9e0a56 hotfix convert Y_train to int before passing into index (#2850) 2023-12-19 11:40:56 -05:00
geohotstan
fec8e9060c Add simple fancy indexing exceptions (#2706)
* fancy indexing raise error

* updated error message

* improved error check

* oops

* fixed onnx

* oops typo

* merge

* add full_flatten

* try

* merged and updated some tests

* more cleaning

* done

* temp fix onnx

* try

* add todo in onnx_test

* reword

* gah
2023-12-19 11:23:51 -05:00
qazal
417d42a363 UOps support for ImageDType (#2848)
* cleanup buffer dtypes in global_load

* update with feedback
2023-12-19 09:39:48 -05:00
George Hotz
90fb09b55c remove unused _device_extra_args 2023-12-18 22:14:58 -08:00
George Hotz
b2192b5400 minor improvements (#2845) 2023-12-18 22:09:08 -08:00
George Hotz
d086325b1b hotfix: failing tests 2023-12-18 21:12:42 -08:00
George Hotz
07df14aa0e HIP cleanups (#2843)
* move everything to code_for_op to reason about it

* loop the loopable parts

* it's not that unreadable

* these are loopable too

* nitpick

* tests p1 - replace these with the actual compiler running alu ops tests

* tests p2: compile test_dtype_alu in HIP!

+add to CI

* nobody liked test_renderer

* revert test_dtypes change

* isolated mockhip tests

* dont need the WHERE hack after #2782

+ruff

* bf16 is broken in HIP

job failed in: https://github.com/tinygrad/tinygrad/actions/runs/7232101987/job/19705951290?pr=2778#step:8:73

* picking this back up

* add compile tests for unary ops and binary ops

* MOD is only in ints

* CMPLT won't work after the dtypes PR is merged because it will always be bool

* test all combinations

* Update cstyle.py

* don't use vload

* no getenv

* set seed

---------

Co-authored-by: qazal <qazal.software@gmail.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2023-12-18 21:09:32 -08:00
George Hotz
b6d71b131e hotfix: push broken tests 2023-12-18 21:08:42 -08:00
George Hotz
9b35186bbe hotfix: don't import from tinygrad in sz.py 2023-12-18 20:49:46 -08:00
George Hotz
6617dcf095 move graph to runtime, check line count with sz.py (#2842)
* move graph to runtime, check line count with sz.py

* oops, didn't save

* dtype aliases

* restore comment, REALCOUNT
2023-12-18 20:30:06 -08:00
chenyu
15dc5bcfbd remove cstyle hip functions that deal with different types of input (#2840) 2023-12-18 22:43:22 -05:00
chenyu
dad9253d52 minor clean up in kernels (#2832)
cleaned up some long lines and combined some short lines
2023-12-18 19:35:59 -05:00
George Hotz
954a2fef75 hotfix: add JITGRAPH and invert sints 2023-12-18 16:33:22 -08:00
George Hotz
80f53245e8 shapetracker add and invert (#2828)
* invert (broken)

* decent invert

* shapetracker invert works

* plus is meh, invert is good

* support invert mask

* a few more invert tests

* shapetracker math invert test
2023-12-18 16:03:27 -08:00
chenyu
73cadfbb3c Remove pytest markers (#2831)
* remove pytest marker

* fix some, skip some

* tweak

* fix

* skip slow

* skip more
2023-12-18 18:53:28 -05:00
chenyu
264fe9c93f clean up test_dtype.py (#2827)
make is_dtype_supported a pure function and clean up long lines
2023-12-18 16:06:09 -05:00
chenyu
20ea43b6e7 dtypes.from_py to convert py types to dtypes (#2826)
also updated some tests to test against default dtypes
2023-12-18 14:23:31 -05:00
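The py-type-to-dtype mapping described in this commit can be sketched standalone (a hypothetical mirror of the idea, not tinygrad's actual `dtypes.from_py` implementation, whose signature and return values may differ):

```python
# Hypothetical sketch of mapping Python scalar types to dtype names,
# as described in the commit; not tinygrad's real dtypes.from_py.
def from_py_sketch(x):
    # bool must be checked before int: Python's bool is a subclass of int,
    # so isinstance(True, int) is also True
    if isinstance(x, bool): return "dtypes.bool"
    if isinstance(x, int): return "dtypes.default_int"
    if isinstance(x, float): return "dtypes.default_float"
    raise RuntimeError(f"no dtype for python type {type(x).__name__}")

assert from_py_sketch(True) == "dtypes.bool"
assert from_py_sketch(3) == "dtypes.default_int"
assert from_py_sketch(3.0) == "dtypes.default_float"
```

The bool-before-int ordering is the one subtlety worth noting: checking `int` first would silently map `True` to the default int dtype.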
chenyu
0723f26c80 dtypes.default_float and dtypes.default_int (#2824) 2023-12-18 12:21:44 -05:00
chenyu
8aab19ce3d Tensor.full of bool has dtypes.bool (#2823) 2023-12-18 10:51:17 -05:00
chenyu
220abcd8ff fix squeeze of 0-dim Tensor with negative dim (#2821)
if ndim=0, the only accepted dims are 0, -1, and None; any other negative dim results in IndexError
2023-12-17 22:02:07 -05:00
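The accepted-dim rule for 0-dim tensors can be sketched as a standalone check (`check_squeeze_dim` is a hypothetical helper illustrating the rule in the commit message, not tinygrad's actual code):

```python
# Hypothetical validation mirroring the squeeze dim rule from the commit:
# for a 0-dim tensor only 0, -1 (and None) are accepted.
def check_squeeze_dim(ndim: int, dim):
    if dim is None:
        return None
    if ndim == 0:
        if dim not in (0, -1):
            raise IndexError(f"dim {dim} out of range for 0-dim tensor")
        return 0
    if not -ndim <= dim < ndim:
        raise IndexError(f"dim {dim} out of range for {ndim}-dim tensor")
    return dim % ndim  # normalize negative dims to non-negative

assert check_squeeze_dim(0, -1) == 0
assert check_squeeze_dim(3, -1) == 2
```

The bug class here is that a generic `dim % ndim` normalization divides by zero (or a `-ndim <= dim` bound check degenerates) when `ndim == 0`, so the 0-dim case needs its own branch.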
chenyu
21ec7e09f6 minor cleanup of image (#2820)
use transpose for transpose instead of permute, and use pad for pad instead of slice
2023-12-17 20:53:49 -05:00
chenyu
959d9cfed4 clean up ops_torch and ops_cpu (#2819) 2023-12-17 19:35:19 -05:00
Rory Clear
f409b57854 update metal matmul and matvec for new device style (#2732)
* update for new device style

* create device before compile

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2023-12-17 16:15:07 -05:00
chenyu
91adb119b8 remove match_type in ops_torch and ops_cpu (#2817)
* remove match_type in ops_torch and ops_cpu

input dtypes are aligned and cast in mlops

* dict union only after python3.9

* fix that

* fix Sigmoid forward cast
2023-12-17 15:32:30 -05:00
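The "dict union only after python3.9" bullet refers to the `|` merge operator added by PEP 584; the version-portable spelling is dict unpacking, which this commit falls back to:

```python
a = {"w": 1, "b": 2}
b = {"b": 3}

# a | b was added in Python 3.9 (PEP 584); on older interpreters the same
# merge is spelled with unpacking, available since Python 3.5 (PEP 448):
merged = {**a, **b}
assert merged == {"w": 1, "b": 3}  # right-hand operand wins on key collisions
```

Both forms produce a new dict and resolve duplicate keys in favor of the right-hand side, so the unpacking form is a drop-in replacement.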
Maksym Sobolyev
887f3d9933 Make torch backend more usable, fix bfloat support in the llvm backend (#2765)
* Uncripple dtype tests, TestBFloat16DType never actually runs.

* Fix conversion from/to bfloat16.

Call cast() recursively, so that it works for any type combo.

* Run this test on torch backend as well.

* Add torch.bfloat16.

* Add support for ushort and uint.

* Convert np.uint32 to np.int32 when loading.

* Fix warning.
2023-12-17 14:04:26 -05:00
chenyu
9c32474a1f Revert "Revert "Tensor.randint is Tensor.uniform with dtypes.int32 (#2801)" (#2802)" (#2814)
This reverts commit fa84998244.
2023-12-17 12:14:17 -05:00
chenyu
b4fa189c8c Revert "Revert "Make Tensor creation allow multi-dim list of int and bool (#2793)" (#2810)" (#2813)
This reverts commit 71a60762ed.
2023-12-17 11:48:27 -05:00
Marcus Asteborg
1fa4f161fe Update CLProgram to use unsigned long long for event profiling (#2808)
On Windows, the unsigned long type is 32-bit, which is not compatible
with the required data size for event profiling.
2023-12-16 23:48:44 -08:00
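The width mismatch behind this fix can be observed from Python via `ctypes` (illustrative only; the actual change is in the types used to read OpenCL profiling counters, which are 64-bit):

```python
import ctypes

# On LLP64 Windows, C's unsigned long is 32 bits, while on LP64 Linux/macOS
# it is 64 bits; unsigned long long is at least 64 bits everywhere, which is
# what 64-bit event-profiling timestamps require.
ulong_size = ctypes.sizeof(ctypes.c_ulong)          # 4 on Windows, 8 on LP64
ulonglong_size = ctypes.sizeof(ctypes.c_ulonglong)  # 8 on mainstream platforms

assert ulonglong_size >= 8
assert ulong_size in (4, 8)
```

This is why `unsigned long` silently truncates profiling values on Windows even though the same code works on Linux and macOS.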
chenyu
c333bfcf69 replace a defaultdict counting with Counter in cstyle (#2809)
also removed a questionable Any annotation
2023-12-17 02:44:16 -05:00