Commit Graph

233 Commits

Author SHA1 Message Date
Francis Lam
162dfb07d9 fuzz_linearizer: fix uops and add to test.yml (#3588) 2024-03-02 15:03:42 -08:00
George Hotz
83530a585f add quick external data select test 2024-03-02 05:38:32 -08:00
chenyu
d89e3c4e08 enable METAL tests now runner is M1 and no fast-math (#3523) 2024-02-28 14:14:23 -05:00
Francis Lam
11da65bccd test/external/fuzz_linearizer: add a FUZZ_MAX_SIZE option (#3455)
* test/external/fuzz_linearizer: add a FUZZ_MAX_SIZE option

this allows us to limit the size of the kernel and reduce running
times by avoiding ones that take a long time

* fix spacing and re-order to put parameters together
2024-02-27 07:34:59 -05:00
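
A minimal sketch of how a cap like the FUZZ_MAX_SIZE described above can gate fuzzed kernels, assuming the option is read through tinygrad's getenv helper; the exact size metric fuzz_linearizer uses is an assumption here:

    # read the cap once; 0 (the default) means no limit
    from tinygrad.helpers import getenv

    FUZZ_MAX_SIZE = getenv("FUZZ_MAX_SIZE", 0)

    def should_fuzz(kernel_size: int) -> bool:
        # skip kernels that would take a long time to run
        return FUZZ_MAX_SIZE == 0 or kernel_size <= FUZZ_MAX_SIZE
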
chenyu
30f26279c5 add back "CPU" in test_onnx_backend supports_device (#3426)
the onnx tests were all skipped.
2024-02-16 00:49:30 -05:00
xarkes
28a8b72024 Remove Interpreted device & remaining CPU/TORCH ref (#3423)
* Remove Interpreted device & remaining CPU/TORCH ref

* Oops

* supports_device was useful

* Fix doc wording

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-02-16 00:30:21 -05:00
George Hotz
b1c0d8c99d remove cpu and torch backends (#3399)
* remove cpu and torch backends

* don't copy to cpu

* use clang instead of cpu

* multitensor gathers on the first device

* clang is cpu + use default

* fixup

* bugfix
2024-02-15 16:55:39 +01:00
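
With the interpreted CPU and TORCH backends gone, CLANG (compiled C on the host) fills the plain-CPU role, per the bullets above. A minimal usage sketch under that assumption:

    from tinygrad import Tensor

    # "CLANG" compiles C for the host CPU, replacing the old CPU backend
    t = Tensor([1.0, 2.0, 3.0], device="CLANG")
    print((t + 1).numpy())  # [2. 3. 4.]
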
George Hotz
a40df14fef ops_ext to replace cpu import (#3409)
* ops_ext to replace cpu import

* don't allow zero copy with as buffer

* memoryview(bytearray

* reenable test

* fix jit issue
2024-02-15 13:03:42 +01:00
George Hotz
6356474d6d Revert "ops_ext to replace cpu import (#3406)" (#3408)
This reverts commit 91eb93f85a.
2024-02-15 12:16:10 +01:00
George Hotz
91eb93f85a ops_ext to replace cpu import (#3406)
* ops_ext to replace cpu import

* don't allow zero copy with as buffer

* memoryview(bytearray

* reenable test
2024-02-15 12:14:58 +01:00
chenyu
078a2603d5 set metal fast math default to 0 (disabled) (#3370)
* set metal fast math default to 0 (disabled)

It's a correctness fix because we use inf and nan. Let's see how slow it is

* skip failed onnx tests

* tmp DISABLE_COMPILER_CACHE=1 in metal benchmark

* Revert "tmp DISABLE_COMPILER_CACHE=1 in metal benchmark"

This reverts commit 22267df380.
2024-02-14 11:42:33 +01:00
George Hotz
2e60012bcf move create schedule and delete old API (#3377)
* move create schedule and delete old API

* fix test multitensor
2024-02-12 18:10:45 +01:00
George Hotz
41efaa848c move graph.py and jit.py into features (#3376)
* move graph.py into features

* move jit into features

* fix quickstart
2024-02-12 17:34:34 +01:00
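
After this move the JIT lives under tinygrad.features. A minimal sketch of the updated import path as of this commit:

    from tinygrad import Tensor
    from tinygrad.features.jit import TinyJit  # previously tinygrad.jit

    @TinyJit
    def step(x: Tensor) -> Tensor:
        return (x * 2).realize()  # the JIT requires realized outputs
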
chenyu
f798b60338 add METAL_FAST_MATH env var to disable metal fast math (#3369)
* env var METAL_FAST_MATH to disable fastmath for metal

use this to test the impact of fast math. might need to disable the compiler cache with DISABLE_COMPILER_CACHE

* failed onnx test with fast math

METAL_FAST_MATH=0 DISABLE_COMPILER_CACHE=1 NOOPT=1 python -m pytest -n=auto test/external/external_test_onnx_backend.py -k test_MaxPool3d_stride_padding_cpu
2024-02-11 04:26:09 -05:00
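
A hedged sketch of how a toggle like METAL_FAST_MATH plausibly reaches the Metal compiler through pyobjc's MTLCompileOptions; the actual wiring inside ops_metal may differ:

    from tinygrad.helpers import getenv

    def compile_options(Metal):  # Metal: the pyobjc Metal module (assumption)
        options = Metal.MTLCompileOptions.new()
        # fastMathEnabled is Metal's fast-math switch; default it to on
        options.setFastMathEnabled_(bool(getenv("METAL_FAST_MATH", 1)))
        return options
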
chenyu
c151131d1b update onnx tests that no longer fail on CI (#3353)
was debugging fast math and it turned out these tests pass on CI now. the earlier failures were more likely a CI bug
2024-02-08 21:19:00 -05:00
Francis Lam
2266152b28 linearizer: added FUZZ_BEAM to fuzz_linearizer and additional tests (#3340)
Fixed test_tensor_core_opts to test all the TCs.

Added commented out failing tests in test_color_shapes_with_local.
2024-02-08 16:12:58 +01:00
chenyu
30a3288c4a touchup canonicalize empty mask (#3308)
empty list -> None. also added env SEED for fuzz_shapetracker_math
2024-02-03 21:05:10 -05:00
chenyu
7816c3b692 onnx update for trilu and argmax (#3283)
* support 0 in shape for tril and triu

* select_last_index for ArgMax and ArgMin

* pass **kwargs
2024-01-30 18:39:16 -05:00
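
For context, tinygrad's Tensor.tril/Tensor.triu back the ONNX Trilu op; a small example, with the zero-size case being what this commit fixes:

    from tinygrad import Tensor

    x = Tensor.ones(3, 3)
    print(x.tril(0).numpy())                # lower triangle, diagonal included
    print(x.triu(1).numpy())                # strict upper triangle
    print(Tensor.ones(0, 3).tril(0).shape)  # (0, 3): 0 in shape now works
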
George Hotz
247a8a2a6c add canonicalization to View.create (#3280)
* Reapply "take merge views from corsix branch" (#3278)

This reverts commit d298916232.

* reintroduce merge views

* update second any

* isinstance -> not

* 25% less same but unequal
2024-01-30 10:26:48 -08:00
George Hotz
d8f6280ffb hotfix: add CHECK_NEQ to fuzz_shapetracker_math 2024-01-30 10:07:54 -08:00
George Hotz
c4d870db0d fix jit realize issue (#3258) 2024-01-26 18:27:35 -08:00
chenyu
bc92c4cc32 onnx Einsum, CumSum, DepthToSpace, SpaceToDepth (#3252)
* onnx Einsum, CumSum, DepthToSpace, SpaceToDepth

Einsum inner product and `...` are not supported

* --durations=20
2024-01-26 10:47:53 -05:00
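
A small example of the Tensor.einsum surface the ONNX Einsum op maps onto, keeping in mind the caveat above that inner-product and `...` formulas are unsupported:

    from tinygrad import Tensor

    a, b = Tensor.rand(2, 3), Tensor.rand(3, 4)
    out = Tensor.einsum("ij,jk->ik", a, b)  # matrix multiply via einsum
    print(out.shape)  # (2, 4)
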
chenyu
e45ffdb6cf cleanup onnx (#3249)
* add onnx test_reduce_log_sum_exp

* more reuse

* more

* stuff

* good CenterCropPad

* imports

* good ArrayFeatureExtractor

* pretty good Pad

* stuff

* stuff

* onnx.py

* Atan

* pass int8 test

* dtype related

* fastmath stuff

* Resize linear

* fix CI

* move back
2024-01-25 20:39:59 -05:00
nimlgen
f87ecbb0f3 fuzzer validates outputs + (partially) oob accesses (#3178)
* fuzzer validates outputs + (partially) oob accesses

* +random

* oob check only for compiled

* type cmp fixes

* fix zeroing

* no prints

* add seed
2024-01-19 13:34:51 -05:00
chenyu
1b508e0f71 fix fuzz_linearizer toCPU to as_buffer (#3158) 2024-01-17 13:18:46 -05:00
chenyu
537fb8b0b8 separate try except blocks in onnx2torch in model benchmark (#3126)
exceptions can be raised either by model conversion or by an individual backend failing. openpilot on torch mps works, but does not work with torch cpu.
separate the exception blocks so that the benchmark can include torch mps for openpilot.
2024-01-15 00:39:33 -05:00
George Hotz
ac3f246c11 cached size (#3060)
* cached size

* simplify simplify

* 0 doesn't have base

* fix test

* cleaner cache

* hmm, metal is flaky on this...might be real(ish) but useless as test

* short circuit reshape/expand properly

* better reshape bypass
2024-01-09 16:37:37 -08:00
George Hotz
2c6f2e899d No extra vars call (#3054)
* remove unused reciprocal

* comment

* remove unneeded call to vars

* free speedup
2024-01-09 09:52:58 -08:00
chenyu
19298e7a3f Device._buffers -> Device._devices (#3052)
backend devices used to be called buffers
2024-01-08 21:30:38 -05:00
George Hotz
2a2d3233d2 add test that the compiler isn't used (#3025)
* add test that the compiler isn't used

* one print_tree

* improve speed with st size cache

* switch to gpt-2
2024-01-05 17:24:01 -08:00
geohotstan
57817028bb removed redundant dtype hacks in onnx_ops (#2939)
* updated most dtype hacks in onnx_ops

* temporarily revert dequantizelinear change

* I think this is right...

* MORE FIXES WOOOO NEW DTYPE IS AWESOME

* ok

* oops missed a print

* half -> float32 for CI

* is npdtype

* some more

* fix if ordering

* more clean ups

* final cleanups

* casting to half not allowed

* k nvm

* revert ArgMax change

* only GPU

* llvm begone

* teeny tiny change

* fix: attempt to add cast tests

* try this

* fix dequantizelinear

* revert some stuff

* tests pass pls

* less lines in onnx_tests

* oops missed string tensor tests

* clean up

* try: revert default behavior changes

* fix: disabled Cast and Castlike tests

* docs: small changes

* fix: fixed isNaN op and enabled associated tests

* fix: forgot about float16

* done

* update disabled test

* gah missed another float16

* disable rest of failing tests

* rm extra line

* try...

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-01-04 01:45:24 -05:00
chenyu
58d3d5030b vars_from_ast -> LazyOp.vars (#2965) 2024-01-01 18:12:38 -05:00
George Hotz
a280cfe169 move dtypes to dtype.py (#2964)
* move dtypes to dtype.py

* fix urllib
2024-01-01 14:58:48 -08:00
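
A minimal before/after sketch of the import change this move implies:

    # before: from tinygrad.helpers import dtypes
    from tinygrad.dtype import dtypes  # after this commit

    print(dtypes.float32.itemsize)  # 4
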
George Hotz
c81ce9643d move globalcounters to ops (#2960)
* move globalcounters to ops

* missed a few

* sick of that failing
2024-01-01 14:21:02 -08:00
George Hotz
56f44bd10e move the compiler cache to be global (#2957)
* move the compiler cache to be global

* remove non robust test

* remove dead code
2024-01-01 10:59:56 -08:00
chenyu
1fb815e77e hotfix fix coder. RMSNorm cannot have float16 input (#2932)
* hotfix fix coder. RMSNorm cannot have float16 input

* update real world test due to new kernels

* more type casts
2023-12-25 02:28:11 -05:00
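
A hedged sketch of the kind of cast this fix implies: do the RMSNorm accumulation in float32 even when activations are half. The norm math below is the standard formulation, not code from the repo (import path per the later dtype.py move above):

    from tinygrad import Tensor
    from tinygrad.dtype import dtypes

    def rmsnorm(x: Tensor, eps: float = 1e-6) -> Tensor:
        xf = x.cast(dtypes.float32)  # float16 overflows in the square/mean
        out = xf * (xf.square().mean(axis=-1, keepdim=True) + eps).rsqrt()
        return out.cast(x.dtype)
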
chenyu
50927defad s/lazydata.realized/lazydata.base.realized/g (#2914)
* s/lazydata.realized/lazydata.base.realized/g

* not that
2023-12-22 14:45:13 -05:00
chenyu
50cfb1fb3a update onnx model links (#2908)
updated in https://github.com/onnx/models/pull/644
2023-12-22 00:19:41 -05:00
chenyu
1bbeb3fe2f remove the different rtol / atol for openpilot CUDA in benchmark (#2907)
not sure what the issue was but seems to be fixed on master
2023-12-21 22:23:39 -05:00
chenyu
5bf43c9634 reenable one onnx test failed due to dtype (#2902) 2023-12-21 15:50:02 -05:00
George Hotz
1765849937 new lazy, benchmark (#2878)
* lazy rewrite, try 2

* min fix tests

* pass contig test

* put broken pads back

* move that to realize

* no contig child fixes array packing

* so wrong

* now that's correct

* base children

* fix bind issues

* disable to_image_idx

* fix tests

* that failure shouldn't break other tests

* more fixes

* fix torch

* skip failing tests in CI

* 1e-7

* half is broken

* 1e-6 margin of error
2023-12-20 14:33:21 -08:00
George Hotz
ca59054463 fix shapetracker math (#2861)
* proper test

* all st math good now

* fix real_strides bug
2023-12-19 22:17:34 -08:00
chenyu
ad233d557f disable reshape merging with masks (#2858)
the fuzzer found a bug, and the masked-reshape merging isn't complete
2023-12-19 19:06:16 -05:00
geohotstan
fec8e9060c Add simple fancy indexing exceptions (#2706)
* fancy indexing raise error

* updated error message

* improved error check

* oops

* fixed onnx

* oops typo

* merge

* add full_flatten

* try

* merged and updated some tests

* more cleaning

* done

* temp fix onnx

* try

* add todo in onnx_test

* reword

* gah
2023-12-19 11:23:51 -05:00
George Hotz
b2192b5400 minor improvements (#2845) 2023-12-18 22:09:08 -08:00
George Hotz
07df14aa0e HIP cleanups (#2843)
* move everything to code_for_op to reason about it

* loop the loopable parts

* its not that unreadable

* these are loopable too

* nitpick

* tests p1 - replace these with the actual compiler running alu ops tests

* tests p2: compile test_dtype_alu in HIP!

+add to CI

* nobody liked test_renderer

* revert test_dtypes change

* isolated mockhip tests

* dont need the WHERE hack after #2782

+ruff

* bf16 is broken in HIP

job failed in: https://github.com/tinygrad/tinygrad/actions/runs/7232101987/job/19705951290?pr=2778#step:8:73

* picking this back up

* add compile tests for unary ops and binary ops

* MOD is only in ints

* CMPLT wont work after the dtypes pr is merged because it will always be bool

* test all combinations

* Update cstyle.py

* don't use vload

* no getenv

* set seed

---------

Co-authored-by: qazal <qazal.software@gmail.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2023-12-18 21:09:32 -08:00
George Hotz
80f53245e8 shapetracker add and invert (#2828)
* invert (broken)

* decent invert

* shapetracker invert works

* plus is meh, invert is good

* support invert mask

* a few more invert tests

* shapetracker math invert test
2023-12-18 16:03:27 -08:00
chenyu
0723f26c80 dtypes.default_float and dtypes.default_int (#2824) 2023-12-18 12:21:44 -05:00
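
A usage sketch of these defaults, which decide what dtype python floats and ints materialize as (import path per the later dtype.py move above):

    from tinygrad import Tensor
    from tinygrad.dtype import dtypes

    dtypes.default_float = dtypes.float16  # tensors from python floats become half
    print(Tensor([1.0, 2.0]).dtype)        # dtypes.float16
    dtypes.default_float = dtypes.float32  # restore the usual default
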
chenyu
157c0be509 cleanup onnx, pass one more reshape test and remove some casts (#2806) 2023-12-16 20:40:43 -05:00
chenyu
1bc378c3d6 _broadcasted handles the python number types (#2785)
* _broadcasted handles the python number types

* disable that test
2023-12-15 22:43:27 -05:00
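
In effect, python scalars on either side of an op are wrapped and broadcast internally; a small example:

    from tinygrad import Tensor

    t = Tensor([1.0, 2.0, 3.0])
    print((t + 3).numpy())    # python int is broadcast to the tensor's dtype
    print((2.5 * t).numpy())  # reflected ops take the same path
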