Commit Graph

623 Commits

geohotstan
cea5853cfa add Tensor.scatter (#7737)
* working I think

* where are my onnx scatter tests??

* forward_only for now

* try if nan hack fix NV

* looks like issue is different... CUDA WHY

* oops that was wrong. Try if this fixes CUDA

* simpler multiply

* actually finish this up tmrw morning :x

* fix tests?

* improve tests

* improve test and implementation

* fix ruff

* complete but lots of expected failure...

* reviewed tests

* add onnx tests

* is this a processing op?

* add return type to indicate that it's not in-place

* final cleanups

* use or and improve tests a little

* add masked_index_select

* call it masked_setitem instead

* try

* FIXED

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-27 10:52:04 -05:00
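
A minimal NumPy sketch of the out-of-place scatter semantics this commit adds (the return-type note above confirms it is not in-place); the torch-style (dim, index, src) argument order is an assumption, not taken from the diff:

```python
import numpy as np

def scatter_ref(x, dim, index, src):
    # out-of-place reference: copy x, then write src[idx] to the position
    # whose coordinate along `dim` is replaced by index[idx] (torch-style, assumed)
    out = x.copy()
    for idx in np.ndindex(index.shape):
        pos = list(idx)
        pos[dim] = index[idx]
        out[tuple(pos)] = src[idx]
    return out

x = np.zeros((3, 5), dtype=np.int32)
index = np.array([[0, 1, 2, 0, 0], [2, 0, 0, 1, 2]])
src = np.arange(1, 11, dtype=np.int32).reshape(2, 5)
print(scatter_ref(x, 0, index, src))
# [[ 1  7  8  4  5]
#  [ 0  2  0  9  0]
#  [ 6  0  3  0 10]]
```
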
geohotstan
753f07e193 add circular pad mode to Tensor.pad (#7918)
* start

* send it

* no more neg circular pads

* quick fix onnx too

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-27 10:30:51 -05:00
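
Circular padding wraps values around the tensor edge (the mode NumPy's np.pad calls "wrap"), and the note above implies negative circular pads are rejected. A quick sketch of the expected values; that the tinygrad spelling is Tensor.pad(..., mode="circular") is taken from the PR title only:

```python
import numpy as np

x = np.arange(1, 7).reshape(2, 3)          # [[1 2 3], [4 5 6]]
# circular padding wraps around the edges (np.pad calls this mode "wrap")
padded = np.pad(x, ((0, 0), (2, 1)), mode="wrap")
print(padded)
# [[2 3 1 2 3 1]
#  [5 6 4 5 6 4]]
```
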
Ahmed Harmouche
10618aba98 Bring back WebGPU (#7063)
* Start from andredaprato:webgpu-clean

* Fix infs

* inf wgsl function is not needed

* Emulated ulong for threefry, more tests passing

* Randomness tests passing

* Update model export to support new changes in webgpu, efficientnet export works again

* Simplify shift emulation in wgsl

* Delete test file

* Fix bigger than u32 u32 literal

* Why was skip copies added here?

* Python3.12 for webgpu tests

* Fix model export syntax error

* Get test ops passing with some skips

* Fix lint

* Much simpler shift

* Run more tests

* Timestamp queries are not supported in CI, so skip search tests

* All fancy indexing passing

* r is ctx

* Run more dtype tests by using is_dtype_supported

* Cleanup ulong shift rendering

* UPat -> Pat, UOps -> Ops

* Pat -> UPat

* Refactor render_ushift if-else

* Pattern to avoid ulong mul

* Remove vals_dtype

* is_nan trick + rewrite, test_isnan passing

* Rewrite a * select(1, nan, gate) -> select(a, nan, gate)

* No arg, just op

* Support char, uchar, short, ushort

* Run test_index_mnist now that we have uint8

* Fix pylint

* Save 3 lines by using base Compiler

* No more long emulation

* Remove fixup_binops

* No more external_local_bufx wgsl specific cstyle modif, use base extra_pm

* Simpler, faster copyin/out

* Skip some new tests that use long

* Fix typo

* copyout touchup

* Save lines by using render_cast

* WebGL is not supported in core, delete it from is_dtype_supported

* More narrow test skips for some unary tests

* TernaryOps, UnaryOps -> Ops

* TinyGrad supports WebGPU

* StableDiffusion demo: f16tof32 gpu is a lib, update UI

* Packed load/store, no more scale_size, no core tinygrad changes

* Rename copyin, copyout

* Device -> dev

* Fix lint

* Pattern matcher rule for packed load/store

* Refactor

* Shorter packed load/store

* this should fix lint

* Fix mypy

* SD compile script working

* New SD webgpu UI

* New default prompt

* New SD weights

* Fix title when webgpu not available

* Run symbolic tests, simplify is_nan, use round_up

* Show step time on UI

* Bump minimum wgpu version to v0.19

* Fix latent

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-11-26 12:26:40 +08:00
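
Since WGSL has no native 8/16-bit storage types, the char/uchar/short/ushort support above presumably packs small elements into 32-bit words, and the packed load/store pattern selects a byte lane with shifts and masks. A rough Python sketch of that general idea, not the actual WGSL the renderer emits:

```python
# Packing 4 uint8 elements per 32-bit word (little-endian lanes),
# illustrating the general "packed load/store" idea.
def store_u8(words, i, val):
    shift = 8 * (i % 4)
    w = words[i // 4] & ~(0xFF << shift)         # clear the byte lane
    words[i // 4] = w | ((val & 0xFF) << shift)  # write the new byte

def load_u8(words, i):
    return (words[i // 4] >> (8 * (i % 4))) & 0xFF

words = [0] * 2                 # backing buffer of u32 words
for i, v in enumerate([10, 20, 30, 40, 250]):
    store_u8(words, i, v)
print([load_u8(words, i) for i in range(5)])   # [10, 20, 30, 40, 250]
```
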
chenyu
3b26e51fce Tensor.cummax (#7854)
generalized the existing cumsum to take Ops.MAX in addition to Ops.ADD
2024-11-22 15:55:02 -05:00
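
cummax is the running maximum along an axis, the Ops.MAX analogue of cumsum's Ops.ADD; a quick NumPy illustration of the expected values (tinygrad's axis argument is assumed to mirror cumsum):

```python
import numpy as np

x = np.array([1, 3, 2, 5, 4])
print(np.cumsum(x))              # [ 1  4  6 11 15]  running sum (Ops.ADD)
print(np.maximum.accumulate(x))  # [ 1  3  3  5  5]  running max (Ops.MAX)
```
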
geohotstan
cf1ec90ad4 add inverse trig functions to Tensor (#7805)
* implement inverse trig functions

* guess we should still test nans?

* magnitude as variable name :D

* reorder onnx_ops ops

* approximation -> x for consistency

* address feedback

* simpler acos

* improvement?

* actually just have asin depend on atan

* actually this is nicer

* remove a comment

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-21 09:13:36 -05:00
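
The commit above builds acos and asin on top of atan; the standard identities are asin(x) = atan(x / sqrt(1 - x^2)) for |x| < 1 and acos(x) = pi/2 - asin(x). A small sanity check against math.asin/math.acos; edge cases at |x| = 1 and NaN handling are not covered here and may differ from the PR:

```python
import math

def asin_via_atan(x):
    # asin(x) = atan(x / sqrt(1 - x^2)) for |x| < 1
    return math.atan(x / math.sqrt(1.0 - x * x))

def acos_via_asin(x):
    return math.pi / 2 - asin_via_atan(x)

for v in (-0.9, -0.5, 0.0, 0.3, 0.99):
    assert abs(asin_via_atan(v) - math.asin(v)) < 1e-12
    assert abs(acos_via_asin(v) - math.acos(v)) < 1e-12
```
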
geohotstan
66a069ee25 add replicate mode to Tensor.pad (#7802)
* base implementation

* add tests

* actually remove the assertionerror test

* good
2024-11-20 08:39:58 -05:00
geohotstan
8100109c9d Add replicate mode to Tensor.pad (#7608)
* base implementation

* add tests

* actually remove the assertionerror test

* actually only have reflect for this pr

* change the 4 if-else one liner

* maybe use a lambda

* fix

* maybe a lil cleaner

* fix tests

* complete

* small change

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-18 10:55:38 -05:00
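
The two pad-mode commits above add replicate and reflect padding; a NumPy sketch of what those modes produce (np.pad calls replicate "edge"), with tinygrad's mode strings assumed from the PR titles:

```python
import numpy as np

x = np.array([1, 2, 3, 4])
print(np.pad(x, (2, 2), mode="edge"))     # replicate: [1 1 1 2 3 4 4 4]
print(np.pad(x, (2, 2), mode="reflect"))  # reflect:   [3 2 1 2 3 4 3 2]
```
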
chenyu
df817297b6 fix passing acc_dtype="" to Tensor.prod should fail (#7750)
similar to sum
2024-11-17 11:38:13 -05:00
chenyu
55707fd00d fix passing sum_acc_dtype="" to Tensor.sum should fail (#7748) 2024-11-17 10:58:41 -05:00
chenyu
a15a900415 fix Tensor.meshgrid for 1D input and check indexing (#7740) 2024-11-16 23:39:30 -05:00
geohotstan
72a41095bc add Tensor.meshgrid (#7714)
* initial implementation and test

* some other places that can use meshgrid

* revert the onnx_ops change

* add to docs

* revert interpolate too

* update

* improve edge case test

* might as well test grad

* add to test can improve docs

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-16 23:06:47 -05:00
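
meshgrid expands 1D coordinate tensors into full coordinate grids; the indexing check in the follow-up commit above refers to the "ij"/"xy" argument. A NumPy sketch of the expected shapes, with tinygrad's keyword assumed to mirror numpy/torch:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20])
X, Y = np.meshgrid(x, y, indexing="ij")   # matrix ("ij") indexing
print(X.shape, Y.shape)                   # (3, 2) (3, 2)
print(X[2, 1], Y[2, 1])                   # 3 20  -> the (x=3, y=20) grid point
```
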
chenyu
f1efd84c92 fix repeat_interleave with negative dim (#7734) 2024-11-16 10:15:29 -05:00
chenyu
22da31b223 clean up Tensor.dot (#7728)
more docs (similar to numpy) and removed many confusing `-min(n2, 2)` expressions
2024-11-15 18:21:15 -05:00
chenyu
4338c450ac fix max_pool2d for int tensor with padding (#7726)
padding with inf messed up the output dtype
2024-11-15 16:22:11 -05:00
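
An illustration of the dtype issue described above: padding a max-pool input with float -inf silently promotes an integer tensor to float, whereas padding with the dtype's own minimum keeps it intact. This sketches the bug, not the exact tinygrad fix:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.int32)
# padding with float -inf upcasts the whole tensor to float64
bad = np.concatenate([np.full((2, 1), -np.inf), x], axis=1)
# padding with the dtype's minimum keeps the integer dtype
good = np.concatenate([np.full((2, 1), np.iinfo(x.dtype).min, dtype=x.dtype), x], axis=1)
print(bad.dtype, good.dtype)   # float64 int32
```
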
chenyu
9fb396f660 test_ops maxpool2d -> max_pool2d (#7696)
and avgpool2d -> avg_pool2d for easier grepping of the tests
2024-11-14 10:39:12 -05:00
geohotstan
f8056a74d6 combine pad2d with pad (#7677)
* I have pad2d, I have pad, uuh~, pad2dpad~

* fix some small things

* strategically placed cast hack

* fix more

* fix more more

* tests

* periods
2024-11-14 17:56:02 +08:00
chenyu
333f5f9f8b Tensor.bitwise_not (#7688)
implemented with xor in Tensor for now to avoid adding another op. Also used it in Tensor.min to fix the int dtype case at -2**31
2024-11-13 16:31:52 -05:00
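
For two's-complement integers ~x == x ^ -1, and since ~x == -x - 1, min can be computed as ~max(~x), which avoids negating -2**31 (an overflow in int32). A NumPy check of both identities; that Tensor.min uses exactly this form is inferred from the commit message:

```python
import numpy as np

x = np.array([-2**31, -1, 0, 7], dtype=np.int32)
assert np.array_equal(~x, x ^ np.int32(-1))   # bitwise_not via xor with -1
# min via the identity min(x) == ~max(~x): no negation of -2**31 needed
assert ~(~x).max() == x.min() == -2**31
b = np.array([True, False])
assert np.array_equal(~b, b ^ True)           # bool: xor with True
```
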
chenyu
fb933b79a6 add test case for nll_loss with input > 2D (#7685)
* failed test case for nll_loss with input > 2D

* fixed

* add more
2024-11-13 14:34:07 -05:00
geohotstan
9c41c376d3 add Tensor.nll_loss (#7683)
* move nll_loss to new branch

* make nll_loss examples practical

* self *is*

* add to docs

* small
2024-11-13 13:12:13 -05:00
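
A reference for what nll_loss computes: take the log-probability of the target class for each sample, negate it, and reduce. NumPy sketch; tinygrad's argument names and defaults are assumed to follow torch:

```python
import numpy as np

def nll_loss_ref(log_probs, target, reduction="mean"):
    # log_probs: (N, C) log-probabilities, target: (N,) class indices
    losses = -log_probs[np.arange(len(target)), target]
    return losses.mean() if reduction == "mean" else losses.sum()

log_probs = np.log(np.array([[0.7, 0.2, 0.1],
                             [0.1, 0.8, 0.1]]))
target = np.array([0, 1])
print(nll_loss_ref(log_probs, target))   # ~0.2899 = -(log 0.7 + log 0.8)/2
```
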
chenyu
3c6fe4b79a fix Tensor.bitwise_and and Tensor.bitwise_or to support bool (#7684) 2024-11-13 13:10:39 -05:00
James
d4e4a084a1 fix: Tensor min function for unsigned ints (#7675)
* add failing tests for uint8 `min()`

* fix unsigned data type min()

* fix test data

* fix whitespace

---------

Co-authored-by: rezaarezvan <reza@rezvan.xyz>
Co-authored-by: Jamesb <experimentallearning0@gmail.com>
2024-11-13 11:04:27 -05:00
Reza Rezvan
23363dee55 Add: failing tests for uint8 min() (#7669)
* add failing tests for uint8 `min()`

* mark as expected failure
2024-11-13 22:12:53 +08:00
chenyu
c06a5a9c72 Tensor.linspace raises for dtype.bool (#7649)
also fixed an assert when passing str dtype to randint
2024-11-11 23:05:14 -05:00
geohotstan
5eef59d732 add Tensor.linspace (#7609)
* add linspace

* shave off tests and forgot to add to docs crap

* WHOOPS

* better tests
2024-11-12 10:29:36 +08:00
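
linspace(start, stop, steps) yields `steps` evenly spaced values including both endpoints, i.e. start + i*(stop-start)/(steps-1); a short pure-Python reference checked against numpy (tinygrad's exact signature and dtype handling are assumed):

```python
import numpy as np

def linspace_ref(start, stop, steps):
    if steps == 1: return [start]
    step = (stop - start) / (steps - 1)
    return [start + i * step for i in range(steps)]

print(linspace_ref(0.0, 1.0, 5))          # [0.0, 0.25, 0.5, 0.75, 1.0]
assert np.allclose(linspace_ref(-2, 3, 6), np.linspace(-2, 3, 6))
```
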
George Hotz
745316493c hotfix: add test_simple_conv2d_bias 2024-11-10 18:36:42 +08:00
George Hotz
205befa788 move is_dtype_supported to device [pr] (#7575) 2024-11-07 20:38:03 +08:00
geohotstan
934fb73994 fix test_schedule conv2d bug (#7549)
* tests tests tests

* slap a resolve on it

* fix comment
2024-11-05 09:07:25 -05:00
Ahmed Harmouche
36488a2a43 Use is_dtype_supported in more places in tests (#7529) 2024-11-04 09:21:15 -05:00
geohotstan
b1866cbfd9 failure test case for pool ops (#7483)
* add failure test case

* minimum case
2024-11-02 12:13:38 -04:00
geohotstan
585f3a0f24 Add isinf and isnan ops to Tensor (#7484)
* move isinf and isnan to new branch

* sneak a roll documentation fix in

* add to docs

* update test coverage for detect_positive and detect_negative

* add types to isinf args
2024-11-02 12:12:52 -04:00
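
The detect_positive/detect_negative names come from the commit message above (presumably mirroring the onnx IsInf attributes); a NumPy sketch of the expected semantics, using the x != x trick for NaN:

```python
import numpy as np

def isnan_ref(x):
    return x != x                      # NaN is the only value not equal to itself

def isinf_ref(x, detect_positive=True, detect_negative=True):
    return ((x == np.inf) & detect_positive) | ((x == -np.inf) & detect_negative)

x = np.array([1.0, np.inf, -np.inf, np.nan])
print(isnan_ref(x))                              # [False False False  True]
print(isinf_ref(x))                              # [False  True  True False]
print(isinf_ref(x, detect_negative=False))       # [False  True False False]
```
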
geohotstan
6513690223 Add Tensor.hardsigmoid (#7433)
* move hardsigmoid to new branch

* add to test

* add NOTE to mention differing values for alpha and beta that match torch

* shift from relu6

* correct shift implementation

* or we just use relu? no more 666
2024-11-01 08:36:52 -04:00
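
torch's hardsigmoid is relu6(x + 3) / 6, which equals clip(x/6 + 0.5, 0, 1); the NOTE about alpha and beta above suggests a clip(alpha*x + beta, 0, 1) parameterization with torch-matching defaults alpha=1/6, beta=0.5 (that parameterization is an assumption). A quick sketch:

```python
import numpy as np

def hardsigmoid_ref(x, alpha=1/6, beta=0.5):
    # torch's hardsigmoid: relu6(x + 3) / 6 == clip(x/6 + 0.5, 0, 1)
    return np.clip(alpha * x + beta, 0.0, 1.0)

x = np.array([-4.0, -3.0, 0.0, 3.0, 4.0])
print(hardsigmoid_ref(x))   # [0.  0.  0.5 1.  1. ]
```
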
chenyu
fb694a63eb Tensor.erf (#7419)
the same approximation used in onnx and in bert.
2024-10-30 18:12:28 -04:00
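
For reference, one common polynomial approximation of erf (Abramowitz & Stegun 7.1.26, max absolute error about 1.5e-7) is sketched below and checked against math.erf; this is not necessarily the exact formula the commit refers to:

```python
import math

def erf_approx(x):
    # Abramowitz & Stegun 7.1.26: erf(x) ~ 1 - P(t) * exp(-x*x), t = 1/(1 + p*x)
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    t = 1.0 / (1.0 + 0.3275911 * x)
    poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
             - 0.284496736) * t + 0.254829592) * t
    return sign * (1.0 - poly * math.exp(-x * x))

for v in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert abs(erf_approx(v) - math.erf(v)) < 2e-7
```
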
George Hotz
f3bd5cbf78 simplest migration of indexing [pr] (#7402)
* simplest migration of indexing [pr]

* fix locals/barrier
2024-10-30 20:58:18 +08:00
chenyu
f389e1a8a0 test more special values for sin/cos/tan [pr] (#7386) 2024-10-29 21:13:37 -04:00
George Hotz
3989bd2682 idiv + reciprocal [pr] (#7354)
* idiv + reciprocal

* remove upcast from div

* fix docs
2024-10-29 15:54:19 +08:00
George Hotz
d9d4dd6756 faster ci [pr] (#7348) 2024-10-29 14:01:44 +08:00
chenyu
0843734927 clean up nan handling in transcendental (#7332)
* clean up nan handling in transcendental

* skip remu crash
2024-10-28 16:21:49 -04:00
chenyu
cb5702f170 tiny cleanup to transcendental xexp2 (#7326)
also added test for exp and log of nan and inf
2024-10-27 21:54:20 -04:00
George Hotz
3c31497f55 instant isn't actually used [pr] (#7299)
* instant isn't actually used [pr]

* tolerance bump
2024-10-25 21:01:29 +08:00
chenyu
13575f080a remove bitcast backward in function.py (#7031)
bitcast has no backward pass
2024-10-13 10:08:27 -04:00
Markiian Novosad
8831c691e2 Add slice parameter type checking to disallow Tensor usage for slices (#6967)
* add support for single el tensors for slices

* rm trailing spaces

* cleanup long lines

* remove tensor in slice support, add comprehensive err msg

* cleanup getitem, add slice type check

* Edit err message
2024-10-11 16:20:21 -04:00
chenyu
e4c0743188 failed example for logcumsumexp (#6936)
need cummax for numerical stability
2024-10-07 10:55:45 -04:00
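
Why cummax matters here: the naive logcumsumexp(x)_i = log(sum_{j<=i} exp(x_j)) overflows as soon as any x_j is large, while the stable form shifts each prefix by its running maximum before exponentiating. A NumPy reference:

```python
import numpy as np

def logcumsumexp_ref(x):
    # stable: for each prefix, subtract its max before exponentiating
    out = np.empty_like(x, dtype=np.float64)
    for i in range(len(x)):
        m = x[:i + 1].max()                      # cummax of the prefix
        out[i] = m + np.log(np.exp(x[:i + 1] - m).sum())
    return out

x = np.array([1000.0, 1001.0, 999.0])
print(np.log(np.cumsum(np.exp(x))))   # naive: [inf inf inf] (overflow)
print(logcumsumexp_ref(x))            # ~[1000. 1001.3133 1001.4076]
```
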
jeffzh4ng
19a7e41113 implement logcumsumexp (#6921)
* implement logcumsumexp

* change axis=None to axis=0
2024-10-06 10:45:36 -04:00
George Hotz
c178dc1071 faster uops ci [run_process_replay] (#6774) 2024-09-26 20:15:01 +08:00
George Hotz
e945fa9c5c put local on the PtrDtype [run_process_replay] (#6656)
* put local on the PtrDtype [run_process_replay]

* those are local too
2024-09-23 10:29:17 +08:00
Gaétan Lepage
f214bb140d test: relax tolerance of test_broadcastdot (#6560) 2024-09-17 03:26:39 -04:00
chenyu
b2c286f567 fix typing for test_ops (#6520)
mostly passed TYPED=1 python3 -m pytest -n=auto test/test_ops.py.

One last test specifically sets an invalid value to test the exception; ignoring that would require importing typeguard, and getting a working version of typeguard would require dropping the dependency on tensorflow_addons, because it requires a very old version of typeguard
2024-09-15 06:18:36 -04:00
chenyu
7df4373fd9 tensor reduction touchup (#6402)
- fix spacing
- use get_args to get valid Literal values and raise ValueError to match, and a test for that
- use `Y` to be consistent
2024-09-08 03:55:51 -04:00
Irakli Salia
2e01efc35f tensor roll (#6375)
* tensor roll function and tests

* fix type annotations

* reduce line count

* more readable
2024-09-07 05:14:28 +08:00
Tim Becker
dfb818788e Support reduction parameter in more loss functions (#6302) 2024-09-07 05:11:20 +08:00