Commit Graph

145 Commits

qazal
81a71ae0f6 hotfix: skip test_exclude_const_metadata (#9208) 2025-02-22 23:26:04 +02:00
qazal
4578c3e8fd simpler tensor metadata mapping + tests [pr] (#9203)
* simpler tensor metadata mapping + tests [pr]

* remove kernel metadata

* don't map nones
2025-02-22 20:18:46 +01:00
chenyu
2e7c2780a9 CLANG -> CPU (#9189) 2025-02-20 18:03:09 -05:00
George Hotz
a4dab3ec3f add name uop (#9149)
* add name uop, TODO: refactor renderer to use

* renderer uses name uop

* fix tests

* render

* ptx
2025-02-18 15:26:58 +08:00
George Hotz
df3b320f46 rewriter -> devectorizer [pr] (#9147) 2025-02-18 12:42:08 +08:00
Josh Moore
44e0eab8fd Fix AttributeError occurring after ValueError in _apply_uop (#8905)
* Fix AttributeError occurring after ValueError in _apply_uop

* Update tensor.py

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-02-06 10:56:29 +08:00
chenyu
1f730ae8f8 remove retain_graph in Tensor.backward [pr] (#8835)
not used. gradient accumulation works directly
2025-01-31 13:41:26 -05:00
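Context for the removal: backward recomputes gradients from the graph on every call and adds into any existing .grad, so accumulation needs no retained graph. A minimal sketch of that pattern, assuming a tinygrad build at or after this commit:

```python
from tinygrad import Tensor

w = Tensor([1.0, 2.0], requires_grad=True)

# two separate backward calls; each adds into w.grad, no retain_graph needed
(w * 3).sum().backward()
(w * 5).sum().backward()
print(w.grad.tolist())  # [8.0, 8.0] -- 3 + 5 accumulated per element
```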
George Hotz
b4bf6a7dea switch backward to use gradient [pr] (#8235)
* switch backward to use gradient [pr]

* set device correctly, dedup

* why does that fail?

* add noop cast

* simple backward

* fix beautiful_mnist

* touchups

* set in compute_gradient

* uop_count

* uop_count was wrong

* collections

* no note

* skip that test

* update sched kernel counts

* train mnist is 65

* fix metadata and gc

* fixes

* materialize_grads

* no pathlib stuff

* add contiguous_backward, fix bugs

* add some realize

* fix multi
2025-01-26 09:12:16 +09:00
qazal
6cb74bb630 fix using clone with shrink [pr] (#8724)
* fix using clone with shrink [pr]

* remove extra arg, add test_clone_with_shrink_realized
2025-01-23 08:28:07 +02:00
George Hotz
98d01a059d rename uopgraph to rewriter [pr] (#8682) 2025-01-19 17:03:12 -08:00
mesozoic-egg
3506a7585f upcast overflowed idx to int64 [pr] (#8268)
* use full_shape to determine if index can potentially overflow

* update comment

* use shapetracker to check max index value

* wip

* lint

* handle mask

* upcast to int64 by st is noop on WGSL

* fix comments

* Handle negative overflow, intermediaries overflow, int64 support

handle negative overflow

handle symbolic

wip

handle intermediate values

wip

check if typemap support int64

lint

comment

* add invalid_dtype

lint

* Fix bug on checking mask overflow

wip

wip

* Add more tests, need to resolve partial upcast

test Valid_view_dup

test valid op overflow

refine test cases

clean up

cleanup

wip

refine tests

lint

* Upcast is handled by lower_load_store

upcast as graph_rewrite to backtrack

update test

wip

cleanup

wip

cleanup

do upcast in lower_load_store

lint

* cleanup

* do upcast within lower_load_store and mutate ctx

* do upcast in get_idx and view

revert

lint

* cleanup

* Upcast in vec, const

upcast to const

test case 3

upcast on vector

lint

* simplify idx with symbolic in case of fake overflow

test case4

test case 4

update test

* test case4 is only for metal

* try: upcast inside graph_rewrite instead of shapetracker

wip

* checking overflow can just be done directly on all views, with idxs

* cleanup

* REMOVE hard coded uop test for idx upcast

* refactor

cleanup

refactor

* do actual casting when necessary, instead of rewriting all idx

hard code uop test

new upcast

* check dtype for int64 in webgpu

* cleanup

cleanup

* cleanup

* update tests

cleanup

comment

cleanup

cleanup

* comment

* comment

* update comment

update comment

* refactor

* typo

* keep the scope to only upcasting

* white space

* Revert "white space"

This reverts commit 314d7eb184.

* Revert "keep the scope to only upcasting"

This reverts commit 1ef701dd85.

* sym folding is not necessary

lint1

* fold symbolic

lint

* use symbolic simple when folding shapetracker idx

* full sym folding is required after all...

* Ops.CAST should retain the src min max

* put rewrite to lowerer

wip

* start testing on higher level

wip

test higher level in test_tensor

* find Ops.STORE in list instead of recursively

* check dtype support when upcasting

* remove invalid_dtype

* lint

* fix int64 support checks in upcast

lint

* skipif skipunless

* revert fold to find test case

* Revert "revert fold to find test case"

This reverts commit 225bb6e801.

* test sym folding

* handle ptx

* wip

* wip

* delete hard coded uop test

* lint fixes

* wip

* fix checking for None

* lint

* handle ptx

* comment

* dtype for overflow()

* update skipIf skipUnless

* assert in wgsl renderer for int64

wip

* do folded_upcast in to_indexed_op, real_size uses views_to_indexed_ops

* assert in lowerer for dtype support

lint

* Revert "assert in lowerer for dtype support"

This reverts commit 8e9b1b79bf.

* assert dtype in kernel.py

* Revert "assert dtype in kernel.py"

This reverts commit e29b9a9893.

* wip

* assert in render

* remove old assert

* check dtype from renderer, assert in upcast

wip

* smaller arange for sym fold case

* linearize directly

* use expand directly

* lint

* lint

* rename

* no need to check dtype in device.py

* trigger pr

* remove dtype assert in upcast, make wgpu fail in render

* use DType for type hint instead of dtypes

* assert on KeyError in tests for webgpu backend int64

* use a tuple for src

* test real kernel run

wip

* lint error

* restore

* fix real_size

* update test example

* resolve merge stuff

---------

Co-authored-by: Mesozoic Egg <mesozoic.egg@proton.mail>
2025-01-17 11:52:31 -05:00
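The overflow this long-running PR guards against is easy to state: index expressions default to int32, so any view whose flattened index can leave the int32 range must be upcast to int64. A hedged sketch of the check, using a hypothetical needs_int64 helper rather than the actual tinygrad code:

```python
INT32_MAX, INT32_MIN = 2**31 - 1, -(2**31)

def needs_int64(shape: tuple[int, ...], strides: tuple[int, ...], offset: int = 0) -> bool:
  # extreme flattened indices a view can produce: offset + sum((dim-1) * stride),
  # split by stride sign to cover the negative-overflow case the PR mentions
  max_idx = offset + sum((d - 1) * s for d, s in zip(shape, strides) if s > 0)
  min_idx = offset + sum((d - 1) * s for d, s in zip(shape, strides) if s < 0)
  return max_idx > INT32_MAX or min_idx < INT32_MIN

# a contiguous (50000, 50000) tensor has 2.5e9 elements, past the int32 range
print(needs_int64((50000, 50000), (50000, 1)))  # True
```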
George Hotz
c85737c200 assert to prepare for grad uop [pr] (#8280)
* assert to prepare for grad uop [pr]

* fix test_nn

* fix most of test_tensor

* few more tests

* fix multi

* uniform gradient

* acc_dtype

* any for multi

* fix typing

* fix assert, CAST_BEFORE_VIEW is still the issue

* explicit test for CAST_BEFORE_VIEW

---------

Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2025-01-14 13:26:56 -08:00
chenyu
c4e33048c6 test Tensor.clone has a different lazydata [pr] (#8600) 2025-01-13 20:13:44 -05:00
qazal
866dfa1f23 create_schedule([x.lazydata]) -> x.schedule() in tests (#8449) 2024-12-31 03:15:52 +08:00
George Hotz
bd9c015b09 tests from grad uop path [pr] (#8313) 2024-12-18 09:25:05 -08:00
qazal
d05e21cb69 replace lazy srcs with the new uop api [pr] (#8255)
* buf_uop_view function

* srcs shouldn't exist

* fix TestTensorMetadata

---------

Co-authored-by: George Hotz <geohot@gmail.com>
2024-12-15 17:09:54 +08:00
George Hotz
734f2c5344 compute gradient [pr] (#8237)
* compute gradient [pr]

* schedule_step_with_grads

* second deriv works
2024-12-13 20:46:01 -08:00
chenyu
40a4c603b9 remove more test skip for webgpu [pr] (#8192) 2024-12-12 14:06:35 -05:00
chenyu
72ff631f8d remove unreachable tensor dtype assert (#8190)
it would have failed in `to_dtype`. added some tests for it too
2024-12-12 13:04:49 -05:00
George Hotz
c8e7707a7e hotfix: disable flaky move tensor test 2024-12-10 17:11:21 -08:00
Ahmed Harmouche
a8cfdc70ed Run more webgpu tests (#8142) 2024-12-10 23:20:04 +01:00
chenyu
a77ee72d11 clean up reshape size check [pr] (#8067)
removed a resolve, and removed the special case for the 0-size assert since it's covered by the generic size check
2024-12-06 07:51:19 -05:00
chenyu
3d82f8e340 simpler rand_like (#7680) 2024-11-13 12:28:41 -05:00
George Hotz
205befa788 move is_dtype_supported to device [pr] (#7575) 2024-11-07 20:38:03 +08:00
Tobias Fischer
1a9e145388 Tensor Clone Function (#7154)
* implemented clone function

* cleanup linting, single func

* added tests, cleaned up grad cloning

* fixed whitespace
2024-11-01 12:24:43 +08:00
qazal
7149eabb34 assert set equality in TestTensorMetadata [pr] (#7364) 2024-10-29 19:29:29 +08:00
qazal
0ebdb136e8 revert metadata with graph_rewrite (#7353) (#7362)
This reverts commit 540e4179e7.
2024-10-29 19:16:31 +08:00
qazal
540e4179e7 global UOp to Metadata mapping + inverse DEBUG=2 metadata order [pr] (#7353)
* add ctx.buf_metadata [pr]

* revert metadata insertion order

* lint rename
2024-10-29 17:12:00 +08:00
qazal
e46edc22aa use unittest helpers in TestTensorMetadata [pr] (#7329)
* use unittest helpers in TestTensorMetadata [pr]

* fix that

* 5 args
2024-10-28 18:38:30 +08:00
Bhavya Gada
534597e753 fix all test warnings (#7024)
* fix pytorch warning in nn.conv2d for same padding

* fix future warning in torch load

* fix overflow warning in tensor list test: https://github.com/numpy/numpy/issues/23606#issuecomment-1512752172

* fix floating point warnings in dtype tests using docs https://numpy.org/doc/stable/reference/generated/numpy.errstate.html and a neat solution https://stackoverflow.com/questions/53634965/change-np-seterr-behavior-inside-a-function-only

* put err state in one place; comment taken care of by function hover

* enter np errstate context manager on test setup

* put decorator on class
2024-10-18 08:56:40 +08:00
nimlgen
3c56aeee70 add Tensor.from_blob (#6765)
* draft tensor from pointer init

* some docs and types

* comment

* cleaner

* test

* malloc

* qcom cl interop

* jit example

* cleaner

* dealloc

* wording

* docs
2024-09-26 18:33:19 +08:00
David González Martínez
724e408736 add support for retain_graph in backward (#6145)
* add support for retain_graph in backward

* fix: dont accumulate grad on non-leaf tensors

* fix order

* fix: do not delete grad on leafs

* fix linter

* fix: can't exactly match torch behaviour internally

* allow numerical room for test

* refactor
2024-08-18 16:08:31 -07:00
George Hotz
17a043edad tensor inference (#6156)
* tensor inference

* test is even better name
2024-08-18 00:19:28 -07:00
Jun Zhang
54e176fb4f Ignore non-computational backends when overwriting the default (#5770) 2024-08-10 09:23:29 -07:00
qazal
e6d41b0ce7 hotfix: adjust test_backward_pass_diamond_model thresholds (#5981) 2024-08-09 00:20:53 +08:00
David González Martínez
0f09b94c43 add failing test for second order derivatives (#5772)
* add failing test

* fix lint

* fix bad merge

* fix again

* fix test

* more minimal
2024-08-01 02:34:47 -07:00
David González Martínez
d0fd84e617 feat: allow passing gradient to .backward() to compute vjp (#5771)
* feat: allow passing gradient to .backward() to compute vjp

* fix

* refactor

* fix trailing whitespace
2024-07-28 11:13:18 -07:00
chenyu
e41ab66653 use is to compare types (#5476)
new rule in latest ruff
2024-07-14 14:26:41 -04:00
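The rule in question is E721, which flags equality comparisons on types; identity is the idiomatic exact check:

```python
x = 3

# flagged by ruff (E721): type(x) == int
# preferred exact-type check:
if type(x) is int:
  print("x is exactly an int")
```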
wozeparrot
9150a6be7a tensor metadata (#5271) 2024-07-08 17:45:40 -07:00
chenyu
cc2be9064f fix out of bound python list into numpy array (#5043)
numpy 2.0 does not allow out-of-bound python consts and recommends writing them as `np.array(value).astype(dtype)`
2024-06-18 18:05:21 -04:00
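Concretely, under numpy 2.0 the two spellings diverge; the recommended astype form wraps instead of raising:

```python
import numpy as np

print(np.array(300).astype(np.uint8))  # 44, since 300 % 256 == 44
# np.array(300, dtype=np.uint8)        # OverflowError under numpy 2.0
```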
chenyu
2b2488f2e2 revert creating Tensor from a list without numpy (#5041)
the change was incomplete and broke creating a Tensor from a list of np arrays
2024-06-18 17:31:22 -04:00
chenyu
acaf9a490d RECIP(-0.0) should be -inf (#5024)
* RECIP(-0.0) should be -inf

added test_dtype_alu for PYTHON backend

* catch that

* fix those two
2024-06-17 22:26:58 -04:00
chenyu
03b367c014 handle float16 overflow in PYTHON (#5022)
* handle float16 overflow in PYTHON

use `truncate` when constructing tensor from list to make sure all values are packable (might be slow, but should be correct). add truncate_fp16 to cast overflowed values to inf/-inf.

* all valid fmt supports truncate
2024-06-17 21:12:52 -04:00
chenyu
64cda3c481 raise TypeError calling len() on a 0-d tensor (#4970)
matched numpy and torch
2024-06-14 16:34:27 -04:00
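The behavior being matched:

```python
import numpy as np

try:
  len(np.array(5))  # 0-d array
except TypeError as e:
  print(e)  # "len() of unsized object"; len(Tensor(5)) now raises the same way
```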
chenyu
67e8df4969 remove numpy from dtype (#4969)
replaced all dtype.np with _to_np_dtype defined in tensor.py.

after this, the only numpy usages are (1) Tensor(np.ndarray), (2) construct .numpy() output, (3) numpy random buffer
2024-06-14 15:38:45 -04:00
chenyu
dae1c8abe2 create Tensor from bytes without numpy (#4964) 2024-06-14 13:37:27 -04:00
chenyu
287d3c3b84 support list, tuple input in dtypes.from_py (#4945)
* support list, tuple input in dtypes.from_py

and used it to infer dtype from python list and tuple in Tensor constructor.

* fix tests
2024-06-13 13:38:06 -04:00
chenyu
7aecea4f56 support creating Tensor from python tuple (#4944)
added a small fuzzer to test that data with mixed tuples and lists of numbers matches numpy
2024-06-13 12:18:37 -04:00
chenyu
45083ccb43 canonicalize 0 in shape in View.create (#4815)
set strides to 0, offset to 0, mask to None, and contiguous to True for size-0 views.
2024-06-03 13:37:37 -04:00
chenyu
8942230b1f minor cleanups of test_tensor and extend some cases (#4794) 2024-05-31 10:43:22 -04:00