Commit Graph

4618 Commits

Author / SHA1 / Message / Date
chenyu
c814de2dd4 fix bitwise_not for signed int (#8117)
-1 is correct because 2**32-1 is not within int32 range, so in some cases clang casts the whole thing to uint32
2024-12-09 02:02:51 -05:00
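
A quick illustration of the rationale above (a numpy sketch, not tinygrad's renderer): for signed int32, bitwise NOT is XOR with -1, while a 2**32-1 mask does not fit in int32 and can force the whole expression to uint32 in C-like backends.

    import numpy as np

    x = np.array([0, 1, -5], dtype=np.int32)
    # XOR with -1 flips every bit and keeps the result int32; 2**32-1 does
    # not fit in int32, so using it as the mask can promote to uint32.
    assert (~x == (x ^ np.int32(-1))).all()
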
ttomsa
e22d7b6fb0 fix var vmax inside special (#8116) 2024-12-09 01:16:08 -05:00
qazal
5dd61035f7 revert VALID early folding for now (#8114)
This reverts commit 4074f52317.
2024-12-09 00:34:24 +08:00
qazal
69e48da961 set NOOPT in test_avg_pool3d_failure (#8112)
* set NOOPT=0 in test_avg_pool3d_failure

* noopt should still pass
2024-12-08 10:48:29 -05:00
geohotstan
f8294b3bda add avg pool 3d failure test (#8105)
* add test

* try simplify test case

* add TODO comment
2024-12-07 16:34:38 -05:00
qazal
6be388be86 failing test for const folding breaking indexing [pr] (#8103) 2024-12-07 19:55:02 +08:00
qazal
4074f52317 VALID early folding (#8100)
* fold valid

* :)

* fix test_verify_ast

* keep symbolic working
2024-12-07 18:37:47 +08:00
qazal
07b6d5cf63 assign early folding (#8093)
* assign early folding [pr]

* move to to_si

* -

* fix generate_dataset

* diff too big

* no recreation, no diff

* gzip

* new sops from tiny10

* final try
2024-12-07 17:02:55 +08:00
chenyu
2d321646b8 default tensors to int32 in test_ops (#8097)
torch defaults to int64 but we care more about int32 anyway. remove tests that were skipped because int64 is not supported
2024-12-06 20:33:36 -05:00
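
For context on the dtype default above (a sketch of the idea, not the actual test helper): seeding the test inputs from int32 numpy arrays keeps torch from silently upcasting to its int64 default.

    import numpy as np
    import torch

    # torch.tensor([1, 2, 3]) would default to int64; building from an int32
    # numpy array keeps both frameworks on the dtype tinygrad targets.
    a = np.array([1, 2, 3], dtype=np.int32)
    t = torch.tensor(a)
    assert t.dtype == torch.int32
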
chenyu
564b3a3e1b onnx Bitwise ops (#8095)
free stuff!
2024-12-06 16:58:09 -05:00
qazal
a97b8fa3c5 maskless const can lower without valid, p1 [pr] (#8094) 2024-12-06 23:21:19 +02:00
chenyu
d000c08f04 fix return type of Tensor.pow (#8091)
int to the power of int should return int, etc. it hints that we would like to have Ops.POW
2024-12-06 13:38:29 -05:00
geohotstan
5184410fc3 combine get inputs and type_parse function in onnx [fixed] (#8081)
* 1 is simpler than 2

* variable name

* change error wording

* shapes for sequence type must be homogeneous

* bug fix for model benchmark

* fix comments too

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-06 12:34:47 -05:00
qazal
df84dc6444 unrelated test fixups from delete_lazy [pr] (#8088)
* unrelated test fixups from delete_lazy [pr]

* fine if it's scheduled later
2024-12-06 17:31:02 +02:00
geohotstan
0b7c44677d Fix uint8 cast underflow (#6305)
* hacky fix for cast

* only float to uint8

* limit to float -> uint8

* touchup alu cast test

* improve tests and support more float to unsigned casts

* del one repeated test

* del 1 more repeated test

* try removing expected failure test

* hmmm try 1 more

* skip tests for flakiness

* uint64 super flaky

* clean up

* grammar

* just match numpy

* why is CI numpy different from local numpy

* increase verbosity

* try

* try2

* try3

* try4

* yeah idk

* new direction

* try again

* just don't support uint32 and uint64

* done?

* oops

* comment

* documentation

* it is what it is

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-06 10:25:03 -05:00
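
The underflow being fixed looks roughly like this (numpy sketch; the wrapped value is platform-dependent, which is why the tests ended up just matching the local numpy and dropping uint32/uint64 support):

    import numpy as np

    # Negative or out-of-range floats cast straight to uint8 wrap around;
    # the exact result is undefined behaviour in C and varies by platform.
    x = np.array([-1.0, 300.0], dtype=np.float32)
    print(x.astype(np.uint8))  # e.g. [255  44] on many platforms, not guaranteed
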
Ahmed Harmouche
f3983f6743 Move efficientnet example (#8087) 2024-12-06 15:48:16 +01:00
qazal
7dbd166227 skip test_schedule_mem_used_with_inputs [pr] (#8086) 2024-12-06 16:44:34 +02:00
qazal
0356657ced move view_supported_devices to device [pr] (#8085) 2024-12-06 16:44:15 +02:00
Ahmed Harmouche
ba35c4138b Use matching JS TypedArray for buffer dtype (#8080) 2024-12-06 14:52:23 +01:00
geohotstan
a684d72e55 add ceil_mode for avg_pool and max_pool (#7579)
* wip pool

* check CI for remove alternative implementation

* Revert "check CI for remove alternative implementation"

This reverts commit 7b1bb900e5.

* fix test

* tests tests tests

* slap a resolve on it

* fix comment

* a little simpler pool

* check CI for removal again

* Revert "check CI for removal again"

This reverts commit be798b7857.

* small

* update

* some ez tests

* english

* clean up code

* fix ruff

* how did I +25 lines?

* small clean ups

* moar clean ups

* try test_avgpool2d_failure2 in CI

* final clean up

* exclude bug fix

* avg underscore pool

* no more edge case stuff

* add better comments for explanation

* add test cases for decreasing end padding

* address feedback

* improve test coverage

* tiny more polish as we wait for lines :D

* more readable code ordering

* add to documentation

* oops

* set to False instead

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-06 08:34:14 -05:00
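
The ceil_mode behaviour added above follows the standard pooling output-size rule; a minimal sketch of one spatial dimension with a hypothetical helper, not the tinygrad implementation:

    import math

    def pool_out_dim(n, kernel, stride, pad, ceil_mode=False):
        # Rounding up instead of down can add one extra window; frameworks
        # like PyTorch additionally require that extra window to start
        # inside the input or its left padding.
        rnd = math.ceil if ceil_mode else math.floor
        return rnd((n + 2 * pad - kernel) / stride) + 1
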
chenyu
b73d9a7d24 Revert "combine get inputs and type_parse function in onnx (#8069)" (#8079)
This reverts commit 074a67a6eb.
2024-12-06 08:04:21 -05:00
chenyu
a77ee72d11 clean up reshape size check [pr] (#8067)
removed a resolve, and removed the special case for the 0-size assert since it's covered by the generic size check
2024-12-06 07:51:19 -05:00
geohotstan
074a67a6eb combine get inputs and type_parse function in onnx (#8069)
* 1 is simpler than 2

* variable name

* change error wording

* shapes for sequence type must be homogeneous
2024-12-06 07:42:35 -05:00
nimlgen
c0240855b9 qcom has no transfer (#8075)
* qcom alloc is not hcq alloc

* maybe base?

* test
2024-12-06 14:45:01 +03:00
Ahmed Harmouche
ce72fe1411 u32 to f16 in tinygrad (#8074)
* f16 decompression in tinygrad

* Typing and cleanup
2024-12-06 12:00:13 +01:00
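
The u32-to-f16 decompression above amounts to reinterpreting each 32-bit word as two packed half-precision values; a numpy sketch assuming little-endian packing, not the exported WebGPU code:

    import numpy as np

    packed = np.array([0x3C003800], dtype=np.uint32)  # low half 0x3800 = 0.5, high half 0x3C00 = 1.0
    print(packed.view(np.float16))                    # [0.5 1. ]
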
George Hotz
e37bff6c19 fix bug in jit prune with copy [pr] (#8073) 2024-12-06 18:38:23 +08:00
George Hotz
aae8557ada test copy inside jit [pr] (#8072) 2024-12-06 17:51:50 +08:00
chenyu
e7d5fe4a32 improve idiv _min_max (#8066)
for the cases where we don't know the exact bounds, we might still know the sign. with this, we can remove some resolves for symbolic shapetracker
2024-12-05 23:02:16 -05:00
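
The bound reasoning above can be sketched like this (a hypothetical helper assuming floor division and a positive constant divisor; tinygrad's idiv semantics may differ in detail):

    def idiv_bounds(x_min, x_max, d):
        # x // d is monotonic in x for a constant d > 0, so the interval
        # endpoints give the bounds. Even when x_min/x_max are unknown,
        # knowing just the sign of x still pins down the sign of x // d.
        assert d > 0
        return x_min // d, x_max // d
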
Sieds Lykles
49c6dab74b Add pattern for div mod recombine with gcd (#8061)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-05 13:16:58 -05:00
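
The identity behind the recombine pattern (a property-check sketch, not the actual rewrite rule): a*(x//c) + b*(x % c) collapses to b*x whenever a == b*c, which is what pulling the gcd out of the coefficients exposes.

    import random

    for _ in range(1000):
        x = random.randint(-100, 100)
        c, b = random.randint(1, 10), random.randint(1, 10)
        a = b * c
        # b*(c*(x//c) + x % c) == b*x by the div/mod identity
        assert a * (x // c) + b * (x % c) == b * x
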
chenyu
5c6ed5dba6 lower test_conv_3x3_256_32_32_256_256 expectation (#8060)
failed https://github.com/tinygrad/tinygrad/actions/runs/12182799887/job/33982676812#step:9:210
2024-12-05 10:30:56 -05:00
Ahmed Harmouche
ff9a89f714 Proper dtypes for input/output of exported WebGPU model (#8053)
* Respect input/output dtypes in exported WebGPU model

* Add some comments about skipped dtypes
2024-12-05 10:38:05 +01:00
qazal
435a51e10c reduce folding simple tests [pr] (#8040)
* reduce folding simple tests [pr]

* test for view and realized src pattern

* realize / buffer behavior
2024-12-05 12:22:45 +08:00
George Hotz
20878be2af lower test_gemv_4096_16384 expectations 2024-12-05 12:08:26 +08:00
George Hotz
df18e7cc37 accept filename decorator [pr] (#8049)
* accept filename decorator [pr]

* add test for safe_load

* bring old tar tests back
2024-12-05 11:40:59 +08:00
chenyu
b3220ca7b1 test cases of always True/False lt (#8048)
* test cases of always True/False lt

* one more
2024-12-04 20:38:40 -05:00
geohotstan
5ce8090d42 simple onnx_ops cleanups (#8003)
* simple clean ups first

* more work

* kinda have adam

* ooo momentum worked nicely

* almost there

* wow.. is the onnx test wrong

* nicer optim stuff

* just skip that test

* small comment changes

* use naming convention from other parts of codebase

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-04 15:33:03 -05:00
Sieds Lykles
70db1bab5c Fold nested div with const (#8010)
* Rebase nested div and with const

* Update the ordering

* return None on vectors

Fixes cpu test

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-04 14:59:09 -05:00
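
The folding above relies on the nested-division identity for positive constant divisors (checked here with Python floor division; a sketch, not the tinygrad pattern):

    import random

    for _ in range(1000):
        x = random.randint(-1000, 1000)
        a, b = random.randint(1, 20), random.randint(1, 20)
        # (x // a) // b == x // (a * b) when a, b > 0
        assert (x // a) // b == x // (a * b)
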
chenyu
0693158d28 lower v_theoretical gemv on red (#8042)
tiny7 is still slower https://github.com/tinygrad/tinygrad/actions/runs/12166149038/job/33931736130#step:8:209
2024-12-04 13:59:40 -05:00
qazal
b116e1511d make device on uop optional [pr] (#8034) 2024-12-04 20:18:00 +08:00
Ahmed Harmouche
13eedd373b Run WebGPU tests on ubuntu (#8033) 2024-12-04 12:42:04 +01:00
George Hotz
08657cb7b0 hotfix: bump expectations in speed_v_theoretical 2024-12-04 19:00:33 +08:00
George Hotz
ea65c79ba2 hotfix: don't spam BEAM debug in speed_v_theoretical 2024-12-04 18:47:16 +08:00
George Hotz
09b00b1b04 hotfix: use kernel timings instead of python timings in speed_v_theoretical 2024-12-04 18:36:17 +08:00
leopf
f0401e14e8 tar_extract with Tensors (#7853)
* initial

* USTAR, PAX and GNU support + testing

* from_bytes byteorder

* use TarInfo.frombuf

* tensor only usage

* remove contextlib.suppress

* shorter ow,pax

* more tests

* testing length + move tests

* cleanup

* new approach: RawTensorIO

* fix fetch

* enable read test

* cleanup and ignore fix

* fix for python < 3.12

* make it RawIO

* functions

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-04 17:03:19 +08:00
uuuvn
e9c5b23ba1 Use MTLCompiler directly (v2) (#7920)
* Use MTLCompiler directly (v2)

* to_block_literal and REQUEST_TYPE_COMPILE

* Rewrite command encoding

* Revert to_block_literal

* Maybe that's more readable to some people?

* Typo and comment about stdlib caching

* Update ops_metal.py

* Update ops_metal.py

* Update ops_metal.py

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-12-04 16:36:48 +08:00
chenyu
0c060fa040 update uop and tests to not use lt/gt/le/ge [pr] (#8023)
just use dunder methods, eventually remove those from ops
2024-12-03 21:02:52 -05:00
Ahmed Harmouche
db330a3110 Remove WebGL (#8012) 2024-12-03 16:02:53 +01:00
chenyu
ef3752625b add test case of realize_size with 0 in shape (#8011) 2024-12-03 09:19:50 -05:00
George Hotz
09eac42fd6 cache indexed uops in st [pr] (#8008)
* cache indexed uops in st [pr]

* remove arg from range
2024-12-03 21:27:07 +08:00
Sieds Lykles
e44183647f Improved div folding (#7996)
* First version of div_mod folding together

* Working version with old div folding behaviour

* Test is fixed

* Fix linting

* Happy mypy

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-03 08:11:25 -05:00