Commit Graph

10633 Commits

qazal
07b6d5cf63 assign early folding (#8093)
* assign early folding [pr]

* move to to_si

* -

* fix generate_dataset

* diff too big

* no recreation, no diff

* gzip

* new sops from tiny10

* final try
2024-12-07 17:02:55 +08:00
George Hotz
00ac0db9d4 np tensors have the memory from numpy in compile3 [pr] (#8098) 2024-12-07 14:01:51 +08:00
George Hotz
22feb3a2f1 move copy into the JIT for openpilot compile3 (#7937)
* move copy into the JIT, test fails

* ahh, prune was the issue
2024-12-07 13:26:26 +08:00
leopf
0ed731b5ea torch_load with Tensors (#8037)
* torch_load with Tensors

* remove passthrough_reset + use accept_filename

* Revert "remove passthrough_reset"

* version note

* cleanup
2024-12-07 09:55:41 +08:00
chenyu
2d321646b8 default tensors to int32 in test_ops (#8097)
torch defaults to int64, but we care more about int32 anyway. removed tests that were skipped because int64 is not supported; the dtype-default difference is sketched below
2024-12-06 20:33:36 -05:00
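For context, tinygrad's default integer dtype is int32 while torch's is int64; a minimal sketch of that difference (the assertion is illustrative, not the PR's test code):

```python
from tinygrad import Tensor, dtypes

# tinygrad creates integer tensors as int32 by default, unlike torch's int64
t = Tensor([1, 2, 3])
assert t.dtype == dtypes.int32
```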
chenyu
e9692de42b don't FUZZ_ALL_ACTIONS in fuzz_linearizer.py (#8096)
mostly for speed; this just makes sure the script runs
2024-12-06 17:22:17 -05:00
chenyu
564b3a3e1b onnx Bitwise ops (#8095)
free stuff!
2024-12-06 16:58:09 -05:00
qazal
a97b8fa3c5 maskless const can lower without valid, p1 [pr] (#8094) 2024-12-06 23:21:19 +02:00
mesozoic-egg
aaf2379f97 remove ordered parents, seems like dead code [pr] (#8092)
* remove ordered parents, seems like dead code

* no need to dedup
2024-12-06 16:19:37 -05:00
nimlgen
e180a31c5e tiny metal cleanup (#8089)
* tiny metal cleanup

* cast

* sry
2024-12-06 21:44:32 +03:00
chenyu
d000c08f04 fix return type of Tensor.pow (#8091)
int to the power of int should return int, etc.; this hints that we would like an Ops.POW (illustrated below)
2024-12-06 13:38:29 -05:00
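An illustrative check of the behavior #8091 describes (the values and the float-promotion line are assumptions, not the PR's tests):

```python
from tinygrad import Tensor, dtypes

a = Tensor([2, 3, 4], dtype=dtypes.int32)
print(a.pow(2).dtype)    # int ** int should stay dtypes.int32 after this fix
print(a.pow(2.0).dtype)  # a float exponent still promotes to a float dtype
```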
qazal
1ea4dc9565 big graph init conceptual cleanup [pr] (#8090)
* keep Ops.BUFFER naming consistent [pr]

* big graph init conceptual cleanup [pr]

* make everything pass through

* pylint doesn't complain now
2024-12-06 20:07:00 +02:00
geohotstan
5184410fc3 combine get inputs and type_parse function in onnx [fixed] (#8081)
* 1 is simpler than 2

* variable name

* change error wording

* shapes for sequence type must be homogeneous

* bug fix for model benchmark

* fix comments too

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-06 12:34:47 -05:00
nimlgen
d1282da7e8 hcq bump alloc (#8078)
* hcq bump alloc

* hm

* nv

* typo
2024-12-06 19:19:04 +03:00
qazal
df84dc6444 unrelated test fixups from delete_lazy [pr] (#8088)
* unrelated test fixups from delete_lazy [pr]

* fine if it's scheduled later
2024-12-06 17:31:02 +02:00
geohotstan
0b7c44677d Fix uint8 cast underflow (#6305)
* hacky fix for cast

* only float to uint8

* limit to float -> uint8

* touchup alu cast test

* improve tests and support more float to unsigned casts

* del one repeated test

* del 1 more repeated test

* try removing expected failure test

* hmmm try 1 more

* skip tests for flakiness

* uint64 super flaky

* clean up

* grammar

* just match numpy

* why is CI numpy different from local numpy

* increase verbosity

* try

* try2

* try3

* try4

* yeah idk

* new direction

* try again

* just don't support uint32 and uint64

* done?

* oops

* comment

* documentation

* it is what it is

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-06 10:25:03 -05:00
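The numpy behavior being matched, as a sketch; exact results of out-of-range float-to-unsigned casts are platform-dependent, which is the CI-vs-local flakiness the commit trail wrestles with:

```python
import numpy as np

# out-of-range floats wrap modulo 256 when cast to uint8 on most platforms
print(np.float32(-1.0).astype(np.uint8))   # typically 255
print(np.float32(300.0).astype(np.uint8))  # typically 44 (300 % 256)
```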
Ahmed Harmouche
f3983f6743 Move efficientnet example (#8087) 2024-12-06 15:48:16 +01:00
qazal
7dbd166227 skip test_schedule_mem_used_with_inputs [pr] (#8086) 2024-12-06 16:44:34 +02:00
qazal
0356657ced move view_supported_devices to device [pr] (#8085) 2024-12-06 16:44:15 +02:00
Ahmed Harmouche
fad3eaa35e Use atomicLoad builtin when loading atomic type (#8084) 2024-12-06 15:33:11 +01:00
qazal
79966fade0 free up lines for const_arg [pr] (#8083) 2024-12-06 16:28:51 +02:00
Ahmed Harmouche
ba35c4138b Use matching JS TypedArray for buffer dtype (#8080) 2024-12-06 14:52:23 +01:00
geohotstan
a684d72e55 add ceil_mode for avg_pool and max_pool (#7579)
* wip pool

* check CI for remove alternative implementation

* Revert "check CI for remove alternative implementation"

This reverts commit 7b1bb900e5.

* fix test

* tests tests tests

* slap a resolve on it

* fix comment

* a little simpler pool

* check CI for removal again

* Revert "check CI for removal again"

This reverts commit be798b7857.

* small

* update

* some ez tests

* english

* clean up code

* fix ruff

* how did I +25 lines?

* small clean ups

* moar clean ups

* try test_avgpool2d_failure2 in CI

* final clean up

* exclude bug fix

* avg underscore pool

* no more edge case stuff

* add better comments for explanation

* add test cases for decreasing end padding

* address feedback

* improve test coverage

* tiny more polish as we wait for lines :D

* more readable code ordering

* add to documentation

* oops

* set to False instead

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-06 08:34:14 -05:00
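For reference, ceil_mode switches the pooling output-size computation from floor to ceiling so a trailing partial window is kept rather than dropped; a standalone sketch (not tinygrad's implementation):

```python
import math

def pool_out_size(size: int, kernel: int, stride: int, pad: int = 0, ceil_mode: bool = False) -> int:
    # 1D pooling output length; ceil_mode rounds up instead of down
    rnd = math.ceil if ceil_mode else math.floor
    return rnd((size + 2 * pad - kernel) / stride) + 1

print(pool_out_size(7, kernel=2, stride=2))                  # 3: trailing element dropped
print(pool_out_size(7, kernel=2, stride=2, ceil_mode=True))  # 4: partial window kept
```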
chenyu
b73d9a7d24 Revert "combine get inputs and type_parse function in onnx (#8069)" (#8079)
This reverts commit 074a67a6eb.
2024-12-06 08:04:21 -05:00
Sieds Lykles
c8313a3669 Cleaner rule for mul/idiv by power of two [pr] (#8076)
* Cleaner rule for mul/idiv by power of two

* Change comment
2024-12-06 08:02:24 -05:00
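Worked examples of the identities such a rule can capture (the concrete rewrites below are assumptions about its shape, not taken from the PR):

```python
# power-of-two mul/idiv identities, exact under floor division:
#   (x * 8) // 2  ->  x * 4    (the divisor divides the multiplier)
#   (x * 2) // 8  ->  x // 4   (the common factor of 2 cancels)
for x in range(256):
    assert (x * 8) // 2 == x * 4
    assert (x * 2) // 8 == x // 4
```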
chenyu
a77ee72d11 clean up reshape size check [pr] (#8067)
removed a resolve, and removed the special case for the 0-size assert since it's covered by the generic size check
2024-12-06 07:51:19 -05:00
geohotstan
074a67a6eb combine get inputs and type_parse function in onnx (#8069)
* 1 is simpler than 2

* variable name

* change error wording

* shapes for sequence type must be homogeneous
2024-12-06 07:42:35 -05:00
nimlgen
c0240855b9 qcom has no transfer (#8075)
* qcom alloc is not hcq alloc

* maybe base?

* test
2024-12-06 14:45:01 +03:00
Ahmed Harmouche
ce72fe1411 u32 to f16 in tinygrad (#8074)
* f16 decompression in tinygrad

* Typing and cleanup
2024-12-06 12:00:13 +01:00
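The core trick in plain Python rather than tinygrad ops: one u32 packs two little-endian f16 halves (the packed value here is made up):

```python
import struct

u32 = 0x3C004200  # high half 0x3C00 (1.0), low half 0x4200 (3.0)
lo, hi = struct.unpack("<2e", struct.pack("<I", u32))
print(lo, hi)     # 3.0 1.0
```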
George Hotz
e37bff6c19 fix bug in jit prune with copy [pr] (#8073) 2024-12-06 18:38:23 +08:00
George Hotz
aae8557ada test copy inside jit [pr] (#8072) 2024-12-06 17:51:50 +08:00
George Hotz
e2fe7f0d2f hotfix: actually fix pylint, it's a python 3.10 issue 2024-12-06 13:53:46 +08:00
George Hotz
b28d660172 update self_tokenize, fix pylint maybe 2024-12-06 13:49:41 +08:00
George Hotz
344fd4845c example: self_tokenize. someday tinygrad will be recursively self improving 2024-12-06 13:35:02 +08:00
JaSpa99
3c5d5f9414 mypy==1.13.0 (#7990)
* explicit instantiation and narrowing asserts

* explicit cast

* bump

* one line assert

* handle case for no copy_queue_t

* Revert "handle case for no copy_queue_t"

This reverts commit 38347806ca.

* more readable control flow

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-12-06 12:09:14 +08:00
leopf
65b6696f3b refactor safe_load (#8035)
* refactor safe_load

* cleanup
2024-12-06 12:08:21 +08:00
chenyu
e7d5fe4a32 improve idiv _min_max (#8066)
for the cases where we don't know the exact bounds, we might still know the sign (sketched below). with this, we can remove some resolves for the symbolic shapetracker
2024-12-05 23:02:16 -05:00
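A hedged sketch of the sign reasoning (the real _min_max is a method on UOp and handles more cases):

```python
import math

def idiv_bounds(x_min, x_max, d):
    # if the numerator is known nonnegative and the divisor is a positive
    # constant, the quotient stays nonnegative even with an unknown upper bound
    assert d > 0 and x_min >= 0
    return x_min // d, math.inf if x_max == math.inf else x_max // d

print(idiv_bounds(0, math.inf, 4))  # (0, inf): the sign is known, the max is not
```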
chenyu
13b954f22c unify expand conditions [pr] (#8065)
same condition (check if old == new or old == 1) in tensor and view. also renamed _pad_left to _align_left because it's not really a pad
2024-12-05 21:40:14 -05:00
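The shared condition is easy to demonstrate (a sketch; the failing case is commented out since it raises):

```python
from tinygrad import Tensor

t = Tensor.ones(1, 3)
print(t.expand(4, 3).shape)  # (4, 3): allowed because old == 1
# Tensor.ones(2, 3).expand(4, 3) would fail: 2 != 4 and 2 != 1
```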
chenyu
aefdff4ef5 reshape mask cleanups [pr] (#8064)
don't need canonicalize_st because we always merge 1 in `_merge_dims`
2024-12-05 20:20:43 -05:00
chenyu
05dba6e4ee minor to_indexed_uops cleanup [pr] (#8063) 2024-12-05 17:15:03 -05:00
chenyu
b2dd703592 fix typing of UOp.range [pr] (#8062)
start/end should not be float or bool
2024-12-05 14:56:34 -05:00
Sieds Lykles
49c6dab74b Add pattern for div mod recombine with gcd (#8061)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-05 13:16:58 -05:00
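The base identity behind div/mod recombination (the PR's gcd generalization is not reproduced here):

```python
# for any integer x and positive n: (x // n) * n + x % n == x
for x in range(-16, 16):
    for n in (1, 2, 3, 8):
        assert (x // n) * n + x % n == x
```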
geohotstan
707e9a9c8e add _one_hot_along_dim helper for Tensor.arange masking (#8039)
* feelsbadman

* feelsextrabadman

* make sure indices is on same device as self Tensor

* renamed to _one_hot_along_dim

* revert onnx change will do them in onnx only PRs

* address feedback

* add onnx changes here too

* make pad arg better

* revert pad arg

* maybe still keep dim

* simplify onehot onnx ops more

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-05 12:43:00 -05:00
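The arange-masking pattern such a helper encapsulates, sketched with assumed shapes (not the PR's exact code):

```python
from tinygrad import Tensor

idx = Tensor([1, 3])                                      # indices to one-hot
mask = Tensor.arange(5).unsqueeze(0) == idx.unsqueeze(1)  # broadcast compare
print(mask.shape)                                         # (2, 5): True at columns 1 and 3
```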
chenyu
3c5983473a combine parentless reduce rule [pr] (#8059) 2024-12-05 11:28:35 -05:00
chenyu
87594a8153 simpler dtypes.max for int [pr] (#8058) 2024-12-05 10:31:41 -05:00
geohotstan
66b8242375 Simple onnx.py clean ups (#8054)
* start

* simplify ops

* why did this not work before

* will split buffer parse to separate pr

* flip the error order

* only this much for now

* to_python_const clean up

* minimize diff

* move tensor_methods into onnx.py

* improve some type signatures

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-05 10:31:26 -05:00
chenyu
5c6ed5dba6 lower test_conv_3x3_256_32_32_256_256 expectation (#8060)
failed https://github.com/tinygrad/tinygrad/actions/runs/12182799887/job/33982676812#step:9:210
2024-12-05 10:30:56 -05:00
Ahmed Harmouche
c6f5bb03fa YoloV8 WebGPU fixes (#8057)
* Bump up input size to 416, show if webgpu is not supported

* Minor fix in export_model
2024-12-05 16:23:45 +01:00
nimlgen
78c01a5c2b amd general _gpu_alloc (#8056)
* amd general _gpu_alloc

* hmm

* ops
2024-12-05 15:50:23 +03:00
nimlgen
8071600897 nv one _gpu_alloc (#8055) 2024-12-05 15:22:03 +03:00