Commit Graph

9852 Commits

Author SHA1 Message Date
George Hotz
1d307f568c move device tests to test/device + test cleanups (#11735)
* move device tests to test/device

* test speedups

* test device

* linalg to unit

* upd

* so pytest just works

* more divide and skip

* speed

* test devectorize

* add pillow
2025-08-19 16:02:20 -07:00
wozeparrot
bcc7623025 feat: bump version to 0.11.0 (#11736) v0.11.0 2025-08-19 17:08:56 -04:00
qazal
8c987b3293 DISABLE_FAST_IDIV is a context var [pr] (#11733) 2025-08-19 23:30:50 +03:00
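For context, `fast_idiv` refers to the standard strength reduction that replaces integer division by a constant with a multiply and a shift. A minimal standalone sketch of the trick (the name, shift choice, and signature here are illustrative, not tinygrad's implementation):

```python
def fast_idiv(x: int, d: int, bits: int = 32) -> int:
  # unsigned divide-by-constant via multiply + shift ("magic number" trick):
  # choose shift so 2^shift / d has enough precision for any x < 2^bits,
  # then floor(x / d) == (x * magic) >> shift
  shift = bits + (d - 1).bit_length()   # bits + ceil(log2(d))
  magic = (1 << shift) // d + 1         # ceil(2^shift / d)
  return (x * magic) >> shift

assert all(fast_idiv(x, d) == x // d for x in range(10000) for d in range(2, 50))
```

On hardware where integer division is slow, this turns a division by a compile-time constant into one widening multiply and one shift.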
George Hotz
bf467c623d changes from rangeify + better NullRenderer (#11732)
* changes from rangeify + better NullRenderer

* fix test
2025-08-19 12:51:54 -07:00
chenyu
02353588cb small getitem cleanup (#11730) 2025-08-19 12:25:58 -04:00
chenyu
712a5c651a minor Tensor.triu cleanup (#11728)
less confusing dtype
2025-08-19 08:07:38 -04:00
nimlgen
9c9e337c78 amd: parse soc enums (#11727)
* amd: parse soc enums

* remove from mock

* fix

* minimal amd_gpu
2025-08-19 15:06:09 +03:00
qazal
57ad69160a viz: inline memory shape spec (#11725) 2025-08-19 08:03:29 +03:00
chenyu
c5b52e9321 onnx RotaryEmbedding cleanup (#11724) 2025-08-18 23:34:42 -04:00
George Hotz
31619774a9 Revert "Revert "fix the misused cast in amd llvm tc (#11711)" (#11715)" (#11723)
This reverts commit ca28db5a97.
2025-08-18 19:44:35 -07:00
George Hotz
2ea54d7337 improve syntax of UPats using f [pr] (#11717)
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-08-18 20:49:45 -04:00
chenyu
b67345caa3 use truncate in onnx read_int64 [pr] (#11720) 2025-08-18 20:49:35 -04:00
qazal
50e789e290 hotfix: add device to decompositions ctx (#11721)
fast_idiv requires it to check whether a dtype is supported. Without
this, codegen creates non-reproducible output unless os.environ is
complete, since `is_dtype_supported` will open devices based on the
env var unless the device is specified by the caller.
2025-08-19 03:31:16 +03:00
George Hotz
4b3fcb4064 Revert "REDUCE_AXIS keepdim=False (#11311)" (#11718)
This reverts commit b518a7378a.
2025-08-18 13:28:53 -07:00
George Hotz
67d0ba5bd8 new ops from rangeify (#11716) 2025-08-18 13:13:11 -07:00
George Hotz
4afa0b86bb hotfix: ls -lh on wheel size 2025-08-18 11:52:59 -07:00
George Hotz
ca28db5a97 Revert "fix the misused cast in amd llvm tc (#11711)" (#11715)
This reverts commit 799a637b03.
2025-08-18 11:51:28 -07:00
chenyu
c10e4c4e20 print wheel build size (#11714) 2025-08-18 14:29:47 -04:00
b1tg
b518a7378a REDUCE_AXIS keepdim=False (#11311)
* progress

* fix tests

* fix tests

* remove hack for test_symfold

* fix test_conv.py  on llvm

* hack test_cache_speed

* lint

* remove hack for helper_linearizer_opt

* tests

* fix DSP

* clean up

* remove hack for kernelize.py

* hack for test/test_multitensor.py TestMultiTensor.test_matmul_shard_none

* clean

* uop.r need reshape?

* lower_store cause fail

* fix lower?

* avoid contiguous hack

* 2134

* conv2d count

* remove unused

* hack lower

* reduced and clean up

* fix TestMultiTensor.test_matmul_shard_none

* src sync + fix TestMultiTensor.test_matmul_shard_none

* remove excluded in mop

---------

Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
2025-08-18 10:09:17 -07:00
b1tg
61884f2057 add cstyle renderer to the NULL device (#11709)
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-18 09:52:22 -07:00
uuuvn
18db8fa311 Allow choosing leaders in multinode reduce (#11506)
Co-authored-by: wozeparrot <wozeparrot@gmail.com>
2025-08-18 12:43:20 -04:00
b1tg
799a637b03 fix the misused cast in amd llvm tc (#11711)
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-18 09:15:34 -07:00
qazal
fef97547f9 viz: preset the final timestamp (#11712) 2025-08-18 17:51:21 +03:00
chenyu
c30a113b2a support bf16 and fp8 in Tensor.tolist (#11704)
memoryview does not support these dtypes, but casting works fine, so we cast first
2025-08-17 15:11:13 -04:00
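The limitation referenced above is that Python's `memoryview`/`struct` format codes have no bf16 entry, so the raw bits must be widened to a supported float32 before reading. A standalone sketch of that widening (not tinygrad code; bf16 is simply the top 16 bits of a float32):

```python
import struct

def bf16_bytes_to_floats(buf: bytes) -> list[float]:
  # bf16 keeps the top 16 bits of a float32; pad with 16 zero bits to widen,
  # then reinterpret the 32-bit pattern as a float
  out = []
  for i in range(0, len(buf), 2):
    (hi,) = struct.unpack('<H', buf[i:i+2])
    out.append(struct.unpack('<f', struct.pack('<I', hi << 16))[0])
  return out

# 1.5 as float32 is 0x3FC00000, so its bf16 bit pattern is 0x3FC0
assert bf16_bytes_to_floats(struct.pack('<H', 0x3FC0)) == [1.5]
```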
nimlgen
1c62a3833b am: add versioned_header to load_fw (#11702)
* am: add versioned_header to load_fw

* fix mypy
2025-08-17 20:11:57 +03:00
qazal
eb3c918c5b viz: s/area/height (#11703) 2025-08-17 19:20:01 +03:00
qazal
d762edd694 viz: define tracks in python (#11701)
* viz: defines tracks in python

* update unittests

* figuring it out

* works

* diff cleanup

* math

* y axis is back
2025-08-17 18:19:13 +03:00
qazal
eeeea29171 viz: device list refactor (#11700)
* viz: device list refactor

* paddingTop/padding-top
2025-08-17 15:08:54 +03:00
George Hotz
9366a23eb0 test backward in test_tiny (#11697)
* test backward in test_tiny

* empty
2025-08-16 20:29:39 -07:00
chenyu
4666df71c1 fix test_fuse_and_tc_opt (#11699) 2025-08-16 21:10:53 -04:00
geohotstan
3d7c35d615 add fuse and tc opt bug repro (#11695)
* FINALLY HAVE A SMALL REPRO OH BOY

* show failure in CI

* cleaner?

* 1 possible fix

* Revert "1 possible fix"

This reverts commit 9e0fd215dd.
2025-08-16 18:24:49 -04:00
nimlgen
d1224a7c4a am: check both signatures (#11694)
* am: check both signatures

* fix
2025-08-16 20:01:07 +03:00
qazal
58c8991fa4 add Ops.REWRITE_ERROR (#11689) 2025-08-16 00:56:53 +03:00
qazal
ec4fccb1da viz: pass through RewriteNotReady (#11690) 2025-08-16 00:33:59 +03:00
qazal
e954decb44 viz: pass UOp.st errors (#11688) 2025-08-16 00:07:56 +03:00
nimlgen
bf0c45fd16 system: resource_resize might be unavail (#11680) 2025-08-15 22:03:23 +03:00
George Hotz
4ab9fb2edd explicit fixed point rewrite (#11685)
* explicit fixed point rewrite

* local cache

* fix that
2025-08-15 11:08:41 -07:00
chenyu
5d6963c968 RuntimeError for unsupported dtype in PYTHON (#11686) 2025-08-15 13:59:27 -04:00
nimlgen
b970cd6895 am: fix psp ring completion (#11679)
* am: psp ring timeout + fix 0 fence_value

* no sleep
2025-08-15 20:15:49 +03:00
qazal
c8ba48b223 show rewrite errors in viz (#11684) 2025-08-15 19:09:47 +03:00
George Hotz
560984fd8d small changes from rangeify (#11682)
* small changes from rangeify

* const like thing

* ksym
2025-08-15 08:45:52 -07:00
chenyu
d0d39885c3 onnx in tinygrad (#11675) 2025-08-14 19:57:21 -04:00
wozeparrot
71260a5ea4 feat: only bench openpilot 0.9.9 models (#11664) 2025-08-14 19:27:18 -04:00
chenyu
4ddefbccb4 update setup packages (#11674)
sorted, and added missing 'tinygrad.frontend' and 'tinygrad.runtime.autogen.nv'
2025-08-14 19:24:57 -04:00
chenyu
48c4033ae1 fix pylint for onnx (#11673)
* fix pylint for onnx

* too long
2025-08-14 18:48:02 -04:00
chenyu
e9d0027591 llama MP realize weight after shard (#11672)
* llama MP realize weight after shard

prevents memory spike on device 0

* empty weight for FAKEDATA
2025-08-14 16:17:46 -04:00
nimlgen
4176b24264 amd: support xcc in regs (#11670)
* amd: support xcc in regs

* mockamd

* typo
2025-08-14 21:20:11 +03:00
Sieds Lykles
f399d0d75d Render mod in terms of idiv (#11668)
* Render mod in terms of idiv

* cvar -> var
2025-08-14 19:59:39 +02:00
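Rendering mod in terms of idiv rests on the identity `a % b == a - (a // b) * b` (with floor division). A minimal illustration of the rewrite, not the renderer's actual code:

```python
def mod_via_idiv(a: int, b: int) -> int:
  # x % y rewritten as x - (x // y) * y, so a backend only needs to emit idiv
  return a - (a // b) * b

# Python's // floors toward -inf, so the identity holds for negatives too
assert all(mod_via_idiv(a, b) == a % b for a in range(-50, 50) for b in (1, 3, 7))
```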
nimlgen
d747eeed32 amd logs parser based on device (#11669) 2025-08-14 19:49:33 +03:00
geohotstan
1e904155e3 Add Onnx Huggingface to test/models/test_onnx.py (#11468)
* BOOM

* cache extra/huggingface/models/

* why max buffer size is not 0

* override MAX_BUFFER_SIZE

* less models

* remove more models and change cache dir to already cached dir

* only metal

* less is more?

* remove check ops

* why is this not setting the ENVVAR

* ughhhhh just test in models

* only cpu and gpu

* only cpu actually

* just override it idk

* final

* move extra dependencies up top

* simplification

* fix print

* make README better

* revert ops_disk fix for now

* clean up test_onnx

* remove testing fashion clip model cuz sloooowwwwww

* actually let METAL run this

* fix comment mistake

* fix download path in run_models

* does this work?

* cleanup setup and teardown

* contextvar like this?

* prove model is cached

* do I need to increment DOWNLOAD_CACHE_VERSION?

* see if cached with incremented DOWNLOAD_CACHE_VERSION

* use warnings to see if the model exists

* revert DOWNLOAD_CACHE_VERSION stuff and clean up

* add retry to download

* nit
2025-08-14 11:16:41 -04:00