qazal
687d157906
delete cast early folding from ops [pr] (#9228)
2025-02-24 19:00:51 +01:00
George Hotz
c9493e41a6
reorder expand (#9051)
* reorder expand
* symbolic ops needs resolve here
* s/arg/st + whitespace
* viz
Co-authored-by: qazal <qazal.software@gmail.com>
2025-02-24 13:55:47 +01:00
qazal
14aa2395d0
allow VIEW(BUFFER) in Tensor UOps [pr] (#9210)
* allow VIEW(BUFFER) in Tensor UOps [pr]
* still reshapes
* update becomes_map tests
* bring copy folder to the scheduler
* lint
* only sgd left
* optimizer assign
* 13 kernels
* rename to test_reorder_expand + assert VIEW
2025-02-24 13:06:15 +01:00
nimlgen
1d06d61b16
from_blob for cuda (#9223)
* from_blob for cuda
* maybe docs?
* minor docs
* example
* waiting 9224
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-02-24 14:02:06 +03:00
George Hotz
fc32ff80d6
torch and numpy dtype interop [pr] (#9224)
* torch and numpy dtype interop [pr]
* less lines
* order
2025-02-24 18:26:49 +08:00
George Hotz
24615db5f5
hotfix: torch cuda interop example
2025-02-24 09:02:48 +00:00
George Hotz
fd731e740a
hotfix: add note on backend2.py
2025-02-24 11:23:03 +08:00
albanD
f2dd9c1562
simplify c++ code (#9221)
2025-02-24 11:04:41 +08:00
qazal
d12efc95d4
support custom name function in viz [pr] (#9219)
* support custom name function in viz [pr]
* title case
* assert name count in test_track_rewrites_name_fxn
2025-02-24 03:03:25 +02:00
chenyu
b3ae664d5d
fix gradient of pow(t, int) (#9217)
Semi-reverted some pow logic back to tensor. Added a direct gradient check because the backward in test_ops passed only by luck.
2025-02-23 17:42:09 -05:00
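The "direct gradient check" mentioned above can be sketched in plain Python (a hypothetical illustration, not the actual test; `analytic_grad` and `numeric_grad` are made-up names): the analytic derivative of `t**n` for an int exponent is compared against a central finite difference, including negative bases, where a float pow implemented via exp/log would give a wrong gradient.

```python
# Hypothetical direct gradient check for f(t) = t**n with int n.
def analytic_grad(t: float, n: int) -> float:
  # d/dt t^n = n * t^(n-1), valid for negative t when n is an int
  return n * t ** (n - 1)

def numeric_grad(t: float, n: int, eps: float = 1e-5) -> float:
  # central finite difference approximation of the derivative
  return ((t + eps) ** n - (t - eps) ** n) / (2 * eps)

for t in (-2.0, -0.5, 1.5):
  for n in (2, 3):
    assert abs(analytic_grad(t, n) - numeric_grad(t, n)) < 1e-4
```

A check like this catches the negative-base case that a backward pass relying on `exp(n * log(t))` would miss.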
qazal
12b5b83821
set TRACK_MATCH_STATS=0 for real_strides [pr] (#9216)
2025-02-23 23:26:31 +02:00
qazal
9db0ec46a7
simpler buf_uop [pr] (#9215)
* simpler buf_uop [pr]
* assert after realize it's buffer
2025-02-23 19:23:14 +01:00
qazal
898aafe6fd
move split_reduceop to scheduler + enable it for multi (#9214)
* move split_reduceop to scheduler + enable it for multi
* merge r and _reduceop
2025-02-23 17:30:04 +01:00
ShikChen
05e3202fba
remove unused memsize_to_str and minor cleanups [pr] (#9211)
* fix edge cases in memsize_to_str()
Inputs <= 1 now return "0.00 B" for 0 and "1.00 B" for 1, avoiding an
IndexError. Also, memsize_to_str(1000) now returns "1.00 KB" instead of
"1000.00 B".
Replaced the list comprehension with a next(...) generator for conciseness
and efficiency.
* simplify code using idiomatic python
- Remove the unused `memsize_to_str()` function in helpers.
- Use a tuple for checking multiple string prefixes/suffixes.
- Avoid unnecessary list construction by using iterables directly.
- Check None in @diskcache to ensure proper caching of falsy values.
* revert generators back to list comprehension
Sometimes building the list first can be faster, so keep it as is.
2025-02-23 09:58:37 -05:00
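The edge cases described in the first commit message above can be sketched as follows (a hypothetical reconstruction, not tinygrad's actual helper): inputs `<= 1` are handled explicitly so `next(...)` always has a matching unit.

```python
# Hypothetical sketch of a memsize_to_str-style helper matching the
# behavior described above: 0 gives "0.00 B" and 1 gives "1.00 B"
# (instead of raising an IndexError), and next(...) picks the largest
# unit whose scale fits, so 1000 becomes "1.00 KB" not "1000.00 B".
def memsize_to_str(b: int) -> str:
  units = [("TB", 1e12), ("GB", 1e9), ("MB", 1e6), ("KB", 1e3), ("B", 1)]
  if b <= 1: return f"{b:.2f} B"
  return next(f"{b/scale:.2f} {name}" for name, scale in units if b >= scale)

print(memsize_to_str(1000))  # 1.00 KB
```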
qazal
81a71ae0f6
hotfix: skip test_exclude_const_metadata (#9208)
2025-02-22 23:26:04 +02:00
chenyu
e0adb1fc76
really run test_ops with TINY_BACKEND in ci (#9206)
was failing with `line 1: pytest: command not found`
2025-02-22 15:51:24 -05:00
qazal
e6d20c47e3
simpler becomes_map update [pr] (#9201)
* simpler becomes_map update
* err, no metadata for device
* simpler tensor metadata mapping + tests [pr]
* remove kernel metadata
* don't map nones
* pruning
* linter
2025-02-22 20:50:58 +01:00
qazal
4578c3e8fd
simpler tensor metadata mapping + tests [pr] (#9203)
* simpler tensor metadata mapping + tests [pr]
* remove kernel metadata
* don't map nones
2025-02-22 20:18:46 +01:00
qazal
b711c6343a
no early return + allow childless const/bind/var in kernel graph [pr] (#9202)
2025-02-22 19:28:22 +01:00
George Hotz
97bc723538
torch backend works for ResNet-18 (#9200)
* torch backend progress, a few more functions
* resnet works
* pillow
* tv
2025-02-22 22:16:23 +08:00
George Hotz
f92820d30d
torch backend tests (#9198)
* torch backend tests
* pythonpath
* install ninja
2025-02-22 16:01:49 +08:00
George Hotz
4e6665bda5
different way to write torch backend (#9197)
* different way to write torch backend
* both backends
* more work
* simpler code
* more work
* test both
* imply unwrap/wrap
* FORWARD_ONLY=1 TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_add works
* ready to start making test_ops work in torch backend
* backward pass, TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_add works
* FORWARD_ONLY=1 TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_simple_conv2d works
* matmul backward is broken with as_strided
2025-02-22 14:42:26 +08:00
nimlgen
041b6d5678
am: load fw in batches (#9185)
* am: load fw in batches
* am: 1mb less fw copies
* mypy
* list
2025-02-21 23:21:31 +03:00
qazal
1db4341e9f
move viz graph to lib/graph [pr] (#9196)
* move viz graph to lib/graph [pr]
* add package
* share with program
2025-02-21 21:04:07 +01:00
geohotstan
6587c7879b
simple fixes to onnx (#9195)
* uncontroversial changes
* cleaner _prepare_quantize
2025-02-21 13:10:06 -05:00
Simon R
2318d7ac51
Add missing tinygrad.runtime.autogen.am to packages (#9194)
2025-02-21 15:38:24 +02:00
qazal
8bb80b6e5e
reorder AST matchers + comments [pr] (#9193)
2025-02-21 14:31:15 +01:00
qazal
2eab8021fb
remove inputs+outputs attributes from ScheduleItem [pr] (#9192)
* remove inputs/outputs from ScheduleItem
* fix test_linearizer
* fix test_conv_shapetracker
* fix test_schedule + lint
* test_image_dtype + multitensor + search
2025-02-21 13:48:11 +01:00
George Hotz
e87be0131e
torch backend start (#9191)
* start torch backend
* progress
* ugh, you need cpp crap
* 1+1 works
* 1+1 works
* becoming a real backend
* ready to merge?
2025-02-21 16:57:28 +08:00
George Hotz
d3a21cced2
hotfix: bump version to 0.10.2
v0.10.2
2025-02-21 10:43:49 +08:00
chenyu
2e7c2780a9
CLANG -> CPU (#9189)
2025-02-20 18:03:09 -05:00
nimlgen
f986e12f91
metal: choose compile spec based on macos (#9188)
* metal: choose compile spec based on macos
* correction
2025-02-21 00:43:39 +03:00
chenyu
3e22747799
run unit test on windows ci (#9187)
* factor out testing_minimal in setup.py [pr]
* testing_unit + windows
2025-02-20 14:40:41 -05:00
chenyu
287de4ecc6
use torch in test_gradient (#9186)
used torch.autograd.grad, but not sure if it can be a template like jax
2025-02-20 12:26:11 -05:00
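The `torch.autograd.grad` call mentioned there looks like this in isolation (illustrative values, not the actual test_gradient cases):

```python
import torch

# torch.autograd.grad returns the gradients of outputs w.r.t. inputs
# directly, without populating .grad attributes, which makes it a
# convenient reference for checking another framework's gradients.
x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()                 # y = x0^2 + x1^2
(gx,) = torch.autograd.grad(y, x)  # dy/dx = 2x
print(gx.tolist())  # [4.0, 6.0]
```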
qazal
574a905291
Fix running VIZ=1 after package installation + test (#9183)
* test running viz from pip install
* add pkg
* do 10 connection attempts
* include assets in package_data
* quiet curl
* better print
2025-02-20 15:02:00 +01:00
chenyu
1692087db5
_one_hot_along_dim input needs to be int (#9179)
* _one_hot_along_dim input needs to be int
indexing and onehot compare with arange, and non-int dtype is likely a bug
2025-02-20 09:00:43 -05:00
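The arange comparison the commit message refers to can be sketched in plain Python (a hypothetical stand-in for `_one_hot_along_dim`, not the real implementation):

```python
# One-hot via comparison against a range: position i is 1 exactly when
# i == index. A non-int index (e.g. 2.0000001 after a lossy float cast)
# would compare unequal everywhere and yield an all-zero row, which is
# why non-int input is likely a bug.
def one_hot(index: int, num_classes: int) -> list[int]:
  assert isinstance(index, int), "non-int index is likely a bug"
  return [int(i == index) for i in range(num_classes)]

print(one_hot(2, 5))  # [0, 0, 1, 0, 0]
```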
George Hotz
bf36967883
cuda hooking (#9180)
* cuda hooking
* progress
* more hook cuda
* fix params
* compile + cuMemHostAlloc hook
* work
* revert that
2025-02-20 19:20:01 +08:00
chenyu
3b37cc898b
add bert tiny config (#9177)
set with BERT_SIZE=tiny. easier to study embedding and fusion
2025-02-19 14:57:03 -05:00
qazal
5662c898f1
correctly step through bottom_up_rewrites in viz [pr] (#9176)
2025-02-19 19:20:57 +01:00
peppingdore
b1ddb2a1a6
fix win32 CPUProgram missing cache flush (#9171)
* win32: fix missing inst cache flush, rename ptr->self.mem for consistency with posix code
* fix types, remove assert
* fix memory leak
* rm whitespace
2025-02-19 21:38:51 +08:00
qazal
1bb9d78c7a
hotfix: add output buffer back to kernel parents + comment [pr] (#9174)
2025-02-19 14:22:01 +01:00
chenyu
975c318dbc
bert use int32 for input ids (#9173)
original data was int32 for these. float might have caused precision issues
2025-02-19 08:17:27 -05:00
qazal
e4a8bf28ea
scheduler cleanups + better cycle assert [pr] (#9172)
* scheduler cleanups + better cycle assert [pr]
* type_verify after assign fixup
* don't need base
* always realize sink parents
2025-02-19 13:30:58 +01:00
qazal
cf315d544b
rename can_pad arg to cache [pr] (#9170)
2025-02-19 12:24:59 +01:00
qazal
2fc8bf115d
remove support for VIEW with two sources in ops [pr] (#9168)
* only 1 src views can exist [pr]
* views can still exist without a base, this is a separate project
2025-02-19 11:10:18 +01:00
Ahmed Harmouche
a2afa523a0
Only add enable f16 directive if ShaderF16 is supported (#9163)
* F16 in check in wgsl renderer
* Default in renderer to fix pickle
* Refactor f16 check
2025-02-19 17:20:03 +08:00
Ahmed Harmouche
0f94b98646
Force WebGPU backend type [pr] (#9164)
* Force webgpu backend type
* Mypy fix
* Rename to WEBGPU_BACKEND
* Add it to env_vars docs
* Remove link
2025-02-19 17:19:39 +08:00
qazal
4bc708a9b0
do not create buffers we never realize in scheduler (#9165)
* work
* delete
* fix
* works
* FUSE_CONV_BW
* FUSE_ARANGE
* becomes_map
* fix assign p1
* fix assign (diamond) - 2
* fix test_assign_double_diamond_reduce
* fix subbuffer
* faster rewrite
* fix simple_pads
* start metadata work
* do some diff cleanups
* make things that can't be images not images
* openpilot fix
* fix linter
* diff
* minimal diff
* more work on the diff
* metadata
2025-02-19 10:11:47 +01:00
George Hotz
1c4e9bc363
image fixup tensor map [pr] (#8611)
Co-authored-by: qazal <qazal.software@gmail.com>
2025-02-19 10:11:06 +02:00
qazal
2a5fe3e700
whitespace changes from the map_tensors branch [pr] (#9167)
2025-02-19 09:52:59 +02:00