qazal
e6d20c47e3
simpler becomes_map update [pr] ( #9201 )
...
* simpler becomes_map update
* err, no metadata for device
* simpler tensor metadata mapping + tests [pr]
* remove kernel metadata
* don't map nones
* pruning
* linter
2025-02-22 20:50:58 +01:00
qazal
4578c3e8fd
simpler tensor metadata mapping + tests [pr] ( #9203 )
...
* simpler tensor metadata mapping + tests [pr]
* remove kernel metadata
* don't map nones
2025-02-22 20:18:46 +01:00
qazal
b711c6343a
no early return + allow childless const/bind/var in kernel graph [pr] ( #9202 )
2025-02-22 19:28:22 +01:00
George Hotz
97bc723538
torch backend works for ResNet-18 ( #9200 )
...
* torch backend progress, a few more functions
* resnet works
* pillow
* tv
2025-02-22 22:16:23 +08:00
George Hotz
f92820d30d
torch backend tests ( #9198 )
...
* torch backend tests
* pythonpath
* install ninja
2025-02-22 16:01:49 +08:00
George Hotz
4e6665bda5
different way to write torch backend ( #9197 )
...
* different way to write torch backend
* both backends
* more work
* simpler code
* more work
* test both
* imply unwrap/wrap
* FORWARD_ONLY=1 TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_add works
* ready to start making test_ops work in torch backend
* backward pass, TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_add works
* FORWARD_ONLY=1 TINY_BACKEND=1 python3 test/test_ops.py TestOps.test_simple_conv2d works
* matmul backward is broken with as_strided
2025-02-22 14:42:26 +08:00
nimlgen
041b6d5678
am: load fw in batches ( #9185 )
...
* am: load fw in batches
* am: 1mb less fw copies
* mypy
* list
2025-02-21 23:21:31 +03:00
qazal
1db4341e9f
move viz graph to lib/graph [pr] ( #9196 )
...
* move viz graph to lib/graph [pr]
* add package
* share with program
2025-02-21 21:04:07 +01:00
geohotstan
6587c7879b
simple fixes to onnx ( #9195 )
...
* uncontroversial changes
* cleaner _prepare_quantize
2025-02-21 13:10:06 -05:00
Simon R
2318d7ac51
Add missing tinygrad.runtime.autogen.am to packages ( #9194 )
2025-02-21 15:38:24 +02:00
qazal
8bb80b6e5e
reorder AST matchers + comments [pr] ( #9193 )
2025-02-21 14:31:15 +01:00
qazal
2eab8021fb
remove inputs+outputs attributes from ScheduleItem [pr] ( #9192 )
...
* remove inputs/outputs from ScheduleItem
* fix test_linearizer
* fix test_conv_shapetracker
* fix test_schedule + lint
* test_image_dtype + multitensor + search
2025-02-21 13:48:11 +01:00
George Hotz
e87be0131e
torch backend start ( #9191 )
...
* start torch backend
* progress
* ugh, you need cpp crap
* 1+1 works
* 1+1 works
* becoming a real backend
* ready to merge?
2025-02-21 16:57:28 +08:00
George Hotz
d3a21cced2
hotfix: bump version to 0.10.2
v0.10.2
2025-02-21 10:43:49 +08:00
chenyu
2e7c2780a9
CLANG -> CPU ( #9189 )
2025-02-20 18:03:09 -05:00
nimlgen
f986e12f91
metal: choose compile spec based on macos ( #9188 )
...
* metal: choose compile spec based on macos
* correction
2025-02-21 00:43:39 +03:00
chenyu
3e22747799
run unit test on windows ci ( #9187 )
...
* factor out testing_minimal in setup.py [pr]
* testing_unit + windows
2025-02-20 14:40:41 -05:00
chenyu
287de4ecc6
use torch in test_gradient ( #9186 )
...
used torch.autograd.grad, but not sure if it can be a template like jax
2025-02-20 12:26:11 -05:00
qazal
574a905291
Fix running VIZ=1 after package installation + test ( #9183 )
...
* test running viz from pip install
* add pkg
* do 10 connection attempts
* include assets in package_data
* quiet curl
* better print
2025-02-20 15:02:00 +01:00
chenyu
1692087db5
_one_hot_along_dim input needs to be int ( #9179 )
...
* _one_hot_along_dim input needs to be int
indexing and onehot compare with arange, and non-int dtype is likely a bug
2025-02-20 09:00:43 -05:00
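Note on #9179: the commit body says indexing and one-hot both compare the input against an arange, so a non-int input is likely a bug. A minimal pure-Python sketch of that idea follows. The helper name `one_hot_by_arange` and its list-based signature are hypothetical and are not tinygrad's `_one_hot_along_dim`.

```python
def one_hot_by_arange(indices, num_classes):
    """Build one-hot rows by comparing each index against range(num_classes).

    Hypothetical illustration only; this is not tinygrad's implementation.
    """
    for i in indices:
        # Reject non-int inputs up front. bool is a subclass of int in Python,
        # so it is excluded explicitly.
        if not isinstance(i, int) or isinstance(i, bool):
            raise TypeError(f"index must be int, got {type(i).__name__}")
    arange = list(range(num_classes))
    # Each row has a 1 where the index equals the arange entry, 0 elsewhere.
    return [[int(i == c) for c in arange] for i in indices]


if __name__ == "__main__":
    print(one_hot_by_arange([0, 2, 1], 3))
```

Without the type check, a fractional value such as 1.5 would equal no arange entry and silently produce an all-zero row, which is the kind of failure an int-only input rules out.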
George Hotz
bf36967883
cuda hooking ( #9180 )
...
* cuda hooking
* progress
* more hook cuda
* fix params
* compile + cuMemHostAlloc hook
* work
* revert that
2025-02-20 19:20:01 +08:00
chenyu
3b37cc898b
add bert tiny config ( #9177 )
...
set with BERT_SIZE=tiny. easier to study embedding and fusion
2025-02-19 14:57:03 -05:00
qazal
5662c898f1
correctly step through bottom_up_rewrites in viz [pr] ( #9176 )
2025-02-19 19:20:57 +01:00
peppingdore
b1ddb2a1a6
fix win32 CPUProgram missing cache flush ( #9171 )
...
* win32: fix missing inst cache flush, rename ptr->self.mem for consistency with posix code
* fix types, remove assert
* fix memory leak
* rm whitespace
2025-02-19 21:38:51 +08:00
qazal
1bb9d78c7a
hotfix: add output buffer back to kernel parents + comment [pr] ( #9174 )
2025-02-19 14:22:01 +01:00
chenyu
975c318dbc
bert use int32 for input ids ( #9173 )
...
original data was int32 for these. float might have caused precision issues
2025-02-19 08:17:27 -05:00
qazal
e4a8bf28ea
scheduler cleanups + better cycle assert [pr] ( #9172 )
...
* scheduler cleanups + better cycle assert [pr]
* type_verify after assign fixup
* don't need base
* always realize sink parents
2025-02-19 13:30:58 +01:00
qazal
cf315d544b
rename can_pad arg to cache [pr] ( #9170 )
2025-02-19 12:24:59 +01:00
qazal
2fc8bf115d
remove support for VIEW with two sources in ops [pr] ( #9168 )
...
* only 1 src views can exist [pr]
* views can still exist without a base, this is a separate project
2025-02-19 11:10:18 +01:00
Ahmed Harmouche
a2afa523a0
Only add enable f16 directive if ShaderF16 is supported ( #9163 )
...
* F16 in check in wgsl renderer
* Default in renderer to fix pickle
* Refactor f16 check
2025-02-19 17:20:03 +08:00
Ahmed Harmouche
0f94b98646
Force WebGPU backend type [pr] ( #9164 )
...
* Force webgpu backend type
* Mypy fix
* Rename to WEBGPU_BACKEND
* Add it to env_vars docs
* Remove link
2025-02-19 17:19:39 +08:00
qazal
4bc708a9b0
do not create buffers we never realize in scheduler ( #9165 )
...
* work
* delete
* fix
* works
* FUSE_CONV_BW
* FUSE_ARANGE
* becomes_map
* fix assign p1
* fix assign (diamond) - 2
* fix test_assign_double_diamond_reduce
* fix subbuffer
* faster rewrite
* fix simple_pads
* start metadata work
* do some diff cleanups
* make things that can't be images not images
* openpilot fix
* fix linter
* diff
* minimal diff
* more work on the diff
* metadata
2025-02-19 10:11:47 +01:00
George Hotz
1c4e9bc363
image fixup tensor map [pr] ( #8611 )
...
Co-authored-by: qazal <qazal.software@gmail.com>
2025-02-19 10:11:06 +02:00
qazal
2a5fe3e700
whitespace changes from the map_tensors branch [pr] ( #9167 )
2025-02-19 09:52:59 +02:00
qazal
a773ff73e3
match image cast folding on the cast itself [pr] ( #9166 )
2025-02-19 09:31:34 +02:00
qazal
9a20063837
create subbuffer immediately before constructing ScheduleItem [pr] ( #9162 )
2025-02-18 21:07:52 +01:00
qazal
1c92534bff
hotfix: viz should show if there's a rewrite [pr] ( #9161 )
2025-02-18 19:11:03 +01:00
George Hotz
a330f3338c
save applied opts in ProgramSpec [pr] ( #9150 )
2025-02-19 00:40:03 +08:00
chenyu
ff05bff221
put bert data shard inside jit ( #9160 )
...
python time 45ms -> 9ms, it was spending time to schedule the shard
also init bert data on CLANG since it's from numpy, so we don't create the tensor on default device then shard into GPUS
2025-02-18 10:36:54 -05:00
qazal
679291e26a
assert only base maps to buffer [pr] ( #9159 )
2025-02-18 15:46:47 +01:00
qazal
4f592eeea6
hotfix: remove extra matcher for copy/buffer_view [pr] ( #9157 )
2025-02-18 13:21:24 +01:00
George Hotz
ff9b985d9f
hotfix: View Base AST
2025-02-18 18:48:34 +08:00
George Hotz
30f470eaa3
UNIQUE UOp for buffer instead of arg ( #9156 )
...
* UNIQUE UOp for buffer instead of arg
* factor out buffer spec
2025-02-18 16:59:59 +08:00
qazal
38f5ea2132
increment writable buffers refcount from the kernel graph [pr] ( #9153 )
2025-02-18 10:20:02 +02:00
George Hotz
ddddcc165b
colors back in DEBUG=2 [pr] ( #9155 )
2025-02-18 16:17:57 +08:00
George Hotz
6d62966bf7
add support for named rewrites [pr] ( #9152 )
2025-02-18 16:07:04 +08:00
George Hotz
caee42e8a6
Revert "name from uops [pr] ( #9151 )" ( #9154 )
...
This reverts commit 28897be9a2.
2025-02-18 16:06:44 +08:00
George Hotz
28897be9a2
name from uops [pr] ( #9151 )
2025-02-18 15:52:03 +08:00
George Hotz
a4dab3ec3f
add name uop ( #9149 )
...
* add name uop, TODO: refactor renderer to use
* renderer uses name uop
* fix tests
* render
* ptx
2025-02-18 15:26:58 +08:00
George Hotz
2db8b4046a
minor linearizer refactor to finalize in rewrite [pr] ( #9148 )
2025-02-18 12:42:22 +08:00