345 Commits

Author SHA1 Message Date
chenyu
ae51bdd06a remove trivial use of RANGEIFY flag (#12550)
some tests need update still
2025-10-09 02:29:38 -04:00
qazal
b6835f4134 remove Ops.VIEW and related UOp methods (#12522)
* remove Ops.VIEW and related UOp methods

* update abstractions2.py

* no ShapeTrackers in abstractions2.py

* it's a size 1
2025-10-08 14:47:02 +03:00
qazal
7e0b14243e delete grouper and kernelize (#12517)
* delete grouper and kernelize

* +sys.setrecursionlimit
2025-10-08 12:27:26 +03:00
b1tg
154d114364 rangeify: fix abstractions2.py (#12386)
* rangeify: fix abstractions2.py

* tests

* lint

* only abstractions2

* base
2025-10-01 09:58:56 +03:00
nimlgen
b63bd02969 update runtime docs (#12191) 2025-09-15 17:46:20 +03:00
qazal
a388d2cb1a remove PROFILE=1 option, it's just VIZ=1 [pr] (#12176)
* remove PROFILE=1 option, it's just VIZ=1 [pr]

* sqtt

* sqtt 2

* return last

* rename
2025-09-15 12:51:50 +03:00
chenyu
0e266f376c ops_gpu -> ops_cl (#12103) 2025-09-10 15:15:48 -04:00
nimlgen
1c6c42715f unify cpu and llvm (#11982)
* try unify cpu and llvm

* fixes

* fix

* ops

* no llvm

* fix

* rm

* lvmm is ot

* oops

* override

* no llvm

* ignore

* skip llvm

* ooops
2025-09-09 13:54:44 +03:00
George Hotz
09106e4aae refactor and split test_linearizer (#12001)
* refactor and split test_linearizer

* forget that file

* imports

* remove from docs

* test gen float4
2025-09-04 10:53:07 -07:00
George Hotz
a03b930339 hotfix: green v2 in docs 2025-08-24 10:25:14 -07:00
chenyu
fb8ee02424 Tensor.logaddexp (#11793) 2025-08-23 09:15:00 -04:00
wozeparrot
1826004ef9 feat: add tinyos builder link (#11570) 2025-08-07 17:42:18 -04:00
George Hotz
82be8abfd2 move opt under codegen (#11569) 2025-08-07 14:19:17 -07:00
chenyu
83385e7abc update gradient src in ramp.py (#11499)
that's simplified now
2025-08-04 18:58:03 -04:00
George Hotz
842184a1ab rename kernelize to schedule, try 2 (#11305) 2025-07-21 11:18:36 -07:00
nimlgen
cc3c1e4c14 hcq: move cpu to hcq (#11262)
* hcq: move cpu to hcq

* import time

* upd

* fix

* windows support

* hm

* cleaner

* fix timer

* fix timing

* std is ns

* skip profiler

* mypy

* cleaner

* cleanups

* after merge

* default is back
2025-07-21 15:10:38 +03:00
nimlgen
9a88bd841c hcq: refactor into peer_groups (#11277)
* hcq: refactor into peer_groups

* fix fors

* fixes

* ooops

* mypy

* tiny fixes
2025-07-18 16:34:18 +03:00
chenyu
845a4d32bc Tensor.diag (#11108)
also updated Tensor.eye to use it
2025-07-05 23:03:02 -04:00
Ahmed Harmouche
e992ed10dc WebGPU on Windows (#10890)
* WebGPU on Windows

* Fix dawn-python install

* New test

* pydeps

* Minor fix

* Only install dawn-python on windows webgpu

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-07-02 08:38:45 -07:00
chenyu
18e264a449 Tensor.logsigmoid (#10955) 2025-06-24 11:16:14 -04:00
George Hotz
b09c47366f opt transforms the ast into an optimized ast (#10900)
* opt transforms the ast into an optimized ast

* fix get_kernel order and to_function_name

* function_name property

* update docs

* copy from kernel.py

* improve docs

* ci didn't trigger?
2025-06-22 09:41:26 -07:00
George Hotz
7636d2cdc5 flip order of get_program args (#10905) 2025-06-20 17:23:23 -07:00
George Hotz
1ce63f8d04 move functions to view and update docs [pr] (#10904)
* move functions to view and update docs [pr]

* move quantize
2025-06-20 16:47:58 -07:00
George Hotz
b41e0563a3 move stuff to kernelize folder (#10902)
* move stuff to kernelize folder

* oops, forgot that
2025-06-20 16:10:20 -07:00
George Hotz
cba6e15937 split grouper and kernelize [pr] (#10854) 2025-06-17 17:54:20 -07:00
George Hotz
5dc1bc6070 switch get_kernel -> get_program [pr] (#10817)
* switch get_kernel -> get_program [pr]

* fix tests
2025-06-15 12:26:50 -07:00
Dan German
24e7aed74b ramp.py: correct UOp and Ops import path from tinygrad.uop to tinygrad.uop.ops (#10791) 2025-06-12 10:07:03 -04:00
George Hotz
32e9949052 rename lazydata to uop (#10698) 2025-06-08 08:42:22 -07:00
George Hotz
db01c5a08a ramp.py file from stream (#10686) 2025-06-07 14:58:21 -07:00
George Hotz
5ef7c5923f docs: remove unused METAL_XCODE env var (#10421) 2025-06-06 18:39:54 -04:00
Eitan Turok
61352b8aa2 Add some more docs (#10634)
* more docs

* Add multinomial to ops

* better doc
2025-06-05 19:40:37 -04:00
qazal
5b59728c75 refactor LOAD(DEFINE_GLOBAL, VIEW) in kernels to LOAD(VIEW(DEFINE_GLOBAL)) (#10541)
* changes to core tinygrad

* fixups pt1

TC=3
docs/abstractions2.py
IMAGE=2
test_quantize_dsp
test_schedule

* more tests

* green now

* images stay images
2025-05-30 14:27:58 +03:00
Eitan Turok
c07f13c438 Docs for masked_fill (#10558)
* add docs

* fix doc examples

* add to docs

* fix typo
2025-05-29 03:49:02 -07:00
geohotstan
602a145f8f Add Tensor.unfold (#10518)
* yoinked 10272

* eitanturok's fixes

* hmmm should size be sint?

* add test
2025-05-26 11:15:44 -04:00
George Hotz
147f7747f2 remove the map from create_schedule_with_vars [pr] (#10472) 2025-05-22 15:58:25 -07:00
George Hotz
0d39bb5de1 rename to get_kernelize_map (#10465) 2025-05-22 11:44:44 -07:00
George Hotz
411392dfb7 move files into uop dir (#10399)
* move files into uop dir [pr]

* tinygrad.uop is a thing

* fix uop docs, no pr

* fix viz
2025-05-18 11:38:28 -07:00
George Hotz
6ebfb505e9 docs: fix crossentropy name (#10377) 2025-05-17 16:39:14 -07:00
Elnur Rakhmatullin
de2b323d97 Fixed a typo in "simplify" (#10358) 2025-05-16 14:45:14 -07:00
chenyu
8a906cb124 Tensor.randn_like (#10276) 2025-05-13 11:53:59 -04:00
nimlgen
b583ece8f3 amd: replace AMD_DRIVERLESS with AMD_IFACE (#10116)
* amd: replace AMD_DRIVERLESS with AMD_IFACE

* docs

* print direct err for amd_iface

* print for all
2025-04-30 20:22:02 +03:00
qazal
0bee225a58 Tensor.kernelize docs (#9946)
* Tensor.kernelize docs

* syntax

* test_kernelize_bw

* Tensor.kernelize docstring

* pruning

* tiny details

* details 2

* becomes_map terminology

* more changes to becomes
2025-04-21 16:34:03 +08:00
qazal
e20ef7196a Tensor.kernelize (#9845)
* add kernelize

* remove that

* kernelize returns self

* update abstractions2.py

* kernelize in test_schedule

* temp: assert BUFFER_VIEW's existence

* ASSIGN must have a buffer or subbuffer target

* assert and shrink

* fix

* padded setitem

* var

* toposort once

* extra

* base_buffer

* end with BUFFER_VIEW

* setitem for disk

* test_setitem_becomes_subbuffer

* mul slice test

* torch backend fix 1

* non-deterministic

* keep subbuffer
2025-04-20 20:53:49 +08:00
qazal
218e01833d update scheduler section for abstractions2.py [pr] (#9927) 2025-04-19 12:09:14 +03:00
Alexey Zaytsev
78a6af3da7 Use $CUDA_PATH/include for CUDA headers (#9858) 2025-04-13 16:20:19 +01:00
nimlgen
23b67f532c amd: minor comments and readme updates (#9865) 2025-04-12 23:24:05 +03:00
qazal
f2bd65ccfc delete Ops.EMPTY and Tensor._metaop (#9715)
* delete Ops.EMPTY and Tensor._metaop [pr]

* test_creation

* arg=

* abstractions2
2025-04-03 12:29:02 +08:00
Ignacio Sica
876a8be97a Debug env var breakdown (#9663)
* add debug level breakdown

* hotfix

* Update env_vars.md
2025-04-02 14:34:07 +08:00
chenyu
162f286a0e add a few Tensor method to doc (#9614)
* add a few Tensor method to doc

* clone
2025-03-28 13:47:16 -04:00
uuuvn
c631c72f22 HCQ: Increment timeline signal before submitting (#9550)
`AMDComputeQueue.__del__` frees `hw_page`. This is safe because
`AMDAllocator._free` calls `self.dev.synchronize()`, which is supposed
to wait for the IB to finish executing. That wait does not happen,
however, if the AMDComputeQueue is dropped right after submit but before
the timeline signal is incremented, which is the case in most places.
This leads to a race if .bind() is also used (required for multi-xcc
because a bug in the MEC firmware treats all PACKET3_PRED_EXECs outside
IBs as if they had an EXEC_COUNT of zero).
2025-03-23 18:30:38 +07:00