Commit Graph

10490 Commits

Author SHA1 Message Date
qazal
452b22c9b6 fix process replay diff in PYTHON device [pr] (#11052)
* fix process replay diff in PYTHON device [pr]

The PYTHON backend pickles and encodes UOps, so the encoded binary can't be
directly diffed in process replay.

* note
2025-07-02 11:06:46 +03:00
geohotstan
8ebf0abaae ONNX external_test_onnx_backend use PYTHON device for model (#10915)
* try

* ruff check --fix

* no skip test

* hmmmmmmm I don't get this D:

* run CI again

* why is PYTHON device faster than CPU?

* run ci again and fix lint

* actually doesn't PYTHON device make sense here?

* see cpu speed again

* Revert "see cpu speed again"

This reverts commit 1e366f2256.

* trigger CI

* pretty good

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-07-01 12:11:17 -04:00
qazal
8b0871ac31 viz: test for no lockup on infinite loop (#11041)
* viz: add test infinite loop fallback

* assert

* continue til the end

* work

* bring that back

* fallback to nop
2025-07-01 17:44:20 +03:00
b1tg
fcbefde8f5 fix DiskDevice reuse (#11039)
* fix DiskDevice reuse

* fix mypy and DiskDevice.count

* mypy

* add test

---------

Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-07-01 10:29:21 -04:00
George Hotz
5628e2054c hotfix: if no ranges, return None 2025-06-30 18:07:56 -07:00
George Hotz
0597735f28 remove TC=3 not porting this (#11045) 2025-06-30 15:12:49 -07:00
George Hotz
cccfe6b422 hotfix: test_no_inf_loop_bottom_up 2025-06-30 14:21:45 -07:00
George Hotz
752c76ceb7 tc3 shape expand [pr] (#11043)
* tc3 shape expand [pr]

* remove unused stuff in lowerer
2025-06-30 13:38:14 -07:00
George Hotz
539b17fcbf expand local shape so shapes work [pr] (#11042) 2025-06-30 13:03:31 -07:00
nimlgen
9ea7deb515 hcq: select_iface shared (#11033)
* hcq: select_iface shared

* errs

* sorry

* upprt
2025-06-30 21:12:39 +03:00
qazal
013085da7d viz: only path "/" serves the UI (#11037)
The dict used to exist for /profiler and the main localhost:8000; we don't
need it anymore.
2025-06-30 19:10:33 +03:00
George Hotz
b829331219 infinite loop detect in fixed_point_rewrite [pr] (#11038) 2025-06-30 08:57:29 -07:00
Nino Risteski
bc15e98f5c clean up unused imports in examples and update CI linting (#11024)
* clean up unused imports in examples

* enable unused import checking in examples

* lint

* ignore F541 and F841 - focus on unused imports only

* clean up

* restore tinygrad.frontend.torch for TINY_BACKEND

* tiny change
2025-06-30 08:21:27 -07:00
George Hotz
cb531dba42 detect infinite loop in graph rewrite [pr] (#11036) 2025-06-30 08:15:13 -07:00
qazal
710d734ce7 viz: don't need PICKLE_BUFFER=0 in capture (#11031) 2025-06-30 16:20:04 +03:00
qazal
2ea4737930 viz: fix newlines breaking label colors (#11030)
* viz: fix newlines breaking label colors

* TestViz.test_colored_label

* TestWordWrap
2025-06-30 13:39:44 +03:00
George Hotz
5911b71404 early support for bidirectional pattern matcher (#11027)
* early support for bidirectional pattern matcher

* expose it and add a test

* no bottom up arg there

* disable flaky test
2025-06-29 16:54:07 -07:00
George Hotz
ec1d97191d minor cleanup to lowerer [pr] (#11026)
* minor cleanup to lowerer [pr]

* add that rule to sym
2025-06-29 11:01:29 -07:00
Piyush
454bc3393d redundant code (#11014) 2025-06-29 09:06:10 -07:00
qazal
19b11cb778 hotfix: check canvas exists before access (#11022) 2025-06-29 14:44:14 +03:00
chenyu
126fcf4129 clean up AMD_LLVM in tests (#11021) 2025-06-28 22:45:47 -04:00
qazal
cb6a66ea84 viz: remove per schedule renderMemoryGraph (#11019)
replaced with per device Buffer viz https://github.com/tinygrad/tinygrad/pull/10960
2025-06-28 22:09:38 +03:00
qazal
4c8d2a0383 buffer viz (#10960)
* add mem_layout

* ui

* cleanup

* work

* debugLine work and expander

* tooltip style

* real expand device

* wheel does one thing

* diff

* shows llama oom

* add y axis

* mypy chill

* work

* unittests for the memory layout
2025-06-28 21:50:32 +03:00
qazal
e3d024afa0 viz: split into scale, shapes, axes last (#11018)
* viz: split into scale, shapes, axes last

* set zoom on render
2025-06-28 19:10:58 +03:00
qazal
508bc68078 viz: small fixups from memory graph (#11017)
* don't need div.id

* tooltip z-index
2025-06-28 16:34:14 +03:00
qazal
fc3e509822 viz: new canvas on first render (#11016) 2025-06-28 16:04:51 +03:00
chenyu
c14c9a8eff llama3 grad clip (#11003) 2025-06-27 19:14:12 -04:00
nimlgen
e53673a0b2 amd: sdma queue overrun fix (#11012)
* amd: sdma queue overrun fix

* add ()

* fix

* bug

* this is correct
2025-06-28 01:42:03 +03:00
chenyu
f2548afeb5 bert grad clipping start with const 0 (#11008)
saved the init kernels
2025-06-27 18:02:23 -04:00
chenyu
a6485d00c8 very tiny generate_dataset (#11013)
one minute to gen on my mac
2025-06-27 17:10:45 -04:00
qazal
382fa6a325 viz: support axis colors in UOp nodes (#11009)
* work

* javascript

* optional defaultColor

* fine
2025-06-27 23:02:55 +03:00
qazal
44257f25e4 bump line count to 14600 (#11010) 2025-06-27 22:48:14 +03:00
George Hotz
be53ef4f0a rename DEFINE_ACC -> DEFINE_REG (#11006)
* rename DEFINE_ACC -> DEFINE_REG

* add CMPEQ to groupops
2025-06-27 11:09:25 -07:00
George Hotz
05c35d0db8 reorder ops and add comments (#11005) 2025-06-27 10:52:14 -07:00
George Hotz
5a1911b7c4 apply the global dims late (#11002)
* apply the global dims late [pr]

* late gpudims

* tests passing

* remove the random local_dims inc

* simpler
2025-06-27 09:54:34 -07:00
qazal
4ef10c57f9 remove unused test helper (#10999) 2025-06-27 13:48:48 +03:00
qazal
a39343e39f viz: move timeline layout to python (#10998)
* viz: move timeline layout to python

* DevEvent has a device and a name
2025-06-27 13:06:00 +03:00
George Hotz
b4eb876d5a kernel.py no longer permutes reduce axis [pr] (#10968)
* kernel.py no longer permutes reduce axis [pr]

* delete tests that handcode uops

* regen of sops is broken...

* put import back

* just remove that

* disable those tests
2025-06-26 17:44:58 -07:00
chenyu
6ab5a5cb6c llama3 mlperf train (#10983)
work in progress. now it can overfit small examples and vram roughly matches
2025-06-26 20:24:27 -04:00
George Hotz
856759c79c add halide example (#10980)
* add halide example

* upd halide gemm

* partial works

* touchups
2025-06-26 16:14:57 -07:00
qazal
1127302c46 move perfetto to extra (#10994)
* move perfetto to extra

* update TestViz and fix tests

* remove perfetto.html from viz directory

* work

* mypy
2025-06-27 01:53:54 +03:00
qazal
712980e167 fix extract_dataset + add tests to CI (#10995)
* fix extract_dataset + tests

* add CI

* sops.gz itself is same as master

* yml + gzip -c + ge

* don't commit that

* bump limit to 1000

* axis=7

* test_tiny
2025-06-27 01:51:36 +03:00
chenyu
4572e65f0f remove duplicated move_early logic in UOp.r [pr] (#10993) 2025-06-26 18:33:54 -04:00
Ignacio Sica
579194f523 remove some linearize calls from tests 2 [pr] (#10992)
* refactor count_float4 to take uops as input instead of kernel

* remove some calls to linearize in test_linearizer

* remove some more calls

* remove one more call
2025-06-26 18:22:27 -03:00
geohotstan
50936b4a18 ONNX real float16 (#10694)
* squash commits

* temp fix for const tensor

* actually realizing float16 can only happen in raw_data

* .float -> cast(float) to rerun CI

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-06-26 14:05:12 -04:00
qazal
73484b0803 viz: generic shape tooltip/click handlers + renames (#10990)
* viz: generic tooltip

* assign kernel

* labelParts/label

* rect with a fillColor

* line
2025-06-26 19:14:04 +03:00
qazal
7f79c1388f viz: update y offset calculation (#10987)
* viz: update y offset calculation

* don't rescale padding
2025-06-26 12:05:20 +03:00
chenyu
49bba2f0a0 improve test_nll_loss (#10986)
build target and weight tensors outside so it tests backward too.
2025-06-26 02:46:55 -04:00
chenyu
0612acfc70 improve Tensor.cross_entropy (#10985)
separate when Y is prob vs indices and check shapes for indices. also fix higher dim cases
2025-06-26 01:39:48 -04:00
chenyu
8751d47985 CosineAnnealingLRWithWarmup (#10981) 2025-06-25 17:45:21 -04:00