qazal
452b22c9b6
fix process replay diff in PYTHON device [pr] ( #11052 )
...
* fix process replay diff in PYTHON device [pr]
The PYTHON backend pickles and encodes UOps, so the encoded binary can't be
directly diffed in process replay.
* note
2025-07-02 11:06:46 +03:00
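To illustrate the problem described in the commit above, here is a minimal sketch of why an encoded pickle cannot be diffed directly and has to be decoded first. Everything in it is an assumption for illustration: the commit does not say which encoding is used (base64 is assumed here), the tuples stand in for UOps, and `encode`, `decode`, and `readable_diff` are hypothetical helpers, not tinygrad functions.

```python
import base64
import difflib
import pickle

def encode(obj) -> str:
    # Assumed encoding: pickle, then base64. The real backend may differ.
    return base64.b64encode(pickle.dumps(obj)).decode()

def decode(blob: str):
    return pickle.loads(base64.b64decode(blob))

def readable_diff(old_blob: str, new_blob: str) -> list[str]:
    # Diffing the blobs themselves only shows that two opaque strings differ,
    # so decode both sides and diff a textual form of the objects instead.
    old, new = decode(old_blob), decode(new_blob)
    return list(difflib.unified_diff([repr(x) for x in old],
                                     [repr(x) for x in new], lineterm=""))

# Hypothetical stand-ins for two versions of a UOp list.
good = encode([("LOAD", 0), ("ADD", 1), ("STORE", 2)])
bad = encode([("LOAD", 0), ("MUL", 1), ("STORE", 2)])
diff_lines = readable_diff(good, bad)
print("\n".join(diff_lines))
```

The decoded diff points at the one changed op, which the raw encoded strings would not show in any readable way.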
geohotstan
8ebf0abaae
ONNX external_test_onnx_backend use PYTHON device for model ( #10915 )
...
* try
* ruff check --fix
* no skip test
* hmmmmmmm I don't get this D:
* run CI again
* why is PYTHON device faster than CPU?
* run ci again and fix lint
* actually doesn't PYTHON device make sense here?
* see cpu speed again
* Revert "see cpu speed again"
This reverts commit 1e366f2256.
* trigger CI
* pretty good
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-07-01 12:11:17 -04:00
qazal
8b0871ac31
viz: test for no lockup on infinite loop ( #11041 )
...
* viz: add test infinite loop fallback
* assert
* continue til the end
* work
* bring that back
* fallback to nop
2025-07-01 17:44:20 +03:00
b1tg
fcbefde8f5
fix DiskDevice reuse ( #11039 )
...
* fix DiskDevice reuse
* fix mypy and DiskDevice.count
* mypy
* add test
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-07-01 10:29:21 -04:00
George Hotz
5628e2054c
hotfix: if no ranges, return None
2025-06-30 18:07:56 -07:00
George Hotz
0597735f28
remove TC=3 not porting this ( #11045 )
2025-06-30 15:12:49 -07:00
George Hotz
cccfe6b422
hotfix: test_no_inf_loop_bottom_up
2025-06-30 14:21:45 -07:00
George Hotz
752c76ceb7
tc3 shape expand [pr] ( #11043 )
...
* tc3 shape expand [pr]
* remove unused stuff in lowerer
2025-06-30 13:38:14 -07:00
George Hotz
539b17fcbf
expand local shape so shapes work [pr] ( #11042 )
2025-06-30 13:03:31 -07:00
nimlgen
9ea7deb515
hcq: select_iface shared ( #11033 )
...
* hcq: select_iface shared
* errs
* sorry
* upprt
2025-06-30 21:12:39 +03:00
qazal
013085da7d
viz: only path "/" serves the UI ( #11037 )
...
The dict used to exist for /profiler and main localhost:8000; we don't
need it anymore.
2025-06-30 19:10:33 +03:00
George Hotz
b829331219
infinite loop detect in fixed_point_rewrite [pr] ( #11038 )
2025-06-30 08:57:29 -07:00
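The idea of detecting an infinite loop in a fixed-point rewrite can be sketched as follows. This is not the tinygrad implementation: `fixed_point`, `max_iters`, and the two toy rewrite rules are hypothetical, and an iteration cap is only one possible detection strategy.

```python
from typing import Callable, Optional

def fixed_point(x, step: Callable, max_iters: int = 1000):
    # Apply `step` until it reports "no change" by returning None.
    for _ in range(max_iters):
        nxt = step(x)
        if nxt is None:
            return x
        x = nxt
    # A rule set that never settles would otherwise spin forever.
    raise RuntimeError(f"no fixed point after {max_iters} rewrites, last value {x!r}")

def halve_even(n: int) -> Optional[int]:
    # Toy rule that converges: halve nonzero even numbers, else stop.
    return n // 2 if n != 0 and n % 2 == 0 else None

def flip(n: int) -> Optional[int]:
    # Toy rule that never converges: 0 -> 1 -> 0 -> ...
    return 1 - n

result = fixed_point(48, halve_even)
try:
    fixed_point(0, flip, max_iters=50)
    looped = False
except RuntimeError:
    looped = True
```

Raising an error turns a silent hang into a failure that a test can assert on.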
Nino Risteski
bc15e98f5c
clean up unused imports in examples and update CI linting ( #11024 )
...
* clean up unused imports in examples
* enable unused import checking in examples
* lint
* ignore F541 and F841 - focus on unused imports only
* clean up
* restore tinygrad.frontend.torch for TINY_BACKEND
* tiny change
2025-06-30 08:21:27 -07:00
George Hotz
cb531dba42
detect infinite loop in graph rewrite [pr] ( #11036 )
2025-06-30 08:15:13 -07:00
qazal
710d734ce7
viz: don't need PICKLE_BUFFER=0 in capture ( #11031 )
2025-06-30 16:20:04 +03:00
qazal
2ea4737930
viz: fix newlines breaking label colors ( #11030 )
...
* viz: fix newlines breaking label colors
* TestViz.test_colored_label
* TestWordWrap
2025-06-30 13:39:44 +03:00
George Hotz
5911b71404
early support for bidirectional pattern matcher ( #11027 )
...
* early support for bidirectional pattern matcher
* expose it and add a test
* no bottom up arg there
* disable flaky test
2025-06-29 16:54:07 -07:00
George Hotz
ec1d97191d
minor cleanup to lowerer [pr] ( #11026 )
...
* minor cleanup to lowerer [pr]
* add that rule to sym
2025-06-29 11:01:29 -07:00
Piyush
454bc3393d
redundant code ( #11014 )
2025-06-29 09:06:10 -07:00
qazal
19b11cb778
hotfix: check canvas exists before access ( #11022 )
2025-06-29 14:44:14 +03:00
chenyu
126fcf4129
clean up AMD_LLVM in tests ( #11021 )
2025-06-28 22:45:47 -04:00
qazal
cb6a66ea84
viz: remove per schedule renderMemoryGraph ( #11019 )
...
replaced with per device Buffer viz https://github.com/tinygrad/tinygrad/pull/10960
2025-06-28 22:09:38 +03:00
qazal
4c8d2a0383
buffer viz ( #10960 )
...
* add mem_layout
* ui
* cleanup
* work
* debugLine work and expander
* tooltip style
* real expand device
* wheel does one thing
* diff
* shows llama oom
* add y axis
* mypy chill
* work
* unittests for the memory layout
2025-06-28 21:50:32 +03:00
qazal
e3d024afa0
viz: split into scale, shapes, axes last ( #11018 )
...
* viz: split into scale, shapes, axes last
* set zoom on render
2025-06-28 19:10:58 +03:00
qazal
508bc68078
viz: small fixups from memory graph ( #11017 )
...
* don't need div.id
* tooltip z-index
2025-06-28 16:34:14 +03:00
qazal
fc3e509822
viz: new canvas on first render ( #11016 )
2025-06-28 16:04:51 +03:00
chenyu
c14c9a8eff
llama3 grad clip ( #11003 )
2025-06-27 19:14:12 -04:00
nimlgen
e53673a0b2
amd: sdma queue overrun fix ( #11012 )
...
* amd: sdma queue overrun fix
* add ()
* fix
* bug
* this is correct
2025-06-28 01:42:03 +03:00
chenyu
f2548afeb5
bert grad clipping start with const 0 ( #11008 )
...
saved the init kernels
2025-06-27 18:02:23 -04:00
chenyu
a6485d00c8
very tiny generate_dataset ( #11013 )
...
one minute to gen on my mac
2025-06-27 17:10:45 -04:00
qazal
382fa6a325
viz: support axis colors in UOp nodes ( #11009 )
...
* work
* javascript
* optional defaultColor
* fine
2025-06-27 23:02:55 +03:00
qazal
44257f25e4
bump line count to 14600 ( #11010 )
2025-06-27 22:48:14 +03:00
George Hotz
be53ef4f0a
rename DEFINE_ACC -> DEFINE_REG ( #11006 )
...
* rename DEFINE_ACC -> DEFINE_REG
* add CMPEQ to groupops
2025-06-27 11:09:25 -07:00
George Hotz
05c35d0db8
reorder ops and add comments ( #11005 )
2025-06-27 10:52:14 -07:00
George Hotz
5a1911b7c4
apply the global dims late ( #11002 )
...
* apply the global dims late [pr]
* late gpudims
* tests passing
* remove the random local_dims inc
* simpler
2025-06-27 09:54:34 -07:00
qazal
4ef10c57f9
remove unused test helper ( #10999 )
2025-06-27 13:48:48 +03:00
qazal
a39343e39f
viz: move timeline layout to python ( #10998 )
...
* viz: move timeline layout to python
* DevEvent has a device and a name
2025-06-27 13:06:00 +03:00
George Hotz
b4eb876d5a
kernel.py no longer permutes reduce axis [pr] ( #10968 )
...
* kernel.py no longer permutes reduce axis [pr]
* delete tests that handcode uops
* regen of sops is broken...
* put import back
* just remove that
* disable those tests
2025-06-26 17:44:58 -07:00
chenyu
6ab5a5cb6c
llama3 mlperf train ( #10983 )
...
work in progress. now it can overfit small examples and vram roughly matches
2025-06-26 20:24:27 -04:00
George Hotz
856759c79c
add halide example ( #10980 )
...
* add halide example
* upd halide gemm
* partial works
* touchups
2025-06-26 16:14:57 -07:00
qazal
1127302c46
move perfetto to extra ( #10994 )
...
* move perfetto to extra
* update TestViz and fix tests
* remove perfetto.html from viz directory
* work
* mypy
2025-06-27 01:53:54 +03:00
qazal
712980e167
fix extract_dataset + add tests to CI ( #10995 )
...
* fix extract_dataset + tests
* add CI
* sops.gz itself is same as master
* yml + gzip -c + ge
* don't commit that
* bump limit to 1000
* axis=7
* test_tiny
2025-06-27 01:51:36 +03:00
chenyu
4572e65f0f
remove duplicated move_early logic in UOp.r [pr] ( #10993 )
2025-06-26 18:33:54 -04:00
Ignacio Sica
579194f523
remove some linearize calls from tests 2 [pr] ( #10992 )
...
* refactor count_float4 to take uops as input instead of kernel
* remove some calls to linearize in test_linearizer
* remove some more calls
* remove one more call
2025-06-26 18:22:27 -03:00
geohotstan
50936b4a18
ONNX real float16 ( #10694 )
...
* squash commits
* temp fix for const tensor
* actually realizing float16 can only happen in raw_data
* .float -> cast(float) to rerun CI
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-06-26 14:05:12 -04:00
qazal
73484b0803
viz: generic shape tooltip/click handlers + renames ( #10990 )
...
* viz: generic tooltip
* assign kernel
* labelParts/label
* rect with a fillColor
* line
2025-06-26 19:14:04 +03:00
qazal
7f79c1388f
viz: update y offset calculation ( #10987 )
...
* viz: update y offset calculation
* don't rescale padding
2025-06-26 12:05:20 +03:00
chenyu
49bba2f0a0
improve test_nll_loss ( #10986 )
...
build target and weight tensors outside so it tests backward too.
2025-06-26 02:46:55 -04:00
chenyu
0612acfc70
improve Tensor.cross_entropy ( #10985 )
...
separate when Y is prob vs indices and check shapes for indices. also fix higher dim cases
2025-06-26 01:39:48 -04:00
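The distinction the commit above draws, targets given as probabilities versus class indices, can be sketched in plain Python. This is an illustrative assumption rather than the code of `Tensor.cross_entropy`: the function names, the list-based inputs, and the exact checks are hypothetical.

```python
import math

def log_softmax(row):
    # Subtract the max before exponentiating for numerical stability.
    m = max(row)
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]

def cross_entropy(logits, target):
    # logits: list of rows; target: one int index or one probability row per sample.
    if len(logits) != len(target):
        raise ValueError("batch size mismatch between logits and target")
    losses = []
    for row, y in zip(logits, target):
        logp = log_softmax(row)
        if isinstance(y, int):
            # Index target: check the range, then pick a single log-probability.
            if not 0 <= y < len(row):
                raise ValueError(f"class index {y} out of range")
            losses.append(-logp[y])
        else:
            # Probability target: must have one entry per class.
            if len(y) != len(row):
                raise ValueError("probability target must match number of classes")
            losses.append(-sum(p * lp for p, lp in zip(y, logp)))
    return sum(losses) / len(losses)

logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
loss_idx = cross_entropy(logits, [0, 2])
loss_prob = cross_entropy(logits, [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
```

A one-hot probability target and the matching index target give the same loss, which makes a convenient sanity check for either code path.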
chenyu
8751d47985
CosineAnnealingLRWithWarmup ( #10981 )
2025-06-25 17:45:21 -04:00
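The commit above gives only the scheduler's name, so the following is a sketch of one common form of cosine annealing with linear warmup, not the tinygrad class. The function `cosine_lr_with_warmup` and all of its parameters are hypothetical, and the exact warmup and decay formulas are assumptions.

```python
import math

def cosine_lr_with_warmup(step: int, base_lr: float, end_lr: float,
                          warmup_steps: int, total_steps: int) -> float:
    # Assumed schedule: linear ramp from 0 to base_lr, then cosine decay to end_lr.
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return end_lr + 0.5 * (base_lr - end_lr) * (1 + math.cos(math.pi * progress))

# Sample the schedule at a few points of a hypothetical 100-step run.
schedule = [cosine_lr_with_warmup(s, base_lr=1e-3, end_lr=1e-5,
                                  warmup_steps=10, total_steps=100)
            for s in (0, 5, 10, 55, 100)]
```

The learning rate peaks at `base_lr` exactly when warmup ends and reaches `end_lr` at `total_steps`.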