10633 Commits

Author SHA1 Message Date
George Hotz
75503955bf simple schedule test [pr] (#10853) 2025-06-17 16:19:27 -07:00
chenyu
075a74cf25 add global_batch_size to mlperf bert (#10852)
global_batch_size = grad_acc_steps * batch_size. no-op change to prep grad acc for bert
2025-06-17 17:54:15 -04:00
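The relation stated in the commit message above can be sketched in a few lines. This is only an illustration: the function name and the concrete values are made up here and are not taken from the tinygrad BERT script.

```python
def global_batch_size(grad_acc_steps: int, batch_size: int) -> int:
    # Samples contributing to one optimizer step when gradients are
    # accumulated over grad_acc_steps micro-batches of size batch_size.
    return grad_acc_steps * batch_size


# Hypothetical values, chosen only for illustration.
print(global_batch_size(grad_acc_steps=4, batch_size=66))  # 264
# With a single accumulation step the two sizes coincide, which is
# consistent with the commit describing itself as a no-op change.
print(global_batch_size(grad_acc_steps=1, batch_size=66))  # 66
```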
uuuvn
a51f18f8f9 CI flakiness (#10851)
https://github.com/tinygrad/tinygrad/actions/runs/15718103629/job/44292845140?pr=10753#step:4:161
2025-06-17 14:46:30 -07:00
qazal
e77cd81662 time viz (#10763)
* work

* basic stuff

* work

* also reset

* moving through time

* cleanup

* proper zoom

* add livereload.js

pip install livereload
livereload tinygrad/viz

* minor

* fixed width, remove viewbox

* bit of flexbox magic

* show pid/tid

* merge loops

* min-height

* redo some layout stuff

* create cell groups

* text is hard

* javascript Math.min causes "Maximum call stack size"
bert repro: VIZ=1 PYTHONPATH=. DEFAULT_FLOAT=HALF BS=66 GPUS=6 BERT_LAYERS=2 FUSE_ARANGE=1 MODEL=bert python3 examples/mlperf/model_train.py

* fix recursion issue

* no viz/server changes

* fix test_viz

* everything is a g

* text is easy

* no it's still hard

* livereload+notes

* height: 100% fixes the device bug

* start canvas work

* base canvas

* take chrome's stuff

* serve chrome's thing

* fetch traces from get_profile

* remove junk

* remove some more

* bring everything back again

* dispatch resize events

* base ticks

* hook d3.zoom

* zoom on the x axis

* bring filter back, makes ctrl+drag possible

* remove junk

* Revert "remove junk"

This reverts commit 4987e7bec1.

* draws something, the zooms aren't right

* move to canvas

* fix zooming

* Revert "Revert "remove junk""

This reverts commit 5aac2034fb.

* space key resets zoom

* Divide timelines by device on y axis

* Show kernel names when the width allows

* Clicking on kernel opens it in the kernel graph

* remove livereload.js

* reset diff

* base diff:

- fetch traceEvents
- displayGraph
- flexbox layout
- rest of canvas

* rescale in-place is faster, d3's rescaleX creates a copy

* less

* aesthetics

* map names

* first viz is profiler

* this will work when i make canvas once

* initial cleanups

* factor out of loop

* refactor + only show devices with events

* properly align program rects

* cleaner tick lines

* padding

* listen for resize

* simple zoom

* space more

* i always end up making zoom global

* how is this ever allowed

* clicking works again

* back button goes back to the same zoom level

* work

* more work

* bring that back

* coloring work and simplify

* black

* keep perfetto button for comparison

* better

* ph===X

* simplify history stuff

* temp: handcoded

* test: flamegraph style leveling

* Revert "temp: handcoded"

This reverts commit bdcd538e88.

* disable flamegraph

* group by pid

* factor y and height out of render

* now flamegraph is easy

* livereload stuff

* remove that

* less
2025-06-17 19:39:34 +03:00
qazal
9e2cb7522a viz: define launch_viz when tracking is enabled (#10846) 2025-06-17 19:38:02 +03:00
Bhavya Gada
3a474ef5b7 move bitwise_and/bitwise_or/bitwise_xor to MathTrait [pr] (#10794)
* move bitwise and, or, xor to MathTrait

* refactor
2025-06-17 09:19:43 -07:00
George Hotz
531d143780 bring back old sharded rand behavior (#10842) 2025-06-16 17:23:47 -07:00
George Hotz
a493eb396c fix view add 0 (#10840) 2025-06-16 16:46:12 -07:00
George Hotz
b5ce227850 Revert "hotfix: remove setrecursionlimit"
This reverts commit acfc81642a.
2025-06-16 16:01:42 -07:00
George Hotz
acfc81642a hotfix: remove setrecursionlimit 2025-06-16 15:31:34 -07:00
George Hotz
00c46e7077 print rules count in match stats (#10839) 2025-06-16 14:56:27 -07:00
George Hotz
e2907360b7 multi is one PM [pr] (#10838)
* multi is one PM [pr]

* disable flaky tests
2025-06-16 14:52:47 -07:00
Sieds Lykles
b1fefb76dd More conditions for (x//c1+a)//c2 -> (x+a*c1)//(c1*c2) (#10834)
* add rule and test

* typo

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-06-16 16:34:52 -04:00
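The rewrite named in the commit title above, (x//c1+a)//c2 -> (x+a*c1)//(c1*c2), can be checked by brute force. The sketch below assumes Python's floor division and strictly positive constants c1 and c2; the conditions the actual rule requires are not shown in this log, and tinygrad's integer division may follow a different convention. The helper names are hypothetical.

```python
from itertools import product


def nested(x: int, a: int, c1: int, c2: int) -> int:
    # Left-hand side of the rewrite: two successive divisions.
    return (x // c1 + a) // c2


def merged(x: int, a: int, c1: int, c2: int) -> int:
    # Right-hand side of the rewrite: one division by c1*c2.
    return (x + a * c1) // (c1 * c2)


def rule_holds(xs, offsets, c1s, c2s) -> bool:
    # Exhaustively compare both sides over the given ranges.
    return all(
        nested(x, a, c1, c2) == merged(x, a, c1, c2)
        for x, a, c1, c2 in product(xs, offsets, c1s, c2s)
    )


# Under floor division with positive c1 and c2 the two sides agree.
print(rule_holds(range(-50, 51), range(-5, 6), range(1, 6), range(1, 6)))  # True
```

The check passes because x//c1 + a equals (x + a*c1)//c1 under floor division, and flooring twice by positive integers equals flooring once by their product.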
uuuvn
18d936f981 Remote multihost (#10598) 2025-06-16 13:18:56 -07:00
George Hotz
0629e45332 remove cpu graph (#10836)
* remove cpu graph, it's different from the others

* remote was blacklisting CPUGraph

* remove cpugraph from dsp
2025-06-16 11:40:58 -07:00
Sieds Lykles
deb6af0638 Remove incorrect rule for x%-d -> (x%d)*-1 (#10832)
* fix rule and add test

* combine tests
2025-06-16 11:37:44 -04:00
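A small counterexample shows why a rule of the form x%-d -> (x%d)*-1 cannot be right in general. The sketch assumes nothing about tinygrad's internals; `c_mod` is a hypothetical helper that models the truncated (C-style) remainder, and Python's own `%` is shown alongside it for comparison.

```python
def c_mod(x: int, d: int) -> int:
    # Truncated (C-style) remainder: magnitude |x| mod |d|, sign of x.
    r = abs(x) % abs(d)
    return -r if x < 0 else r


x, d = 7, 3

# C-style remainder: negating the divisor does not change the result.
print(c_mod(x, -d), c_mod(x, d) * -1)  # 1 -1

# Python's floored remainder: the rule fails here as well.
print(x % -d, (x % d) * -1)  # -2 -1
```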
Sieds Lykles
946243dbb2 Change z3 cdiv to Euclidean division (#10833)
* change z3_cdiv

* shorter
2025-06-16 11:04:51 -04:00
qazal
2c6fd5bf81 viz: rename to svgZoom (#10831) 2025-06-16 13:05:22 +03:00
qazal
e8ec3f544b viz: helper for keeping state in browser history (#10830) 2025-06-16 12:26:10 +03:00
chenyu
e5d5ae55f9 smaller inputs for test_sort and test_topk (#10829) 2025-06-16 00:21:15 -04:00
nimlgen
c0329148c7 am: check va is aligned to page size (#10815)
* am: check va is aligned to page size

* swap them

* is this faster
2025-06-15 22:51:09 +03:00
Sieds Lykles
ac27c46104 fix UPat get_location after mathtraits refactor (#10814)
* fix UPat get_location

* fold line
2025-06-15 12:47:55 -07:00
George Hotz
5dc1bc6070 switch get_kernel -> get_program [pr] (#10817)
* switch get_kernel -> get_program [pr]

* fix tests
2025-06-15 12:26:50 -07:00
George Hotz
a36b09a715 universal device import [pr] (#10818) 2025-06-15 12:01:02 -07:00
George Hotz
cc5e4e54b8 move type verify to codegen [pr] (#10816) 2025-06-15 12:00:52 -07:00
George Hotz
27cf836958 split ocelot out for autogen, fix CI (#10819)
* split ocelot out for autogen, fix CI

* mac ocelot
2025-06-15 11:37:23 -07:00
Ahmed Harmouche
c380efc220 Support aarch64 linux on webgpu (#10802) 2025-06-14 14:57:18 -04:00
Sieds Lykles
37d3ca152e Adapt >> for division by power of two to all ints (#10803)
* Change division by power of two to always use shift

* Change test to test int instead of uint

* simplify condition

* add old rule back with comment

* remove import

* use sresolve instead of simplify

* use keyword in simplify instead of sresolve

* webgpu cast y to uint

* remove comment

* explicitly set dtype in wgsl

* without simplify

* undo simplify kwarg

* change test to test both int32 and uint32
2025-06-14 14:55:51 -04:00
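The subtlety behind extending the shift rewrite from unsigned to signed integers can be illustrated in Python, where `>>` is an arithmetic shift. This is a sketch under that assumption only; how the commit itself treats negative operands is not stated in this log, and `trunc_div_pow2` is a hypothetical helper modelling C-style truncating division.

```python
def trunc_div_pow2(x: int, k: int) -> int:
    # Division by 2**k that rounds toward zero, as C-style division does.
    q = abs(x) >> k
    return -q if x < 0 else q


# An arithmetic right shift matches floor division for every integer.
print(all((x >> k) == x // (1 << k) for x in range(-64, 65) for k in range(5)))  # True

# For non-negative values, shift and truncating division agree.
print(7 >> 1, trunc_div_pow2(7, 1))  # 3 3

# For negative values that are not multiples of 2**k, they differ by one.
print(-7 >> 1, trunc_div_pow2(-7, 1))  # -4 -3
```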
chenyu
652db5702b move test_conv_shapetracker and some test_search util into unit test (#10812) 2025-06-14 13:29:32 -04:00
George Hotz
754667093f remove IGNORE stuff (#10796)
* remove IGNORE stuff, was this even tested? [pr]

* delete IGNORE op
2025-06-14 09:59:45 -07:00
leopf
118a09ddcf xor self folding (#10806)
* xor folding

* tests + z3 bitwise xor
2025-06-14 10:01:17 -04:00
qazal
8e6ac18436 viz: make sidebar list responsive to keyboard smashing (#10811)
* expanded is a static style

* only draw the list once

* identify with ids

* state isn't used here anymore

* only toggle states

* less
2025-06-14 13:52:27 +03:00
chenyu
8c28b5d833 move dtype spec tests into unit test (#10808)
* move dtype spec tests into unit test

can clean up more after the split

* skip CI test_backward_sum_acc_dtype
2025-06-13 22:21:22 -04:00
chenyu
7a6df0a161 remove .relu() call in several conv tests in test_ops (#10807)
* remove .relu() call in several conv tests in test_ops

testing negative parts doubles the effectiveness. keep the relu between two convs and the tests that explicitly test relu

* relax tol
2025-06-13 17:10:16 -04:00
nimlgen
b6e574fcdf am: smu 14.0.3 is smu 14.0.2 (#10714) 2025-06-13 23:07:56 +03:00
chenyu
7d5c769c6b fix compile4 (#10797) 2025-06-12 22:28:56 -04:00
wozeparrot
c01b20fd83 amd: more verbose out of memory error (#10798) 2025-06-12 19:06:58 -07:00
geohotstan
806b68c2b3 Add fallback dtype to ONNX (#10788)
* start

* still need the float16 workaround in

* tiny nit for correctness

* idk hacks, I need to understand this device stuff better

* no-op?

* remove that assert for true nooooooop

* add fallback_context
2025-06-12 20:39:21 -04:00
George Hotz
dcd1928f29 tensor cores for gfx1200 [pr] (#10795) 2025-06-12 16:33:29 -07:00
qazal
a113c5e3ae viz: update browser test to properly shutdown [pr] (#10793)
Using `await page.evaluate` can cause non-deterministic `TargetCloseError`
exceptions if it cannot find the elements on the page, because Puppeteer
doesn't cleanly stop when `browser.close()` is called.
[Failing CI](https://github.com/tinygrad/tinygrad/actions/runs/15596803685/job/43928961323?pr=10763#step:9:61)
2025-06-12 17:58:42 +03:00
Dan German
24e7aed74b ramp.py: correct UOp and Ops import path from tinygrad.uop to tinygrad.uop.ops (#10791) 2025-06-12 10:07:03 -04:00
qazal
c066baea65 viz: enter key only expands steps (#10792)
It shouldn't be changing any step or context state. Those are handled
explicitly by the arrow keys (or clicking).
2025-06-12 16:00:14 +03:00
qazal
822e2dcb20 viz: back button returns to the kernel graph (#10790)
* create space

* viz: back button returns to the kernel graph
2025-06-12 15:19:48 +03:00
chenyu
4242b9874e remove AMD_LLVM=0 in mlperf and search ci (#10785)
tinybox updated to llvm 20
2025-06-11 21:10:31 -04:00
wozeparrot
eb739bb96a hotfix: lower threshold (#10786) 2025-06-11 19:36:20 -04:00
wozeparrot
53edd49a33 feat: bump to llvm20 (#10784) 2025-06-11 16:04:18 -07:00
chenyu
7d8939908f AMD_LLVM=0 for resnet cron (#10780)
similar pf on llvm19 and fine on 20
2025-06-11 16:28:40 -04:00
qazal
a6af8db4d3 viz work from the profiler (#10781)
* inline ansistrip

* refactor to changeStep + explicitly set expandSteps
2025-06-11 23:20:41 +03:00
Sieds Lykles
10b61157b9 Support symbolic slice with no start [pr] (#10775)
* add symbolic slice with no start

* reshape the test

* step must be int

* just add a cast...

* more cast...
2025-06-11 16:00:38 -04:00
chenyu
d465ef4acb AMD_LLVM=0 for sdxl search (#10779)
hangs with llvm19 but seems fine with llvm20
2025-06-11 14:56:45 -04:00