Commit Graph

9222 Commits

Author SHA1 Message Date
George Hotz
7636d2cdc5 flip order of get_program args (#10905) 2025-06-20 17:23:23 -07:00
George Hotz
1ce63f8d04 move functions to view and update docs [pr] (#10904)
* move functions to view and update docs [pr]

* move quantize
2025-06-20 16:47:58 -07:00
George Hotz
b41e0563a3 move stuff to kernelize folder (#10902)
* move stuff to kernelize folder

* oops, forgot that
2025-06-20 16:10:20 -07:00
George Hotz
d399a4587d move mem estimate to ProgramSpec [pr] (#10901) 2025-06-20 15:54:28 -07:00
George Hotz
92678e59ee move kernel to opt (#10899) 2025-06-20 15:22:28 -07:00
nimlgen
bb0299b9e5 system: shared pci logic (#10894)
* moveout pci logic

* fixes

* oops

* types

* more type

* one style

* this is imp
2025-06-21 00:09:49 +03:00
nimlgen
c83fdc50d1 nv: driver iface (#10895)
* nv: driver iface

* fixes

* ops

* not used anymore

* fix mypy

* too long

* fix

* fixed

* mypy

* ugh, it's misc

* rename to NVK
2025-06-20 22:36:08 +03:00
George Hotz
fc9f883870 if upat returns self, it's none (#10898)
* if upat returns self, it's none

* fix pm tests
2025-06-20 12:11:19 -07:00
qazal
4f179b9ddb viz: gate launch behind a ContextVar [pr] (#10892) 2025-06-20 17:30:32 +03:00
chenyu
3f29c7edda minor onnx dropout cleanup (#10891)
we should consider removing numpy random and test it similar to test_randomness, unless how seed works is part of spec?
2025-06-20 10:18:34 -04:00
simone-pietro
e94ac6e20c Cast ptr to int in test_from_mv_to_mv (#10876)
* Cast ptr to int in test_from_mv_to_mv

* Add type hints for from_mv
2025-06-20 14:52:34 +03:00
qazal
000eb30f04 viz: remove prev profiler file (#10888)
The new profiler is integrated in the main VIZ tab.

Will also delete perfetto.html after matching [final features](https://github.com/tinygrad/tinygrad/pull/10763#issuecomment-2980543715) soon.
2025-06-19 23:05:46 +03:00
chenyu
62a540066e remove DEBUG=2 in mi300x bert setup (#10886)
seems fine now, not sure what the issue was
2025-06-19 13:28:53 -04:00
Nino Risteski
5a56710ff4 small fix replacing download_file with fetch (#10877)
* imported a missing os and replaced download_file with fetch from tg helpers

* use fetch directly

* Remove if not os.path.isfile
2025-06-19 12:12:09 -04:00
chenyu
8d721a4ead add 405B params to llama3.py (#10884)
tested with `python examples/llama3.py --model /raid/weights/llama31_405b/ --size 405B --shard 8 --benchmark` on tinyamd2
2025-06-19 11:45:37 -04:00
chenyu
a3dae51085 lower test_gemm_8192 on red (#10883) 2025-06-19 10:01:25 -04:00
simone-pietro
36f01411a2 Pass list to block_reorder in test_loads (#10881) 2025-06-19 09:49:45 -04:00
chenyu
f377cc19cd use AM for bert (#10882)
have trained 3 runs and all seem fine
2025-06-19 09:48:54 -04:00
borgwang
06ea74bf2c fix-typos (#10879) 2025-06-19 09:13:31 -04:00
qazal
ac891b78f8 skip UOp del when python is shutting down [pr] (#10847) 2025-06-19 15:31:40 +03:00
simone-pietro
58252e3c49 Change type hint for init_c_struct_t and to_struct [pr] (#10878)
* Change type hint for init_c_struct_t

* Change type hint for to_struct
2025-06-19 13:22:44 +03:00
qazal
00d0071b36 simpler viz naming [pr] (#10874)
* simpler viz naming [pr]

* n2
2025-06-19 12:10:47 +03:00
qazal
5839542fc8 viz: one name arg in track_rewrites [pr] (#10873)
* viz: one name arg in track_rewrites [pr]

* other test
2025-06-19 03:34:56 +03:00
George Hotz
18593c9800 one less rewrite on schedule [pr] (#10872)
* one less rewrite on schedule [pr]

* verify in ebs
2025-06-18 17:06:17 -07:00
uuuvn
e7a26211d2 Queue remote transfers on source (#10871)
https://github.com/tinygrad/tinygrad/pull/10601#issuecomment-2985624147

I personally don't see how that is a good standalone pr, but whatever
2025-06-18 16:08:44 -07:00
uuuvn
a9f3632c4f SessionKey is a dataclass (#10870) 2025-06-18 15:07:31 -07:00
wozeparrot
bdbf121285 fix: contigous -> contiguous (#10868) 2025-06-18 13:09:51 -07:00
qazal
344a220b87 s/lb_refcount/uop_refcount [pr] (#10865) 2025-06-18 21:48:04 +03:00
simone-pietro
f59df04998 Generalize type hint for get_single_element [pr] (#10866)
* Generalize type hint for get_single_element

* Improve wording in assert
2025-06-18 13:13:04 -04:00
chenyu
d71bb6a7b2 remove comma 0.9.4 from benchmark (#10867) 2025-06-18 12:43:59 -04:00
chenyu
b70c7d3631 bert grad accumulation (#10863)
* bert grad accumulation

* realize grad
2025-06-18 12:17:07 -04:00
simone-pietro
56fe5b60a9 Cast int to str for render_cast (#10864)
* Add type hint for render_cast

* Revert "Add type hint for render_cast"

This reverts commit 33858eb711.

* Cast int to str for render_cast
2025-06-18 10:55:27 -04:00
simone-pietro
d8cea1a279 Change rate to int in test_tqdm (#10848) 2025-06-18 08:40:43 -04:00
simone-pietro
0735224ac2 Pass PythonRenderer instance to full_rewrite (#10859) 2025-06-18 08:39:27 -04:00
qazal
96509daaba enable copy folding tests [pr] (#10862) 2025-06-18 13:05:35 +03:00
qazal
84d568d0cc do not import grouper internals in test_schedule [pr] (#10861)
* fix import

* fix test_multitoutput_ast

* fix test_recursive_swizzle

* test_alu_after_copy

* remove that test
2025-06-18 12:47:57 +03:00
qazal
8b879b0314 merge TestTensorUOpSpec with the other spec unittests [pr] (#10860)
* merge TestTensorUOpSpec with the other spec unittests [pr]

* rename to test_uop_spec
2025-06-18 12:12:08 +03:00
qazal
a5f2bb614a remove validate_kernel, it is asserting implementation details [pr] (#10858) 2025-06-18 11:42:36 +03:00
George Hotz
cba6e15937 split grouper and kernelize [pr] (#10854) 2025-06-17 17:54:20 -07:00
George Hotz
75503955bf simple schedule test [pr] (#10853) 2025-06-17 16:19:27 -07:00
chenyu
075a74cf25 add global_batch_size to mlperf bert (#10852)
global_batch_size = grad_acc_steps * batch_size. no-op change to prep grad acc for bert
2025-06-17 17:54:15 -04:00
uuuvn
a51f18f8f9 CI flakiness (#10851)
https://github.com/tinygrad/tinygrad/actions/runs/15718103629/job/44292845140?pr=10753#step:4:161
2025-06-17 14:46:30 -07:00
qazal
e77cd81662 time viz (#10763)
* work

* basic stuff

* work

* also reset

* moving through time

* cleanup

* proper zoom

* add livereload.js

pip install livereload
livereload tinygrad/viz

* minor

* fixed width, remove viewbox

* bit of flexbox magic

* show pid/tid

* merge loops

* min-height

* redo some layout stuff

* create cell groups

* text is hard

* javascript Math.min causes "Maximum call stack size"
bert repro: VIZ=1 PYTHONPATH=. DEFAULT_FLOAT=HALF BS=66 GPUS=6 BERT_LAYERS=2 FUSE_ARANGE=1 MODEL=bert python3 examples/mlperf/model_train.py

* fix recursion issue

* no viz/server changes

* fix test_viz

* everything is a g

* text is easy

* no it's still hard

* livereload+notes

* height: 100% fixes the device bug

* start canvas work

* base canvas

* take chrome's stuff

* serve chrome's thing

* fetch traces from get_profile

* remove junk

* remove some more

* bring everything back again

* dispatch resize events

* base ticks

* hook d3.zoom

* zoom on the x axis

* bring filter back, makes ctrl+drag possible

* remove junk

* Revert "remove junk"

This reverts commit 4987e7bec1.

* draws something, the zooms aren't right

* move to canvas

* fix zooming

* Revert "Revert "remove junk""

This reverts commit 5aac2034fb.

* space key resets zoom

* Divide timelines by device on y axis

* Show kernel names when the width allows

* Clicking on kernel opens it in the kernel graph

* remove livereload.js

* reset diff

* base diff:

- fetch traceEvents
- displayGraph
- flexbox layout
- rest of canvas

* rescale in-place is faster, d3's rescaleX creates a copy

* less

* aesthetics

* map names

* first viz is profiler

* this will work when i make canvas once

* initial cleanups

* factor out of loop

* refactor + only show devices with events

* properly align program rects

* cleaner tick lines

* padding

* listen for resize

* simple zoom

* space more

* i always end up making zoom global

* how is this ever allowed

* clicking works again

* back button goes back to the same zoom level

* work

* more work

* bring that back

* coloring work and simplify

* black

* keep perfetto button for comparison

* better

* ph===X

* simplify history stuff

* temp: handcoded

* test: flamegraph style leveling

* Revert "temp: handcoded"

This reverts commit bdcd538e88.

* disable flamegraph

* group by pid

* factor y and height out of render

* now flamegraph is easy

* livereload stuff

* remove that

* less
2025-06-17 19:39:34 +03:00
qazal
9e2cb7522a viz: define launch_viz when tracking is enabled (#10846) 2025-06-17 19:38:02 +03:00
Bhavya Gada
3a474ef5b7 move bitwise_and/bitwise_or/bitwise_xor to MathTrait [pr] (#10794)
* move bitwise and, or, xor to MathTrait

* refactor
2025-06-17 09:19:43 -07:00
George Hotz
531d143780 bring back old sharded rand behavior (#10842) 2025-06-16 17:23:47 -07:00
George Hotz
a493eb396c fix view add 0 (#10840) 2025-06-16 16:46:12 -07:00
George Hotz
b5ce227850 Revert "hotfix: remove setrecursionlimit"
This reverts commit acfc81642a.
2025-06-16 16:01:42 -07:00
George Hotz
acfc81642a hotfix: remove setrecursionlimit 2025-06-16 15:31:34 -07:00
George Hotz
00c46e7077 print rules count in match stats (#10839) 2025-06-16 14:56:27 -07:00