Commit Graph

10490 Commits

Author SHA1 Message Date
qazal
7820aeca8e update codegen process replay to use get_program [pr] (#10921)
* update codegen process replay to get_program [pr]

* precommit

* try str replace

* +to_function_name

* fixup tc

* local2.sh

* fix openpilot NOLOCALS

* new local.sh

* correct merge

* beam cache

* back

* revert beam thing

* adding opts_override and name_override makes output of get_program reproducible

* min diff
2025-06-23 17:31:41 +03:00
nimlgen
eceb7a00d2 nv: rename iface mem functions (#10931) 2025-06-23 16:34:51 +03:00
qazal
4e864bd304 fix: getenv("NOLOCALS")/NOLOCALS context var (#10927)
OptOps shouldn't rely on os.environ.
2025-06-23 11:23:59 +03:00
alpharush
22f9696522 Fix/hcqfuzz harnesss bug (#10923)
* update command so extra module is found

* fix empty range in randrange errors

* lint
2025-06-23 11:22:30 +03:00
qazal
f037f85532 s/getenv("TC")/USE_TC context var (#10922) 2025-06-23 00:39:45 +03:00
qazal
9201224e0b viz: remove Kernel check [pr] (#10920)
* viz: remove Kernel check [pr]

* TestVizIntegration

* test/unit allows opening of devices

* kernel -> Kernel
2025-06-22 20:47:54 +03:00
nimlgen
3ccdb2356b system: factor out PCIIfaceBase (#10917)
* system: factor out PCIIfaceBase

* linter

* typing
2025-06-22 20:03:14 +03:00
George Hotz
b09c47366f opt transforms the ast into an optimized ast (#10900)
* opt transforms the ast into an optimized ast

* fix get_kernel order and to_function_name

* function_name property

* update docs

* copy from kernel.py

* improve docs

* ci didn't trigger?
2025-06-22 09:41:26 -07:00
qazal
ffddf165f8 viz: color by kernel names in profiler (#10919)
* viz: color by kernel names in profiler

* ellipsis stays in bounds
2025-06-22 18:07:52 +03:00
nimlgen
36536ef6f0 nv: minor changes from nvpci (#10918) 2025-06-22 18:04:39 +03:00
geohotstan
4ab7d792cc ONNX improve dtype fallback (#10800)
* fix

* add early verbose demo test

* is this how to write tests :s

* is definition drift even a thing? gemini says it is

* clean up

* better

* even better

* try add to CI

* doesn't work quite yet

* much more work to be done

* whoops

* partition the test heh

* skipif

* some nits for better names

* add webgpu test for onnxrunner

* fix reference links

* flush for now
2025-06-21 19:29:45 -04:00
chenyu
0480139def log_perplexity metrics (#10912) 2025-06-21 10:44:47 -04:00
nimlgen
0e7bd9fd03 factor out generic MemoryManager (#10910)
* allocator -> memory

* just moveout it

* mm is abstracted

* need entry abstraction

* fix

* mypy
2025-06-21 16:18:33 +03:00
qazal
c7ec913210 viz: cleanup unit tests (#10909)
* cleanup test_viz

* tree view
2025-06-21 12:35:09 +03:00
chenyu
1373071f19 simplify logcumsumexp (#10908)
clarify and remove some flatten and squeeze/unsqueeze
2025-06-20 22:56:42 -04:00
George Hotz
fa52bdb50f applied_opts is in the optimized ast] (#10906) 2025-06-20 18:56:23 -07:00
chenyu
2d9c61e39e test more dims in test_logsumexp and test_logcumsumexp (#10907)
refactoring squeeze and unsqueeze is easy to get wrong
2025-06-20 21:42:18 -04:00
Nino Risteski
3771cc0f77 fix test logcumsumexp broken devectorize=0 (#10880)
* fix test logcumsumexp numerical

* lint

* Use dtypes.min instead of -1e4
2025-06-20 20:54:50 -04:00
George Hotz
7636d2cdc5 flip order of get_program args (#10905) 2025-06-20 17:23:23 -07:00
George Hotz
1ce63f8d04 move functions to view and update docs [pr] (#10904)
* move functions to view and update docs [pr]

* move quantize
2025-06-20 16:47:58 -07:00
George Hotz
b41e0563a3 move stuff to kernelize folder (#10902)
* move stuff to kernelize folder

* oops, forgot that
2025-06-20 16:10:20 -07:00
George Hotz
d399a4587d move mem estimate to ProgramSpec [pr] (#10901) 2025-06-20 15:54:28 -07:00
George Hotz
92678e59ee move kernel to opt (#10899) 2025-06-20 15:22:28 -07:00
nimlgen
bb0299b9e5 system: shared pci logic (#10894)
* moveout pci logic

* fixes

* oops

* types

* more type

* one style

* thi is imp
2025-06-21 00:09:49 +03:00
nimlgen
c83fdc50d1 nv: driver iface (#10895)
* nv: driver iface

* fixes

* ops

* not used anymore

* fix mypy

* too long

* fix

* fixed

* mypy

* ugh, it's misc

* rename to NVK
2025-06-20 22:36:08 +03:00
George Hotz
fc9f883870 if upat returns self, it's none (#10898)
* if upat returns self, it's none

* fix pm tests
2025-06-20 12:11:19 -07:00
qazal
4f179b9ddb viz: gate launch behind a ContextVar [pr] (#10892) 2025-06-20 17:30:32 +03:00
chenyu
3f29c7edda minor onnx dropout cleanup (#10891)
we should consider removing numpy random and test it similar to test_randomness, unless how seed works is part of spec?
2025-06-20 10:18:34 -04:00
simone-pietro
e94ac6e20c Cast ptr to int in test_from_mv_to_mv (#10876)
* Cast ptr to int in test_from_mv_to_mv

* Add type hints for from_mv
2025-06-20 14:52:34 +03:00
qazal
000eb30f04 viz: remove prev profiler file (#10888)
The new profiler is integrated in the main VIZ tab.

Will also delete perfetto.html after matching [final features](https://github.com/tinygrad/tinygrad/pull/10763#issuecomment-2980543715) soon.
2025-06-19 23:05:46 +03:00
chenyu
62a540066e remove DEBUG=2 in mi300x bert setup (#10886)
seems fine now, not sure what the issue was
2025-06-19 13:28:53 -04:00
Nino Risteski
5a56710ff4 small fix replacing download_file with fetch (#10877)
* imported a missing os and replaced download_file with fetch from tg helpers

* use fetch directly

* Remove if not os.path.isfile
2025-06-19 12:12:09 -04:00
chenyu
8d721a4ead add 405B params to llama3.py (#10884)
tested with `python examples/llama3.py --model /raid/weights/llama31_405b/ --size 405B --shard 8 --benchmark` on tinyamd2
2025-06-19 11:45:37 -04:00
chenyu
a3dae51085 lower test_gemm_8192 on red (#10883) 2025-06-19 10:01:25 -04:00
simone-pietro
36f01411a2 Pass list to block_reorder in test_loads (#10881) 2025-06-19 09:49:45 -04:00
chenyu
f377cc19cd use AM for bert (#10882)
have trained 3 runs and all seem fine
2025-06-19 09:48:54 -04:00
borgwang
06ea74bf2c fix-typos (#10879) 2025-06-19 09:13:31 -04:00
qazal
ac891b78f8 skip UOp del when python is shutting down [pr] (#10847) 2025-06-19 15:31:40 +03:00
simone-pietro
58252e3c49 Change type hint for init_c_struct_t and to_struct [pr] (#10878)
* Change type hint for init_c_struct_t

* Change type hint for to_struct
2025-06-19 13:22:44 +03:00
qazal
00d0071b36 simpler viz naming [pr] (#10874)
* simpler viz naming [pr]

* n2
2025-06-19 12:10:47 +03:00
qazal
5839542fc8 viz: one name arg in track_rewrites [pr] (#10873)
* viz: one name arg in track_rewrites [pr]

* other test
2025-06-19 03:34:56 +03:00
George Hotz
18593c9800 one less rewrite on schedule [pr] (#10872)
* one less rewrite on schedule [pr]

* verify in ebs
2025-06-18 17:06:17 -07:00
uuuvn
e7a26211d2 Queue remote transfers on source (#10871)
https://github.com/tinygrad/tinygrad/pull/10601#issuecomment-2985624147

I personally don't see how that is a good standalone pr, but whatever
2025-06-18 16:08:44 -07:00
uuuvn
a9f3632c4f SessionKey is a dataclass (#10870) 2025-06-18 15:07:31 -07:00
wozeparrot
bdbf121285 fix: contigous -> contiguous (#10868) 2025-06-18 13:09:51 -07:00
qazal
344a220b87 s/lb_refcount/uop_refcount [pr] (#10865) 2025-06-18 21:48:04 +03:00
simone-pietro
f59df04998 Generalize type hint for get_single_element [pr] (#10866)
* Generalize type hint for get_single_element

* Improve wording in assert
2025-06-18 13:13:04 -04:00
chenyu
d71bb6a7b2 remove comma 0.9.4 from benchmark (#10867) 2025-06-18 12:43:59 -04:00
chenyu
b70c7d3631 bert grad accumulation (#10863)
* bert grad accumulation

* realize grad
2025-06-18 12:17:07 -04:00
simone-pietro
56fe5b60a9 Cast int to str for render_cast (#10864)
* Add type hint for render_cast

* Revert "Add type hint for render_cast"

This reverts commit 33858eb711.

* Cast int to str for render_cast
2025-06-18 10:55:27 -04:00