qazal
7820aeca8e
update codegen process replay to use get_program [pr] ( #10921 )
...
* update codegen process replay to get_program [pr]
* precommit
* try str replace
* +to_function_name
* fixup tc
* local2.sh
* fix openpilot NOLOCALS
* new local.sh
* correct merge
* beam cache
* back
* revert beam thing
* adding opts_override and name_override makes output of get_program reproducible
* min diff
2025-06-23 17:31:41 +03:00
nimlgen
eceb7a00d2
nv: rename iface mem functions ( #10931 )
2025-06-23 16:34:51 +03:00
qazal
4e864bd304
fix: getenv("NOLOCALS")/NOLOCALS context var ( #10927 )
...
OptOps shouldn't rely on os.environ.
2025-06-23 11:23:59 +03:00
alpharush
22f9696522
Fix/hcqfuzz harness bug ( #10923 )
...
* update command so extra module is found
* fix empty range in randrange errors
* lint
2025-06-23 11:22:30 +03:00
qazal
f037f85532
s/getenv("TC")/USE_TC context var ( #10922 )
2025-06-23 00:39:45 +03:00
qazal
9201224e0b
viz: remove Kernel check [pr] ( #10920 )
...
* viz: remove Kernel check [pr]
* TestVizIntegration
* test/unit allows opening of devices
* kernel -> Kernel
2025-06-22 20:47:54 +03:00
nimlgen
3ccdb2356b
system: factor out PCIIfaceBase ( #10917 )
...
* system: factor out PCIIfaceBase
* linter
* typing
2025-06-22 20:03:14 +03:00
George Hotz
b09c47366f
opt transforms the ast into an optimized ast ( #10900 )
...
* opt transforms the ast into an optimized ast
* fix get_kernel order and to_function_name
* function_name property
* update docs
* copy from kernel.py
* improve docs
* ci didn't trigger?
2025-06-22 09:41:26 -07:00
qazal
ffddf165f8
viz: color by kernel names in profiler ( #10919 )
...
* viz: color by kernel names in profiler
* ellipsis stays in bounds
2025-06-22 18:07:52 +03:00
nimlgen
36536ef6f0
nv: minor changes from nvpci ( #10918 )
2025-06-22 18:04:39 +03:00
geohotstan
4ab7d792cc
ONNX improve dtype fallback ( #10800 )
...
* fix
* add early verbose demo test
* is this how to write tests :s
* is definition drift even a thing? gemini says it is
* clean up
* better
* even better
* try add to CI
* doesn't work quite yet
* much more work to be done
* whoops
* partition the test heh
* skipif
* some nits for better names
* add webgpu test for onnxrunner
* fix reference links
* flush for now
2025-06-21 19:29:45 -04:00
chenyu
0480139def
log_perplexity metrics ( #10912 )
2025-06-21 10:44:47 -04:00
nimlgen
0e7bd9fd03
factor out generic MemoryManager ( #10910 )
...
* allocator -> memory
* just move it out
* mm is abstracted
* need entry abstraction
* fix
* mypy
2025-06-21 16:18:33 +03:00
qazal
c7ec913210
viz: cleanup unit tests ( #10909 )
...
* cleanup test_viz
* tree view
2025-06-21 12:35:09 +03:00
chenyu
1373071f19
simplify logcumsumexp ( #10908 )
...
clarify and remove some flatten and squeeze/unsqueeze
2025-06-20 22:56:42 -04:00
George Hotz
fa52bdb50f
applied_opts is in the optimized ast ( #10906 )
2025-06-20 18:56:23 -07:00
chenyu
2d9c61e39e
test more dims in test_logsumexp and test_logcumsumexp ( #10907 )
...
refactoring squeeze and unsqueeze is easy to get wrong
2025-06-20 21:42:18 -04:00
Nino Risteski
3771cc0f77
fix test logcumsumexp broken devectorize=0 ( #10880 )
...
* fix test logcumsumexp numerical
* lint
* Use dtypes.min instead of -1e4
2025-06-20 20:54:50 -04:00
George Hotz
7636d2cdc5
flip order of get_program args ( #10905 )
2025-06-20 17:23:23 -07:00
George Hotz
1ce63f8d04
move functions to view and update docs [pr] ( #10904 )
...
* move functions to view and update docs [pr]
* move quantize
2025-06-20 16:47:58 -07:00
George Hotz
b41e0563a3
move stuff to kernelize folder ( #10902 )
...
* move stuff to kernelize folder
* oops, forgot that
2025-06-20 16:10:20 -07:00
George Hotz
d399a4587d
move mem estimate to ProgramSpec [pr] ( #10901 )
2025-06-20 15:54:28 -07:00
George Hotz
92678e59ee
move kernel to opt ( #10899 )
2025-06-20 15:22:28 -07:00
nimlgen
bb0299b9e5
system: shared pci logic ( #10894 )
...
* moveout pci logic
* fixes
* oops
* types
* more type
* one style
* this is imp
2025-06-21 00:09:49 +03:00
nimlgen
c83fdc50d1
nv: driver iface ( #10895 )
...
* nv: driver iface
* fixes
* ops
* not used anymore
* fix mypy
* too long
* fix
* fixed
* mypy
* ugh, it's misc
* rename to NVK
2025-06-20 22:36:08 +03:00
George Hotz
fc9f883870
if upat returns self, it's none ( #10898 )
...
* if upat returns self, it's none
* fix pm tests
2025-06-20 12:11:19 -07:00
qazal
4f179b9ddb
viz: gate launch behind a ContextVar [pr] ( #10892 )
2025-06-20 17:30:32 +03:00
chenyu
3f29c7edda
minor onnx dropout cleanup ( #10891 )
...
we should consider removing numpy random and testing it similarly to test_randomness, unless how the seed works is part of the spec?
2025-06-20 10:18:34 -04:00
simone-pietro
e94ac6e20c
Cast ptr to int in test_from_mv_to_mv ( #10876 )
...
* Cast ptr to int in test_from_mv_to_mv
* Add type hints for from_mv
2025-06-20 14:52:34 +03:00
qazal
000eb30f04
viz: remove prev profiler file ( #10888 )
...
The new profiler is integrated in the main VIZ tab.
Will also delete perfetto.html after matching [final features](https://github.com/tinygrad/tinygrad/pull/10763#issuecomment-2980543715) soon.
2025-06-19 23:05:46 +03:00
chenyu
62a540066e
remove DEBUG=2 in mi300x bert setup ( #10886 )
...
seems fine now, not sure what the issue was
2025-06-19 13:28:53 -04:00
Nino Risteski
5a56710ff4
small fix replacing download_file with fetch ( #10877 )
...
* imported a missing os and replaced download_file with fetch from tg helpers
* use fetch directly
* Remove if not os.path.isfile
2025-06-19 12:12:09 -04:00
chenyu
8d721a4ead
add 405B params to llama3.py ( #10884 )
...
tested with `python examples/llama3.py --model /raid/weights/llama31_405b/ --size 405B --shard 8 --benchmark` on tinyamd2
2025-06-19 11:45:37 -04:00
chenyu
a3dae51085
lower test_gemm_8192 on red ( #10883 )
2025-06-19 10:01:25 -04:00
simone-pietro
36f01411a2
Pass list to block_reorder in test_loads ( #10881 )
2025-06-19 09:49:45 -04:00
chenyu
f377cc19cd
use AM for bert ( #10882 )
...
have trained 3 runs and all seem fine
2025-06-19 09:48:54 -04:00
borgwang
06ea74bf2c
fix-typos ( #10879 )
2025-06-19 09:13:31 -04:00
qazal
ac891b78f8
skip UOp del when python is shutting down [pr] ( #10847 )
2025-06-19 15:31:40 +03:00
simone-pietro
58252e3c49
Change type hint for init_c_struct_t and to_struct [pr] ( #10878 )
...
* Change type hint for init_c_struct_t
* Change type hint for to_struct
2025-06-19 13:22:44 +03:00
qazal
00d0071b36
simpler viz naming [pr] ( #10874 )
...
* simpler viz naming [pr]
* n2
2025-06-19 12:10:47 +03:00
qazal
5839542fc8
viz: one name arg in track_rewrites [pr] ( #10873 )
...
* viz: one name arg in track_rewrites [pr]
* other test
2025-06-19 03:34:56 +03:00
George Hotz
18593c9800
one less rewrite on schedule [pr] ( #10872 )
...
* one less rewrite on schedule [pr]
* verify in ebs
2025-06-18 17:06:17 -07:00
uuuvn
e7a26211d2
Queue remote transfers on source ( #10871 )
...
https://github.com/tinygrad/tinygrad/pull/10601#issuecomment-2985624147
I personally don't see how that is a good standalone pr, but whatever
2025-06-18 16:08:44 -07:00
uuuvn
a9f3632c4f
SessionKey is a dataclass ( #10870 )
2025-06-18 15:07:31 -07:00
wozeparrot
bdbf121285
fix: contigous -> contiguous ( #10868 )
2025-06-18 13:09:51 -07:00
qazal
344a220b87
s/lb_refcount/uop_refcount [pr] ( #10865 )
2025-06-18 21:48:04 +03:00
simone-pietro
f59df04998
Generalize type hint for get_single_element [pr] ( #10866 )
...
* Generalize type hint for get_single_element
* Improve wording in assert
2025-06-18 13:13:04 -04:00
chenyu
d71bb6a7b2
remove comma 0.9.4 from benchmark ( #10867 )
2025-06-18 12:43:59 -04:00
chenyu
b70c7d3631
bert grad accumulation ( #10863 )
...
* bert grad accumulation
* realize grad
2025-06-18 12:17:07 -04:00
simone-pietro
56fe5b60a9
Cast int to str for render_cast ( #10864 )
...
* Add type hint for render_cast
* Revert "Add type hint for render_cast"
This reverts commit 33858eb711.
* Cast int to str for render_cast
2025-06-18 10:55:27 -04:00